bpo-39011: Preserve line endings within ElementTree attributes (GH-18468)
* bpo-39011: Preserve line endings within attributes
Line endings within attributes were previously normalized to "\n" in Py3.7/3.8.
This patch removes that normalization, as line endings which were
replaced by entity numbers should be preserved in original form.
pull/19489/head
mefistotelis
6 years ago
committed by
GitHub
4 changed files with 22 additions and 9 deletions
Doc/whatsnew/3.9.rst
Lib/test/test_xml_etree.py
Lib/xml/etree/ElementTree.py
Misc/NEWS.d/next/Library/2020-02-12-01-48-51.bpo-39011.hGve_t.rst
@@ -412,6 +412,15 @@ customization consistently by always using the value specified by
 case), and one used ``__VENV_NAME__`` instead.
 (Contributed by Brett Cannon in :issue:`37663`.)
+xml
+---
+White space characters within attributes are now preserved when serializing
+:mod:`xml.etree.ElementTree` to XML file. EOLNs are no longer normalized
+to "\n". This is the result of discussion about how to interpret
+section 2.11 of XML spec.
+(Contributed by Mefistotelis in :issue:`39011`.)
 Optimizations
 =============
@@ -430,13 +430,14 @@ class ElementTreeTest(unittest.TestCase):
         self.assertEqual(ET.tostring(elem),
                 b'<test testa="testval" testb="test1" testc="test2">aa</test>')
         # Test preserving white space chars in attributes
         elem = ET.Element('test')
         elem.set('a', '\r')
         elem.set('b', '\r\n')
         elem.set('c', '\t\n\r')
-        elem.set('d', '\n\n')
+        elem.set('d', '\n\n\r\r\t\t')
         self.assertEqual(ET.tostring(elem),
-                b'<test a="&#10;" b="&#10;" c="&#09;&#10;&#10;" d="&#10;&#10;" />')
+                b'<test a="&#13;" b="&#13;&#10;" c="&#09;&#10;&#13;" d="&#10;&#10;&#13;&#13;&#09;&#09;" />')
     def test_makeelement(self):
         # Test makeelement handling.
@@ -1057,15 +1057,15 @@ def _escape_attrib(text):
         text = text.replace(">", "&gt;")
     if "\"" in text:
         text = text.replace("\"", "&quot;")
-    # The following business with carriage returns is to satisfy
-    # Section 2.11 of the XML specification, stating that
-    # CR or CR LN should be replaced with just LN
+    # Although section 2.11 of the XML specification states that CR or
+    # CR LN should be replaced with just LN, it applies only to EOLNs
+    # which take part of organizing file into lines. Within attributes,
+    # we are replacing these with entity numbers, so they do not count.
     # http://www.w3.org/TR/REC-xml/#sec-line-ends
-    if "\r\n" in text:
-        text = text.replace("\r\n", "\n")
+    # The current solution, contained in following six lines, was
+    # discussed in issue 17582 and 39011.
     if "\r" in text:
-        text = text.replace("\r", "\n")
-    #The following four lines are issue 17582
+        text = text.replace("\r", "&#13;")
     if "\n" in text:
         text = text.replace("\n", "&#10;")
     if "\t" in text:
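For readers comparing the two behaviours, the whitespace handling in the hunk above can be sketched as a pair of standalone functions. This is a minimal illustration, not the library code: the names `escape_attrib_old` and `escape_attrib_new` are hypothetical, and the sketch covers only CR, LF and TAB, leaving out the `&`, `<`, `>` and `"` escaping that the real `_escape_attrib` also performs.

```python
def escape_attrib_old(text: str) -> str:
    """Sketch of the pre-patch behaviour: fold CR LF and CR into LF, then escape."""
    text = text.replace("\r\n", "\n")
    text = text.replace("\r", "\n")
    text = text.replace("\n", "&#10;")
    text = text.replace("\t", "&#09;")
    return text


def escape_attrib_new(text: str) -> str:
    """Sketch of the post-patch behaviour: escape CR, LF and TAB individually."""
    text = text.replace("\r", "&#13;")
    text = text.replace("\n", "&#10;")
    text = text.replace("\t", "&#09;")
    return text


if __name__ == "__main__":
    # Print both escapings side by side for the values used in the test hunk.
    for value in ("\r", "\r\n", "\t\n\r", "\n\n\r\r\t\t"):
        print(repr(value), escape_attrib_old(value), escape_attrib_new(value))
```

The old variant loses information: `"\r"`, `"\r\n"` and `"\n"` all serialize to the same `&#10;`, so the original value cannot be recovered after parsing. The new variant maps each character to its own character reference.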
@@ -0,0 +1,3 @@
+Normalization of line endings in ElementTree attributes was removed, as line
+endings which were replaced by entity numbers should be preserved in
+original form.
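The practical effect of this change can be checked with a round trip through `xml.etree.ElementTree`. The sketch below assumes an interpreter that includes this patch (Python 3.9 or later); `roundtrip_attr` is a hypothetical helper introduced only for illustration.

```python
import xml.etree.ElementTree as ET


def roundtrip_attr(value: str) -> str:
    """Serialize an element carrying `value` as an attribute, then parse it back."""
    elem = ET.Element("test")
    elem.set("a", value)
    serialized = ET.tostring(elem)  # bytes; CR, LF and TAB appear as character references
    return ET.fromstring(serialized).get("a")


if __name__ == "__main__":
    for value in ("\r", "\r\n", "\t\n\r"):
        # With the patch applied, the parsed value should equal the original.
        print(repr(value), "->", repr(roundtrip_attr(value)))
```

On an interpreter without the patch, the same round trip would be expected to return `"\n"` for both `"\r"` and `"\r\n"`, since the carriage return was normalized away before serialization.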