This post is mainly a note on what the statement "for child in root.iter():" does.
Input:
<pre name="code" class="python">C:\Users\jeguan\Desktop\Test_2.xml
<Response Status="OKAY" CongLvl="LEVEL0" OverallProvTime="4026852" TimeInReqQueue="228" DbCommitTime="6371" RequestId="100000"> <CapacityParms><Category>RESIDENTIALSUBSCRIBER_R2</Category><FeatureSetName>R1 FEATURE SET</FeatureSetName><OfficeId>ylvJrPbcGgHE</OfficeId><CurrentCnt>0</CurrentCnt><LimitCnt>0000050</LimitCnt><SpareCapacity>0</SpareCapacity><TasUnequalDistribution>0</TasUnequalDistribution> </CapacityParms></Response></pre>
The code:
<pre name="code" class="python">
############################################################################
# Same as re_testsearch(), except that ET.fromstring(str1) is used: there is
# no need to save the matched "result" to a file first; its content can be
# analyzed directly.
# Reference: https://docs.python.org/2/library/xml.etree.elementtree.html?highlight=elementtree
############################################################################
import re
from xml.etree import ElementTree as ET


def re_testsearch2():
    filename = r'C:\Users\jeguan\Desktop\Test_2.xml'
    # Read the whole file into str1.
    with open(filename, 'r') as open_file:
        str1 = open_file.read()

    # re.S makes the '.' special character match any character at all,
    # including a newline; without this flag, '.' matches anything except
    # a newline.
    # '(.+?)' is a lazy match: once the first 'Response>' is found, it does
    # not try to match the next 'Response>'.
    re_patt = re.compile(
        r'<Response Status="OKAY" CongLvl="LEVEL0"(.+?)Response>', re.S)

    # re_patt.search() returns a match object; "result" is a string.
    result = re_patt.search(str1).group(0)

    # This assignment only makes it clearer that "result" is used as a tree.
    tree = result
    root = ET.fromstring(tree)
    # print(root.tag)
    # print(root.attrib)

    # Element has some useful methods that help iterate recursively over all
    # the sub-tree below it (its children, their children, and so on),
    # for example Element.iter().
    dict_child = {}
    for child in root.iter():  # yields root itself, then every descendant
        dict_child[child.tag] = child.text
        # print(child.tag)
        # print(child.attrib)
        # print(child.text)
    print(dict_child)
</pre>
Result:
{'Category': 'RESIDENTIALSUBSCRIBER_R2', 'SpareCapacity': '0', 'LimitCnt': '0000050', 'FeatureSetName': 'R1 FEATURE SET', 'CurrentCnt': '0', 'TasUnequalDistribution': '0', 'OfficeId': 'ylvJrPbcGgHE', 'CapacityParms': '\n\t', 'Response': '\n '}
The last two entries, 'CapacityParms' and 'Response' (highlighted in red in the original post), are not what I want, and I don't know how to keep them from being stored in dict_child.
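One possible way to leave those two entries out, sketched under the assumption that only leaf elements (elements with no child elements) are wanted. root.iter() yields root itself and then every descendant, so Response and CapacityParms appear in the dict with the whitespace between tags as their text; skipping any element that has children drops them. The helper name leaf_texts and the shortened inline XML below are made up for illustration and are not part of the original code.

```python
from xml.etree import ElementTree as ET

# Shortened stand-in for the matched <Response> string (illustrative only).
SAMPLE = (
    '<Response Status="OKAY" RequestId="100000">\n '
    '<CapacityParms>\n\t'
    '<Category>RESIDENTIALSUBSCRIBER_R2</Category>'
    '<LimitCnt>0000050</LimitCnt>'
    '</CapacityParms>'
    '</Response>'
)


def leaf_texts(root):
    """Map tag -> text for elements that have no child elements.

    Hypothetical helper: container elements such as Response and
    CapacityParms are skipped because len(element) is their child count.
    """
    result = {}
    for child in root.iter():  # root itself first, then all descendants
        if len(child) == 0:    # no sub-elements, so this is a leaf
            result[child.tag] = child.text
    return result


dict_child = leaf_texts(ET.fromstring(SAMPLE))
print(dict_child)
```

A variant with the same intent would keep only entries whose text is not blank (child.text and child.text.strip()), but that would also drop leaf elements that are genuinely empty, so checking for children is probably the safer test here.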