I'm running into a problem when looping over an XML file of roughly 20-30 MB (650,000 lines).
Here is my pseudocode:
<cffile action="READ" file="file.xml" variable="usersRaw">
<cfset usersXML = XmlParse(usersRaw)>
<cfset advsXML = XmlSearch(usersXML, "/advs/advuser")>
<cfset users = XmlSearch(usersXML, "/advs/advuser/user")>
<cfset numUsers = ArrayLen(users)>
<cfloop index="i" from="1" to="#numUsers#">
    ... some selects...
    ... insert...
    <cfset advs = advsXML[i]["vehicle"]>
    <cfset numAdvs = ArrayLen(advs)>
    <cfloop index="k" from="1" to="#numAdvs#">
        ... insert... or ... update...
    </cfloop>
</cfloop>
The structure of the XML file is (yes, it's not very nice :-)
<advs>
    <advuser>
        <user> </user>
        <vehicle> </vehicle>
    </advuser>
</advs>
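After about 120,000 lines I get an error: "Out of memory".
How can I improve the performance of the script?
How can I diagnose the peak memory consumption?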
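@SamG is right that ColdFusion's built-in XML parsing can't cope with this, because it uses a DOM parser that loads the whole document into memory. SAX would avoid that but is painful to work with; use a StAX parser instead, which offers a simpler iterator-style interface.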
See the answer to another question I provided for an example of how to do this with ColdFusion.
This is roughly what you would do for your example:
<!--- open the file as a buffered stream instead of reading it into a variable --->
<cfset fis = createObject("java", "java.io.FileInputStream").init(
    "#getDirectoryFromPath(getCurrentTemplatePath())#/file.xml"
)>
<cfset bis = createObject("java", "java.io.BufferedInputStream").init(fis)>
<cfset XMLInputFactory = createObject("java", "javax.xml.stream.XMLInputFactory").newInstance()>
<cfset reader = XMLInputFactory.createXMLStreamReader(bis)>

<!--- pull one parse event at a time; only the current element is held in memory --->
<cfloop condition="reader.hasNext()">
    <cfset event = reader.next()>
    <cfif event EQ reader.START_ELEMENT>
        <cfswitch expression="#reader.getLocalName()#">
            <cfcase value="advs">
                <!--- root node, do nothing --->
            </cfcase>
            <cfcase value="advuser">
                <!--- set values used later on for inserts, selects, updates --->
            </cfcase>
            <cfcase value="user">
                <!--- some selects and insert --->
            </cfcase>
            <cfcase value="vehicle">
                <!--- insert or update --->
            </cfcase>
        </cfswitch>
    </cfif>
</cfloop>

<!--- reader.close() does not close the underlying stream, so close that too --->
<cfset reader.close()>
<cfset bis.close()>
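On the question of diagnosing memory consumption, one rough option is to ask the JVM directly through java.lang.Runtime and write the figures to a log. This is only a minimal sketch: the log file name "xmlimport" and the variable names are assumptions for illustration, not anything from the original script.

```cfml
<!--- Sketch only: "xmlimport" is a hypothetical log file name --->
<cfset runtime = createObject("java", "java.lang.Runtime").getRuntime()>
<cfset bytesPerMB = 1024 * 1024>

<!--- heap currently in use = heap allocated by the JVM minus the free part of it --->
<cfset usedMB = round((runtime.totalMemory() - runtime.freeMemory()) / bytesPerMB)>

<!--- upper limit the JVM may grow the heap to (the -Xmx setting) --->
<cfset maxMB = round(runtime.maxMemory() / bytesPerMB)>

<cflog file="xmlimport" text="JVM heap: #usedMB# MB used, #maxMB# MB max">
```

Running this every few thousand elements inside the loop should show whether usage stays flat or climbs toward the maximum. The "used" figure includes garbage that has not been collected yet, so treat it as an approximate upper estimate rather than an exact measurement.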