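I'm writing a Clojure library to parse Mac OS X's XML-based property list files. The code works fine unless you give it a large input file, at which point you get java.lang.OutOfMemoryError: Java heap space.

Here is an example input file (small enough to work fine):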
    <plist version="1.0">
    <dict>
        <key>Integer example</key>
        <integer>5</integer>
        <key>Array example</key>
        <array>
            <integer>2</integer>
            <real>3.14159</real>
        </array>
        <key>Dictionary example</key>
        <dict>
            <key>Number</key>
            <integer>8675309</integer>
        </dict>
    </dict>
    </plist>
clojure.xml/parse turns that file into:
    {:tag :plist, :attrs {:version "1.0"}, :content [
        {:tag :dict, :attrs nil, :content [
            {:tag :key, :content ["Integer example"]}
            {:tag :integer, :content ["5"]}
            {:tag :key, :content ["Array example"]}
            {:tag :array, :content [
                {:tag :integer, :content ["2"]}
                {:tag :real, :content ["3.14159"]}
            ]}
            {:tag :key, :content ["Dictionary example"]}
            {:tag :dict, :content [
                {:tag :key, :content ["Number"]}
                {:tag :integer, :content ["8675309"]}
            ]}
        ]}
    ]}
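To reproduce that intermediate tree without a file on disk, something like the following sketch should work. The names `sample-plist` and `parse-xml-string` are made up for this illustration and are not part of the library; the sketch relies on clojure.xml/parse accepting an InputStream as well as a File.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; A shortened version of the example file, held in a string.
;; (sample-plist is a hypothetical name used only in this sketch.)
(def sample-plist
  (str "<plist version=\"1.0\">"
       "<dict>"
       "<key>Integer example</key>"
       "<integer>5</integer>"
       "</dict>"
       "</plist>"))

;; Hypothetical helper: wrap the string in an InputStream so that
;; clojure.xml/parse can read it.
(defn parse-xml-string [^String s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(def parsed (parse-xml-string sample-plist))

(:tag parsed)                    ; => :plist
(:attrs parsed)                  ; => {:version "1.0"}
(-> parsed :content first :tag)  ; => :dict
```

Note that clojure.xml/parse builds the whole tree in memory before returning it, so the entire document is held on the heap at once.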
My code turns that parsed tree into the Clojure data structure
{"Dictionary example" {"Number" 8675309},"Array example" [2 3.14159],"Integer example" 5}
The relevant part of my code looks like
    ; extract the content contained within e.g. <integer>...</integer>
    (defn- first-content [c]
      (first (c :content)))

    ; return a parsed version of the given tag
    (defmulti content (fn [c] (c :tag)))

    (defmethod content :array [c]
      (apply vector (for [item (c :content)] (content item))))

    (defmethod content :dict [c]
      (apply hash-map (for [item (c :content)] (content item))))

    (defmethod content :integer [c]
      (Long. (first-content c)))

    (defmethod content :key [c]
      (first-content c))

    (defmethod content :real [c]
      (Double. (first-content c)))

    ; take a java.io.File (or similar) and return the parsed version
    (defn parse-plist [source]
      (content (first-content (clojure.xml/parse source))))
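The meat of the code is the content function, a multimethod that dispatches on :tag (the name of the XML tag). I'm wondering whether there is something different I should be doing to make this recursion work better. I tried replacing all three calls to content with trampoline content, but that didn't work. Is there anything fancy I should do to make this mutual recursion work more efficiently? Or am I taking a fundamentally wrong approach?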
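For reference, the trampoline attempt amounted to roughly the sketch below. It is a reduction to two tags, reconstructed for illustration, so treat the exact form of the substitution as an assumption. trampoline only keeps calling while the result is a function; none of these methods returns one, so each wrapped call behaves like a plain call to content and the nesting of calls is unchanged.

```clojure
;; Reduced, illustrative version of the multimethod: only :integer and
;; :array are handled here.
(defmulti content (fn [c] (c :tag)))

(defmethod content :integer [c]
  (Long. (first (c :content))))

;; The recursive call is wrapped in trampoline. Because content returns a
;; number or a vector (never a fn), trampoline hands the first result
;; straight back.
(defmethod content :array [c]
  (apply vector (for [item (c :content)] (trampoline content item))))

;; A hand-written tree in the shape clojure.xml/parse produces
;; (nested-array is a hypothetical name used only in this sketch).
(def nested-array
  {:tag :array
   :content [{:tag :integer :content ["2"]}
             {:tag :array
              :content [{:tag :integer :content ["3"]}]}]})

(content nested-array) ; => [2 [3]]
```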
Edit: By the way, this code is available on GitHub, in which form it might be easier to play around with.