If I run:
    from lxml import etree

    html = etree.parse('text.txt')
    result = html.xpath('//title')
    print(result)
I get an empty list.
I suspect it has something to do with namespaces, but I can't figure out how to fix it.
Best answer
Try building the tree with the HTML parser instead.
Also note that if text.txt is a file, you need to read its contents first:
    with open('text.txt', 'r', encoding='utf8') as f:
        text_html = f.read()
Then build the tree from that string, like this:
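    from lxml import etree, html

    def build_lxml_tree(_html):
        # Parse with the HTML parser, then wrap the root element in an
        # ElementTree so the result behaves like the output of etree.parse().
        tree = html.fromstring(_html)
        tree = etree.ElementTree(tree)
        return tree

    tree = build_lxml_tree(text_html)
    result = tree.xpath('//title')
    print(result)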
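To see why the parser choice matters, here is a minimal sketch. It assumes the file is XHTML with a default namespace declared on the root element, which the question suspects but does not confirm. The `XHTML` sample and the `x` prefix are made up for illustration and stand in for the real contents of text.txt.

```python
from io import BytesIO
from lxml import etree

# Hypothetical sample document; assumed to resemble text.txt.
XHTML = b"""<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

# XML parser: every element lives in the XHTML namespace,
# so an unprefixed //title matches nothing.
xml_tree = etree.parse(BytesIO(XHTML))
plain = xml_tree.xpath('//title')

# Same tree, but with the namespace bound to an arbitrary prefix.
ns = {'x': 'http://www.w3.org/1999/xhtml'}
namespaced = xml_tree.xpath('//x:title', namespaces=ns)

# HTML parser: namespaces are not applied, so //title matches directly.
html_tree = etree.parse(BytesIO(XHTML), etree.HTMLParser())
via_html = html_tree.xpath('//title')

print(len(plain), len(namespaced), len(via_html))
```

If the namespace really is the cause, either fix should work: keep the XML parser and pass a `namespaces` mapping to `xpath()`, or switch to the HTML parser as the answer suggests. `etree.parse()` also accepts a file path together with `etree.HTMLParser()`, so reading the file by hand is optional in that variant.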