Get all names from xml-conduit

I am parsing a modified version of the XML from http://hackage.haskell.org/package/xml-conduit-1.1.0.9/docs/Text-XML-Stream-Parse.html

This is what it looks like:

<?xml version="1.0" encoding="utf-8"?>
<population xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://example.com">
  <success>true</success>
  <row_count>2</row_count>
  <summary>
    <bananas>0</bananas>
  </summary>
  <people>
      <person>
          <firstname>Michael</firstname>
          <age>25</age>
      </person>
      <person>
          <firstname>Eliezer</firstname>
          <age>2</age>
      </person>
  </people>
</population>

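How can I get a list of each person's name and age?

My goal is to download this XML with http-conduit and then parse it, but what I am looking for here is how to parse elements that have no attributes (using tagNoAttr?).

Here is what I have tried; I added my questions as Haskell comments: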

{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Trans.Resource
import Data.Conduit (($$))
import Data.Text (Text,unpack)
import Text.XML.Stream.Parse
import Control.Applicative ((<*))

data Person = Person Int Text
        deriving Show

-- Do I need to change the lambda function \age to something else to get both name and age?
parsePerson = tagNoAttr "person" $ \age -> do
        name <- content  -- How do I get age from the content?  "unpack" is for attributes
        return $ Person age name

parsePeople = tagNoAttr "people" $ many parsePerson

-- This doesn't ignore the xmlns attributes
parsePopulation  = tagName "population" (optionalAttr "xmlns" <* ignoreAttrs) $ parsePeople

main = do
        people <- runResourceT $
             parseFile def "people2.xml" $$ parsePopulation
        print people

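First: the parsing combinators in xml-conduit have not been updated in quite a while, and they are showing their age. I recommend that most people use the DOM or cursor interface instead. That said, let's look at your example. There are two problems with your code: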

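- It does not handle XML namespaces correctly. All of the element names are in the http://example.com namespace, and your code needs to reflect that.
- The parsing combinators require you to account for every element. They will not automatically skip over elements for you.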
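
To make the first point concrete, here is a minimal sketch of how a namespaced element name differs from a bare one. It assumes the `Name` type from `Data.XML.Types` (the xml-types package that xml-conduit builds on) and its `IsString` instance, which reads the `{namespace}local` notation; the names `qualified` and `bare` are only illustrative.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.XML.Types (Name (..))

-- Element name as it appears in the document: local name plus namespace.
qualified :: Name
qualified = "{http://example.com}person"

-- Element name as written in the question: no namespace at all.
bare :: Name
bare = "person"

main :: IO ()
main = do
    -- Both names share the local part "person" ...
    print (nameLocalName qualified, nameNamespace qualified)
    print (nameLocalName bare, nameNamespace bare)
    -- ... but they are different names, so a combinator looking for
    -- `bare` does not match an element named `qualified`.
    print (qualified == bare)
```

Because the two names compare as unequal, `tagNoAttr "person"` never matches the `<person>` elements in this document.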

So here is an implementation using the streaming API that gets the desired result:

{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad.Trans.Resource (MonadThrow, runResourceT)
import           Data.Conduit                 (Consumer,($$))
import           Data.Text                    (Text)
import           Data.Text.Read               (decimal)
import           Data.XML.Types               (Event)
import           Text.XML.Stream.Parse

data Person = Person Int Text
        deriving Show

-- The name and the age are both child elements of <person>, so each is read with its own tagNoAttr.
parsePerson :: MonadThrow m => Consumer Event m (Maybe Person)
parsePerson = tagNoAttr "{http://example.com}person" $ do
        name <- force "firstname tag missing" $ tagNoAttr "{http://example.com}firstname" content
        ageText <- force "age tag missing" $ tagNoAttr "{http://example.com}age" content
        case decimal ageText of
            Right (age, "") -> return $ Person age name
            _ -> force "invalid age value" $ return Nothing

-- Every sibling that precedes <people> has to be consumed explicitly; nothing is skipped automatically.
parsePeople :: MonadThrow m => Consumer Event m [Person]
parsePeople = force "no people tag" $ do
    _ <- tagNoAttr "{http://example.com}success" content
    _ <- tagNoAttr "{http://example.com}row_count" content
    _ <- tagNoAttr "{http://example.com}summary" $
        tagNoAttr "{http://example.com}bananas" content
    tagNoAttr "{http://example.com}people" $ many parsePerson

-- ignoreAttrs discards any remaining attributes on the root element.
parsePopulation :: MonadThrow m => Consumer Event m [Person]
parsePopulation = force "population tag missing" $
    tagName "{http://example.com}population" ignoreAttrs $ \() -> parsePeople

main :: IO ()
main = do
        people <- runResourceT $
             parseFile def "people2.xml" $$ parsePopulation
        print people
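
The `Right (age, "")` pattern in `parsePerson` is what rejects ages with trailing junk. As a small sketch of that check in isolation, the helper below uses only `decimal` from `Data.Text.Read`; `parseAge` is a hypothetical name introduced for illustration and is not part of xml-conduit.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text      (Text)
import qualified Data.Text      as T
import           Data.Text.Read (decimal)

-- Hypothetical helper: accept the text only if all of it is an
-- unsigned decimal number. `decimal` returns the parsed value together
-- with the unconsumed remainder, so an empty remainder means that the
-- whole input was numeric.
parseAge :: Text -> Maybe Int
parseAge t = case decimal t of
    Right (age, rest) | T.null rest -> Just age
    _                               -> Nothing

main :: IO ()
main = mapM_ (print . parseAge) ["25", "2", "25 years", "", "-3"]
```

With this definition only the first two inputs yield a value. `decimal` on its own does not accept a leading sign, so a negative age would need `signed decimal` instead.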

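And here is an example using the cursor API. Note that it has different error-handling characteristics, but it should produce the same result for well-formed input.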

{-# LANGUAGE OverloadedStrings #-}
import Text.XML
import Text.XML.Cursor
import Data.Text (Text)
import Data.Text.Read (decimal)
import Data.Monoid (mconcat)

main :: IO ()
main = do
    doc <- Text.XML.readFile def "people2.xml"
    let cursor = fromDocument doc
    print $ cursor $// element "{http://example.com}person" >=> parsePerson

data Person = Person Int Text
        deriving Show

parsePerson :: Cursor -> [Person]
parsePerson c = do
    let name = c $/ element "{http://example.com}firstname" &/ content
        ageText = c $/ element "{http://example.com}age" &/ content
    case decimal $ mconcat ageText of
        Right (age, "") -> [Person age $ mconcat name]
        _ -> []
