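[XML Document Parsing] Creating, parsing, querying, and modifying XML files with libxml2

1. Creating an XML document:

We use xmlNewDoc() to create the XML document, then use functions such as xmlNewNode(), xmlNewChild(), xmlNewProp(), and xmlNewText() to add nodes and child nodes and to set elements and attributes. Once the tree is built, xmlSaveFormatFileEnc() saves the XML file to disk; this function also lets you choose the encoding used when writing the file.

Example 1: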

    #include <stdio.h>
    #include <libxml/parser.h>
    #include <libxml/tree.h>

    int main(int argc, char **argv)
    {
        xmlDocPtr doc = NULL;                                   /* document pointer */
        xmlNodePtr root_node = NULL, node = NULL, node1 = NULL; /* node pointers */

        /* Create a new document and a node, and set the node as the root. */
        doc = xmlNewDoc(BAD_CAST "1.0");
        root_node = xmlNewNode(NULL, BAD_CAST "root");
        xmlDocSetRootElement(doc, root_node);

        /* Create a new node attached as a child of root_node. */
        xmlNewChild(root_node, NULL, BAD_CAST "node1", BAD_CAST "content of node1");

        /* xmlNewProp() creates an attribute attached to a node.
           Note the NULL namespace argument to xmlNewChild(). */
        node = xmlNewChild(root_node, NULL, BAD_CAST "node3", BAD_CAST "node has attributes");
        xmlNewProp(node, BAD_CAST "attribute", BAD_CAST "yes");

        /* Another way to create nodes: build them first, then link them in. */
        node = xmlNewNode(NULL, BAD_CAST "node4");
        node1 = xmlNewText(BAD_CAST "other way to create content");
        xmlAddChild(node, node1);
        xmlAddChild(root_node, node);

        /* Dump the document to stdout ("-") or to the file named in argv[1]. */
        xmlSaveFormatFileEnc(argc > 1 ? argv[1] : "-", doc, "UTF-8", 1);

        /* Free the document. */
        xmlFreeDoc(doc);
        xmlCleanupParser();
        xmlMemoryDump(); /* debug memory for regression tests */
        return 0;
    }

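2. Parsing an XML document

Parsing a document takes only a file name and a single function call, with error checking. Commonly used functions are xmlParseFile() for a file and xmlParseDoc() for an in-memory string; the example below uses xmlReadFile(). Once you have the document pointer, call xmlDocGetRootElement() to get a pointer to the root element node, and use that pointer to walk the DOM tree. When you are done, call xmlFreeDoc() to release the document.

Example 2: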

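    /* Fragment: url and MY_ENCODING are defined elsewhere by the caller. */
    xmlDocPtr doc;  /* pointer to the parsed document */
    xmlNodePtr cur; /* node pointer, needed to move between nodes */
    xmlChar *key;

    /* Parse the file. XML_PARSE_NOBLANKS is the named form of the
       original option value 256. */
    doc = xmlReadFile(url, MY_ENCODING, XML_PARSE_NOBLANKS);

    /* Check whether the document was parsed successfully. If not, libxml
       reports a registered error and stops. A common error is an improper
       encoding. The XML standard allows documents to be saved in encodings
       other than UTF-8 or UTF-16; if the document uses one, libxml converts
       it to UTF-8 for you automatically. More information on XML encodings
       can be found in the XML standard. */
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return;
    }

    cur = xmlDocGetRootElement(doc); /* get the document's root element */

    /* Check that the document actually contains something. */
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }

    /* In this example we need to confirm that the document is of the right
       type. "root" is the root element type of the document used here. */
    if (xmlStrcmp(cur->name, (const xmlChar *) "root")) {
        fprintf(stderr, "document of the wrong type, root node != root\n");
        xmlFreeDoc(doc);
        return;
    }

    /* Walk the children of the root and print every <keyword> element. */
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if (!xmlStrcmp(cur->name, (const xmlChar *) "keyword")) {
            key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
            printf("keyword: %s\n", key);
            xmlFree(key);
        }
        cur = cur->next;
    }

    xmlFreeDoc(doc);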

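3. Finding XML nodes

Often we only care about the value or attributes of one or a few specific elements in an XML document. Walking the whole DOM tree to reach them is painful and tedious; with XPath you can get the elements you want very conveniently. Below is a custom helper function:

Example 3: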

    #include <libxml/xpath.h>

    xmlXPathObjectPtr get_nodeset(xmlDocPtr doc, const xmlChar *xpath)
    {
        xmlXPathContextPtr context;
        xmlXPathObjectPtr result;

        context = xmlXPathNewContext(doc);
        if (context == NULL) {
            printf("context is NULL\n");
            return NULL;
        }

        result = xmlXPathEvalExpression(xpath, context);
        xmlXPathFreeContext(context);
        if (result == NULL) {
            printf("xmlXPathEvalExpression return NULL\n");
            return NULL;
        }

        if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {
            xmlXPathFreeObject(result);
            printf("nodeset is empty\n");
            return NULL;
        }

        return result;
    }

This function queries the XML document that doc points to for nodes matching the XPath expression and returns the set of matching nodes. For how to write XPath expressions, see XPath reference material. Once the query has returned a result set, the nodes can be accessed through the returned xmlXPathObjectPtr structure:

Example 4:

    /* Fragment: doc and cur come from the parsing code in Example 2.
       The predicate follows the step name directly, with no "/" before "[". */
    const xmlChar *xpath = BAD_CAST "/root/node[@key='keyword']";
    xmlXPathObjectPtr app_result = get_nodeset(doc, xpath);
    if (app_result == NULL) {
        printf("app_result is NULL\n");
        return;
    }

    int i = 0;
    xmlChar *value;
    xmlNodeSetPtr nodeset = app_result->nodesetval;

    for (i = 0; i < nodeset->nodeNr; i++) {
        cur = nodeset->nodeTab[i];
        cur = cur->xmlChildrenNode;
        while (cur != NULL) {
            value = xmlGetProp(cur, (const xmlChar *)"key");
            if (value != NULL) {
                printf("value: %s\n\n", d_ConvertCharset("utf-8", "GBK", (char *)value));
                xmlFree(value);
            }
            value = xmlNodeGetContent(cur);
            if (value != NULL) {
                printf("value: %s\n\n", (char *)value);
                xmlFree(value);
            }
            cur = cur->next; /* advance, otherwise the loop never ends */
        }
    }
    xmlXPathFreeObject(app_result);

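Using the result set returned by get_nodeset(), we can read a node's element content and attributes, and we can also modify the node's value. When printing the value, the example calls d_ConvertCharset(), a helper that is not part of libxml2, to convert the encoding to GBK so that any Chinese characters are displayed correctly.

4. Modifying XML elements, attributes, and other information

To modify elements, attributes, and other information in an XML document, first parse the document and obtain a node pointer (xmlNodePtr node). By walking the DOM tree with that pointer, you can read, modify, and add information in the document.

Example 6: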

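Get a node's content:

    xmlChar *value = xmlNodeGetContent(node);

The returned value must be freed with xmlFree(value).

Get the value of one of a node's attributes:

    xmlChar *value = xmlGetProp(node, (const xmlChar *)"prop1");

The returned value must be freed with xmlFree(value).

Set a node's content:

    xmlNodeSetContent(node, (const xmlChar *)"test");

Set the value of one of a node's attributes:

    xmlSetProp(node, (const xmlChar *)"prop1", (const xmlChar *)"v1");

Add a child element to a node (the second argument is the namespace, NULL for none):

    xmlNewTextChild(node, NULL, (const xmlChar *)"keyword", (const xmlChar *)"test Element");

Add an attribute to a node:

    xmlNewProp(node,(const xmlChar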

Reference: http://hi.baidu.com/valefeng/item/22199856fe25ac3694eb051d