XML Parsing: Using SAX


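1. What is SAX?

SAX, short for Simple API for XML, is an event-driven "push" model for processing XML. It is not a W3C standard, but it is a widely accepted API. Unlike a DOM parser, a SAX parser does not build a complete document tree. Instead, it fires a series of events as it reads the document; these events are pushed to event handlers, which in turn give the application access to the document content.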
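
To make the push model concrete, here is a minimal sketch that records the events in the order the parser delivers them. The class name PushModelSketch and the method recordEvents are made up for illustration, and the parser is obtained through the JDK's SAXParserFactory rather than the XMLReaderFactory call used later in this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class PushModelSketch {
    /** Parses the XML text and returns the events in the order the parser pushed them. */
    static List<String> recordEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // DefaultHandler supplies empty implementations; override only what is needed.
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text:" + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        });
        // The application never asks for "the next node"; the parser calls back as it reads.
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : recordEvents("<a><b>hi</b></a>")) {
            System.out.println(event);
        }
    }
}
```

No tree is built at any point: once an event has been handled, the parser keeps nothing of it unless the handler stores it.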

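Event handler types:

  • DTDHandler, for access to DTD content (notation and unparsed entity declarations);
  • ErrorHandler, for low-level access to parsing errors;
  • ContentHandler, for access to document content; this is the most commonly used handler.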
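
As an illustration of the ErrorHandler role, the sketch below collects the problems reported while parsing instead of letting them go to the default handling. ErrorHandlerSketch and collectProblems are hypothetical names; the three callback methods are those of the org.xml.sax.ErrorHandler interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

public class ErrorHandlerSketch {
    /** Returns one entry per problem reported by the parser, tagged with severity and line. */
    static List<String> collectProblems(String xml) throws Exception {
        List<String> problems = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning@" + e.getLineNumber() + ": " + e.getMessage());
            }

            @Override
            public void error(SAXParseException e) {
                problems.add("error@" + e.getLineNumber() + ": " + e.getMessage());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                problems.add("fatal@" + e.getLineNumber() + ": " + e.getMessage());
                throw e; // a fatal error means the document is not well-formed; stop parsing
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Parsing was aborted; the cause has already been recorded above.
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // <b> is never closed, so the parser reports a fatal error.
        System.out.println(collectProblems("<a><b></a>"));
    }
}
```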

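Advantages

  • Provides efficient low-level access to XML document content;
  • Low memory consumption, because the whole document need not be loaded into memory at once;
  • No need to create an object for every node, as DOM does;
  • Suits broadcast scenarios in which several ContentHandlers receive the same events. An XMLReader holds only one ContentHandler at a time, so the registered handler has to forward events to the others.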
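
The broadcast point can be sketched as follows. XMLReader.setContentHandler() keeps a single handler, so one way to feed several consumers from a single parse is a small forwarding handler. FanOutHandler, ElementCounter and TextCollector are hypothetical helpers written for this example, and only three of the ContentHandler callbacks are forwarded to keep it short.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class FanOutSketch {
    /** Forwards element and text events to every target handler, in list order. */
    static class FanOutHandler extends DefaultHandler {
        private final List<ContentHandler> targets;

        FanOutHandler(List<ContentHandler> targets) {
            this.targets = targets;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            for (ContentHandler h : targets) h.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            for (ContentHandler h : targets) h.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            for (ContentHandler h : targets) h.characters(ch, start, length);
        }
    }

    /** First consumer: counts elements. */
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    /** Second consumer: concatenates all character data. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses once and returns "elementCount:allText". */
    static String run(String xml) throws Exception {
        ElementCounter counter = new ElementCounter();
        TextCollector collector = new TextCollector();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(
                new FanOutHandler(Arrays.<ContentHandler>asList(counter, collector)));
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.count + ":" + collector.text;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<a><b>x</b><c>y</c></a>"));
    }
}
```

Each event reaches the consumers one after another within the same callback, so the document is still read only once.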

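Disadvantages

  • Several event handler methods must be implemented to deal with all incoming events;
  • The application code must keep track of the parsing state itself;
  • Random access is not supported.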
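
The state-keeping burden is easiest to see in code. Because an event carries no information about its ancestors, the handler in this sketch maintains its own stack of open elements so that each piece of text can be reported with its path. PathTrackingSketch and textPaths are names invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PathTrackingSketch {
    /** Returns entries of the form "/root/child=text" for every non-blank text chunk. */
    static List<String> textPaths(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Names of the currently open elements, outermost first.
            private final Deque<String> open = new ArrayDeque<>();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                open.addLast(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    result.add("/" + String.join("/", open) + "=" + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textPaths("<a><b>x</b><c>y</c></a>"));
    }
}
```

With DOM the same information would come from walking up parent nodes; with SAX the stack has to be kept by hand.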

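2. SAX classes used

The classes come from the following three packages (the class diagrams that accompanied each package in the original post are not preserved):

org.xml.sax

org.xml.sax.ext

org.xml.sax.helpers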
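
As a small illustration of how the extension package fits in, the sketch below uses org.xml.sax.ext.DefaultHandler2, which also implements LexicalHandler, to receive comment events that a plain ContentHandler never sees. The handler is registered through the standard SAX property http://xml.org/sax/properties/lexical-handler; LexicalHandlerSketch and comments are hypothetical names, and the parser is assumed to support that property (the one bundled with the JDK does).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class LexicalHandlerSketch {
    /** Returns the trimmed text of every comment in the document. */
    static List<String> comments(String xml) throws Exception {
        List<String> found = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                found.add(new String(ch, start, length).trim());
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Lexical events are optional in SAX2 and are enabled through a property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return found;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(comments("<a><!-- note --><b/></a>"));
    }
}
```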

3. Example

The test file, test.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!--  <!DOCTYPE country SYSTEM "country.dtd">  -->
<!DOCTYPE country [
    <!ELEMENT country (provinces?,states?,municipalites?)>
    <!ATTLIST country name CDATA #REQUIRED>
       
    <!ELEMENT provinces (province+)>
    <!ELEMENT province (cities)>
    <!ATTLIST province name CDATA #REQUIRED>
    
    <!ELEMENT cities (city+)>    
    <!ELEMENT city (#PCDATA)> 
    <!ATTLIST city name CDATA #REQUIRED>
]>
<country name="China">
    <provinces>
        <province name="GuangDong">
            <cities>
                <city name="GuangZhou">广州</city>
                <city name="ShenZhen">深圳</city>
                <city name="ZhuHai">珠海</city>
            </cities>
        </province>
        <province name="HuNan">
            <cities>
                <city name="ChangSha">长沙</city>
                <city name="HengYang">衡阳</city>
                <city name="ChangDe">常德</city>
            </cities>
        </province>
    </provinces>
</country>

Define a MyContentHandler class that implements the ContentHandler interface:
public static class MyContentHandler implements ContentHandler {
     private Locator locator;
     private int tentLength = 0; // indentation width for the printed messages, to make the output easier to read
    @Override
    public void characters(char[] ch,int start,int length)
            throws SAXException {
        // TODO Auto-generated method stub
        // Print spaces for indentation (the other methods do the same)
       for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("characters():\""+String.copyValueOf(ch,start,length)+"\"");
        
    }

    @Override
    public void endDocument() throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("endDocument() called");
    }

    @Override
    public void endElement(String uri,String localName,String qName)
            throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("endElement():</"+qName+">");
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("endPrefixMapping():"+prefix);
    }

    @Override
    public void ignorableWhitespace(char[] ch,int start,int length)
            throws SAXException {
        // TODO Auto-generated method stub
        //System.out.println("ignorableWhitespace():"+length);
        
        tentLength = length;
        
    }

    @Override
    public void processingInstruction(String target,String data)
            throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("processingInstruction():<"+target+","+data+">");
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        // TODO Auto-generated method stub
        this.locator = locator;
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("setDocumentLocator():["+locator+"]");
        
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("skippedEntity():"+name);
    }

    @Override
    public void startDocument() throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("startDocument() called");
    }

    @Override
    public void startElement(String uri,String localName,String qName,Attributes atts) throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("startElement():<"+qName+">");
            
    }

    @Override
    public void startPrefixMapping(String prefix,String uri)
            throws SAXException {
        // TODO Auto-generated method stub
        for(int i =0;i<tentLength;i++){
            System.out.print(" ");
        }
        System.out.println("startPrefixMapping():"+prefix);
        
    }
     
 }
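First, to see the order in which these methods are called, each method prints its own name and the arguments it receives. The main function that runs the program is: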
public static void main(String[] args) throws SAXException,IOException{
        File srcFile = new File("./test.xml");
        XMLReader xmlReader = XMLReaderFactory.createXMLReader();
        // Parse without validating against the DTD
        xmlReader.setFeature("http://xml.org/sax/features/validation",false);

        xmlReader.setContentHandler(new MyContentHandler());
        // Pass the file as a URI so that the system identifier resolves reliably
        xmlReader.parse(srcFile.toURI().toString());
    }

Output:
setDocumentLocator():[com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$LocatorProxy@1a2961b]
startDocument() called
startElement():<country>
     startElement():<provinces>
         startElement():<province>
             startElement():<cities>
                 startElement():<city>
                 characters():"广州"
                 endElement():</city>
                 startElement():<city>
                 characters():"深圳"
                 endElement():</city>
                 startElement():<city>
                 characters():"珠海"
                 endElement():</city>
             endElement():</cities>
         endElement():</province>
         startElement():<province>
             startElement():<cities>
                 startElement():<city>
                 characters():"长沙"
                 endElement():</city>
                 startElement():<city>
                 characters():"衡阳"
                 endElement():</city>
                 startElement():<city>
                 characters():"常德"
                 endElement():</city>
             endElement():</cities>
         endElement():</province>
     endElement():</provinces>
 endElement():</country>
 endDocument() called
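
The handler above stores the Locator passed to setDocumentLocator() but never reads it. As a sketch of what it is for, the example below asks the locator for the current line number on each start tag. LocatorSketch and elementLines are hypothetical names, and the code assumes the parser supplies a locator, which SAX recommends but does not require.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorSketch {
    /** Returns "elementName@line" for every start tag in the document. */
    static List<String> elementLines(String xml) throws Exception {
        List<String> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called once, before startDocument(); keep the reference for later events.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // The locator reports the position where the current event ends.
                lines.add(qName + "@" + locator.getLineNumber());
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementLines("<a>\n<b/>\n</a>"));
    }
}
```

The locator is only valid during a callback, which is why the handler queries it inside startElement() rather than after parsing has finished.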

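Next, we parse the XML file into a Country object. First define three classes (City holds a city's English and Chinese names and is used by the handler code that follows):
 public static class City {
     public String enName; // value of the name attribute
     public String chName; // Chinese name, taken from the element text
 }

 public static class Province {
     public String name;
     public ArrayList<City> cities;

     public Province() {
         name = "";
         cities = new ArrayList<City>(5);
     }
 }

 public static class Country {
     public String name;
     public ArrayList<Province> provinces;

     public Country() {
         name = "";
         provinces = new ArrayList<Province>();
     }
 }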
Declare the following member variables in MyContentHandler:
private Country country;
private Province curProvince;
private City curCity;
private boolean isInCityElement = false; // true while inside a city element; used to pick up the city's Chinese name

Because the structure of this XML file is simple, only four methods need to change: startElement(), characters(), endElement() and endDocument():
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Inside a city element, the text is the city's Chinese name.
        // The parser may deliver the text in several chunks, so append instead of overwriting.
        if (isInCityElement && curCity != null) {
            String chunk = new String(ch, start, length);
            curCity.chName = (curCity.chName == null) ? chunk : curCity.chName + chunk;
        }
    }

    @Override
    public void endDocument() throws SAXException {
        // Print the parsed Country object to check that parsing worked
        System.out.println("country:" + country.name);
        int size = country.provinces.size();
        for (int i = 0; i < size; i++) {
            Province prc = country.provinces.get(i);
            System.out.println("  |--" + prc.name);
            for (City city : prc.cities) {
                System.out.println("  |    |--" + city.enName + "(" + city.chName + ")");
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (localName.equalsIgnoreCase("province")) {
            if (curProvince != null) {
                country.provinces.add(curProvince);
                curProvince = null;
            }
        } else if (localName.equalsIgnoreCase("city")) {
            if (curProvince != null && curCity != null) {
                curProvince.cities.add(curCity);
                curCity = null;
            }
            isInCityElement = false; // no longer inside a city element
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // localName is filled in because namespace processing is on by default for this XMLReader
        if (localName.equalsIgnoreCase("country")) {
            country = new Country();
            country.name = atts.getValue("name");
        } else if (localName.equalsIgnoreCase("province")) {
            curProvince = new Province();
            curProvince.name = atts.getValue("name");
        } else if (localName.equalsIgnoreCase("city")) {
            isInCityElement = true; // now inside a city element
            curCity = new City();
            curCity.enName = atts.getValue("name");
        }
    }
 
Output:

country:China
  |--GuangDong
  |    |--GuangZhou(广州)
  |    |--ShenZhen(深圳)
  |    |--ZhuHai(珠海)
  |--HuNan
  |    |--ChangSha(长沙)
  |    |--HengYang(衡阳)
  |    |--ChangDe(常德)
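
One detail worth a separate sketch: SAX allows the parser to deliver the text of one element through several characters() calls, for example around an entity reference, so text is best accumulated in a buffer and read in endElement(). The example below does this for city elements and obtains the parser from SAXParserFactory, since XMLReaderFactory is deprecated in recent JDKs. CityTextSketch and cityNames are hypothetical names, and the input document is made up for the demonstration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CityTextSketch {
    /** Maps each city's name attribute to the full text content of its element. */
    static Map<String, String> cityNames(String xml) throws Exception {
        Map<String, String> names = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inCity;
            private String nameAttr;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("city".equals(qName)) {
                    inCity = true;
                    nameAttr = atts.getValue("name");
                    text.setLength(0); // start a fresh buffer for this element
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inCity) {
                    text.append(ch, start, length); // may be called more than once per element
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("city".equals(qName)) {
                    names.put(nameAttr, text.toString().trim());
                    inCity = false;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The entity reference is a typical place for the text to arrive in pieces.
        System.out.println(cityNames("<cities><city name=\"Demo\">A &amp; B</city></cities>"));
    }
}
```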
