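1. What is SAX?
SAX, short for Simple API for XML, is an event-driven "push" model for processing XML. Although it is not a W3C standard, it is a widely accepted API. Unlike DOM, a SAX parser does not build a complete document tree. Instead, it fires a series of events as it reads the document; these events are pushed to event handlers, which in turn provide access to the document content.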
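To make the "push" model concrete, here is a minimal sketch that records the events in the order the parser delivers them. The class name PushModelDemo and the sample XML string are illustrative assumptions, not part of the original article; the parser is obtained through the JAXP SAXParserFactory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: logs element events as the parser pushes them.
final class PushModelDemo {

    /** Parses the XML text and returns the element events in delivery order. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            // The parser drives the loop; our code only reacts to callbacks.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<a><b/><c>text</c></a>"));
    }
}
```

No tree is ever built here: once an event has been handled, the parser keeps nothing about it unless the handler stores it.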
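Event handler types:
Advantages
- Provides efficient, low-level access to the content of an XML document;
- Low memory consumption, because the whole document does not need to be loaded into memory at once;
- No need to create an object for every node, as DOM does;
- Can be used in broadcast settings, where several ContentHandlers receive the same events (an XMLReader holds only one ContentHandler at a time, so the events are fanned out through a forwarding handler).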
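The broadcast point can be sketched with a small forwarding ("tee") handler. This is one possible approach under stated assumptions: the names BroadcastDemo, TeeHandler, ElementCounter, and TextCollector are hypothetical, only three event types are forwarded, and the targets are called one after another on the parsing thread rather than concurrently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class BroadcastDemo {

    /** Forwards a few core events to every target, in registration order. */
    static final class TeeHandler extends DefaultHandler {
        private final List<ContentHandler> targets;

        TeeHandler(ContentHandler... targets) {
            this.targets = Arrays.asList(targets);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            for (ContentHandler t : targets) t.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            for (ContentHandler t : targets) t.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            for (ContentHandler t : targets) t.characters(ch, start, length);
        }
    }

    /** Counts start tags. */
    static final class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    /** Concatenates all character data. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Runs both consumers over one parse and returns "elementCount:text". */
    static String run(String xml) {
        ElementCounter counter = new ElementCounter();
        TextCollector collector = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new TeeHandler(counter, collector));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e); // wrapped only to keep the sketch short
        }
        return counter.count + ":" + collector.text;
    }

    public static void main(String[] args) {
        System.out.println(run("<a><b>x</b><b>y</b></a>"));
    }
}
```

Both consumers see the document in a single pass, which is the practical benefit of the broadcast setup.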
Disadvantages
2. SAX classes used:
org.xml.sax:
As shown in the figure below:
org.xml.sax.ext:
org.xml.sax.helpers:
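As a rough illustration of how the extension package relates to the core one, the sketch below uses DefaultHandler2 from org.xml.sax.ext to receive comment events, which the core ContentHandler does not report. The class name CommentCollector is hypothetical, and the sketch assumes the parser supports the standard lexical-handler property (the JDK's built-in parser does).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class: collects the text of XML comments.
final class CommentCollector extends DefaultHandler2 {
    final List<String> comments = new ArrayList<>();

    // LexicalHandler callback, implemented as a no-op by DefaultHandler2.
    @Override
    public void comment(char[] ch, int start, int length) {
        comments.add(new String(ch, start, length).trim());
    }

    /** Parses the XML text and returns the comments found, in document order. */
    static List<String> collect(String xml) {
        CommentCollector handler = new CommentCollector();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Extension handlers are registered through properties, not setters.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e); // wrapped only to keep the sketch short
        }
        return handler.comments;
    }

    public static void main(String[] args) {
        System.out.println(collect("<a><!-- hi --><b/></a>"));
    }
}
```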
3. Example
The test.xml file used for testing:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- <!DOCTYPE country SYSTEM "country.dtd"> -->
<!DOCTYPE country [
  <!ELEMENT country (provinces?,states?,municipalites?)>
  <!ATTLIST country name CDATA #REQUIRED>
  <!ELEMENT provinces (province+)>
  <!ELEMENT province (cities)>
  <!ATTLIST province name CDATA #REQUIRED>
  <!ELEMENT cities (city+)>
  <!ELEMENT city (#PCDATA)>
  <!ATTLIST city name CDATA #REQUIRED>
]>
<country name="China">
  <provinces>
    <province name="GuangDong">
      <cities>
        <city name="GuangZhou">广州</city>
        <city name="ShenZhen">深圳</city>
        <city name="ZhuHai">珠海</city>
      </cities>
    </province>
    <province name="HuNan">
      <cities>
        <city name="ChangSha">长沙</city>
        <city name="HengYang">衡阳</city>
        <city name="ChangDe">常德</city>
      </cities>
    </province>
  </provinces>
</country>
Define the MyContentHandler class, which implements the ContentHandler interface:
// Imports needed at the top of the enclosing source file:
// org.xml.sax.Attributes, org.xml.sax.ContentHandler, org.xml.sax.Locator, org.xml.sax.SAXException
public static class MyContentHandler implements ContentHandler {
    private Locator locator;
    // Indentation width for the printed messages, to make the output easier to read
    private int tentLength = 0;

    // Prints the current indentation before each message
    private void printIndent() {
        for (int i = 0; i < tentLength; i++) {
            System.out.print(" ");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        printIndent();
        System.out.println("characters():\"" + String.copyValueOf(ch, start, length) + "\"");
    }

    @Override
    public void endDocument() throws SAXException {
        printIndent();
        System.out.println("endDocument() called");
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        printIndent();
        System.out.println("endElement():</" + qName + ">");
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        printIndent();
        System.out.println("endPrefixMapping():" + prefix);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        // The length of the whitespace run becomes the indentation of the next message
        tentLength = length;
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        printIndent();
        System.out.println("processingInstruction():<" + target + "," + data + ">");
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
        printIndent();
        System.out.println("setDocumentLocator():[" + locator + "]");
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        printIndent();
        System.out.println("skippedEntity():" + name);
    }

    @Override
    public void startDocument() throws SAXException {
        printIndent();
        System.out.println("startDocument() called");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        printIndent();
        System.out.println("startElement():<" + qName + ">");
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        printIndent();
        System.out.println("startPrefixMapping():" + prefix);
    }
}
First, to see the order in which these methods are called, each method prints its name and the arguments it receives. The program's main function is:
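// Also needs: java.io.IOException, org.xml.sax.XMLReader, org.xml.sax.helpers.XMLReaderFactory
public static void main(String[] args) throws SAXException, IOException {
    // XMLReaderFactory is deprecated since Java 9; SAXParserFactory is the recommended replacement
    XMLReader xmlReader = XMLReaderFactory.createXMLReader();
    // Turn DTD validation off
    xmlReader.setFeature("http://xml.org/sax/features/validation", false);
    xmlReader.setContentHandler(new MyContentHandler());
    xmlReader.parse("./test.xml");
}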
Output:
setDocumentLocator():[com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$LocatorProxy@1a2961b]
startDocument() called
startElement():<country>
startElement():<provinces>
startElement():<province>
startElement():<cities>
startElement():<city>
characters():"广州"
endElement():</city>
startElement():<city>
characters():"深圳"
endElement():</city>
startElement():<city>
characters():"珠海"
endElement():</city>
endElement():</cities>
endElement():</province>
startElement():<province>
startElement():<cities>
startElement():<city>
characters():"长沙"
endElement():</city>
startElement():<city>
characters():"衡阳"
endElement():</city>
startElement():<city>
characters():"常德"
endElement():</city>
endElement():</cities>
endElement():</province>
endElement():</provinces>
endElement():</country>
endDocument() called
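The first line of the output shows that the parser hands over a Locator before any other event, although printing the object itself is not very informative. A minimal sketch of what the Locator can be used for follows; the class name LocatorDemo is hypothetical, and the sketch assumes the parser supplies a Locator at all (SAX does not require it, so the code falls back to -1).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the line on which each start tag ends.
final class LocatorDemo extends DefaultHandler {
    private Locator locator;
    final List<String> positions = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        // Called once, before startDocument(); keep the reference for later events.
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The locator is only meaningful while a callback is running.
        int line = (locator != null) ? locator.getLineNumber() : -1;
        positions.add(qName + "@" + line);
    }

    /** Parses the XML text and returns entries of the form "element@line". */
    static List<String> locate(String xml) {
        LocatorDemo handler = new LocatorDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e); // wrapped only to keep the sketch short
        }
        return handler.positions;
    }

    public static void main(String[] args) {
        System.out.println(locate("<a>\n<b/>\n</a>"));
    }
}
```

Line information of this kind is mostly useful for error messages that point back into the source document.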
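Next, we parse the XML file into a Country object. First define three classes (City holds both the English and the Chinese name of a city, which the handler fills in):
public static class City {
    public String enName; // English name, taken from the name attribute
    public String chName; // Chinese name, taken from the element text
    public City() {
        enName = "";
        chName = "";
    }
}

public static class Province {
    public String name;
    public ArrayList<City> cities;
    public Province() {
        name = "";
        cities = new ArrayList<City>(5);
    }
}

public static class Country {
    public String name;
    public ArrayList<Province> provinces;
    public Country() {
        name = "";
        provinces = new ArrayList<Province>();
    }
}
Declare the following member variables in MyContentHandler: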
private Country country;
private Province curProvince;
private City curCity;
// True while the current event is inside a city element; used to capture the city's Chinese name
private boolean isInCityElement = false;
Because the XML file has a simple structure, we only need to modify four member methods: startElement(), characters(), endElement(), and endDocument(), as follows:
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
    // Inside a city element: capture the city's Chinese name.
    // For simplicity this assumes the text arrives in one call; a parser may split it into several.
    if (isInCityElement && curCity != null) {
        curCity.chName = String.copyValueOf(ch, start, length);
    }
}

@Override
public void endDocument() throws SAXException {
    // Print the parsed Country object to verify that parsing was correct
    System.out.println("country:" + country.name);
    int size = country.provinces.size();
    for (int i = 0; i < size; i++) {
        Province prc = country.provinces.get(i);
        System.out.println(" |--" + prc.name);
        for (City city : prc.cities) {
            System.out.println(" | |--" + city.enName + "(" + city.chName + ")");
        }
    }
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
    if (localName.equalsIgnoreCase("province")) {
        if (curProvince != null) {
            country.provinces.add(curProvince);
            curProvince = null;
        }
    } else if (localName.equalsIgnoreCase("city")) {
        if (curProvince != null && curCity != null) {
            curProvince.cities.add(curCity);
            curCity = null;
        }
        isInCityElement = false; // we have left the city element
    }
}

@Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
    if (localName.equalsIgnoreCase("country")) {
        country = new Country();
        country.name = atts.getValue("name");
    } else if (localName.equalsIgnoreCase("province")) {
        curProvince = new Province();
        curProvince.name = atts.getValue("name");
    } else if (localName.equalsIgnoreCase("city")) {
        isInCityElement = true; // we have entered a city element
        curCity = new City();
        curCity.enName = atts.getValue("name");
    }
}
Output:
country:China
 |--GuangDong
 | |--GuangZhou(广州)
 | |--ShenZhen(深圳)
 | |--ZhuHai(珠海)
 |--HuNan
 | |--ChangSha(长沙)
 | |--HengYang(衡阳)
 | |--ChangDe(常德)
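One caveat about collecting element text: SAX allows a parser to deliver the text of one element in several characters() calls, for example around an entity reference. A common way to cope, sketched here under stated assumptions, is to buffer the text and read it in endElement(). The class name CityTextDemo, the map-based result, and the sample input are hypothetical and simpler than the Country model above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: maps each city's name attribute to its full text.
final class CityTextDemo extends DefaultHandler {
    final Map<String, String> cities = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String currentName; // non-null only while inside a <city> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("city".equals(qName)) {
            currentName = atts.getValue("name");
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (currentName != null) {
            text.append(ch, start, length); // may be called several times per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("city".equals(qName)) {
            cities.put(currentName, text.toString()); // the text is complete only now
            currentName = null;
        }
    }

    /** Parses the XML text and returns name -> text for every city element. */
    static Map<String, String> parse(String xml) {
        CityTextDemo handler = new CityTextDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e); // wrapped only to keep the sketch short
        }
        return handler.cities;
    }

    public static void main(String[] args) {
        System.out.println(parse("<cities><city name=\"a\">A&amp;B</city></cities>"));
    }
}
```

The factory used here is not namespace-aware by default, which is why the sketch compares qName rather than localName.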