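1. Overview

XML stands for Extensible Markup Language.
It is a general-purpose data interchange format,
and because it is independent of platform, programming language, and operating system,
it makes data integration and exchange much easier.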

This article covers XML parsing in Java only.
Four parsing approaches are discussed:

1. DOM
2. SAX
3. JDOM
4. DOM4J

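DOM and SAX are the basic approaches.
Both ship with the JDK and are not tied to the Java platform.
JDOM and DOM4J are extensions:
they require additional jar files,
build on top of the basic approaches,
and were designed specifically for Java.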

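The DOM model is also language-neutral;
it is available in JavaScript, for example.
The model of an XML document is the same in every language,
and parsing works the same way in each environment;
only the syntax differs.
Once you understand DOM,
you can work with a very similar API in another language
at little additional learning cost.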

2. Preparing the Data

The XML file to be parsed is shown below.
It is named books.xml
and lives in the project's src/main/resources directory:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
    <book id="1">
        <name>Harry Potter</name>
        <author>J. K. Rowling</author>
        <year>1997</year>
        <price>78</price>
    </book>
    <book id="2">
        <name>Andersen's Fairy Tales</name>
        <author>Andersen</author>
        <year>1875</year>
        <price>166</price>
    </book>
    <book id="3">
        <name>The Little Prince</name>
        <author>Antoine de Saint-Exupery</author>
        <year>1943</year>
        <price>19</price>
    </book>
</bookstore>

The Book.java class below corresponds to the XML file
and holds the data parsed from it:

// Note: this package name is the same as the JDK's own org.w3c.dom package.
// It compiles on Java 8; on Java 9 and later, put the class in a package of your own.
package org.w3c.dom;

public class Book {

    private int id;
    private String name;
    private String author;
    private int year;
    private double price;

    /** @return the id */
    public int getId() {
        return id;
    }

    /** @param id the id to set */
    public void setId(int id) {
        this.id = id;
    }

    /** @return the name */
    public String getName() {
        return name;
    }

    /** @param name the name to set */
    public void setName(String name) {
        this.name = name;
    }

    /** @return the author */
    public String getAuthor() {
        return author;
    }

    /** @param author the author to set */
    public void setAuthor(String author) {
        this.author = author;
    }

    /** @return the year */
    public int getYear() {
        return year;
    }

    /** @param year the year to set */
    public void setYear(int year) {
        this.year = year;
    }

    /** @return the price */
    public double getPrice() {
        return price;
    }

    /** @param price the price to set */
    public void setPrice(double price) {
        this.price = price;
    }

    @Override
    public String toString() {
        return "Book [id=" + id + ", name=" + name + ", author=" + author
                + ", year=" + year + ", price=" + price + "]";
    }
}

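3. DOM

DOM stands for Document Object Model.
A DOM-based XML parser
loads the entire XML file into memory
and assembles it into a DOM tree.
The application then works with the XML data
through this object model.
Through the DOM interfaces,
an application can access any part of the XML document at any time,
which is why this mechanism is also called random access.
Because XML is inherently hierarchical,
DOM's tree model is a natural
and effective way to access the information in a document.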

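Advantages:

1. The tree structure makes the object model easy to understand and the code easy to write.
2. After parsing, the tree stays in memory, so it is flexible to work with and easy to modify.
3. The parent, children, and siblings of the current node are easy to locate.

Disadvantages:

1. The file is read in one pass and held in full, so memory consumption is high.
2. Large XML files slow parsing down and may cause an out-of-memory error, so DOM is not suited to large documents.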
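
The random-access and in-place modification points above can be illustrated with a minimal sketch. It is not part of the article's code: the class name DomNavigationSketch, its helper methods, and the two-book sample string are hypothetical, and the sample is written without whitespace between tags so that getNextSibling() lands on an element rather than a whitespace text node.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomNavigationSketch {

    // Hypothetical sample data, deliberately free of whitespace between tags.
    static final String XML =
            "<bookstore>"
            + "<book id=\"1\"><name>Harry Potter</name><price>78</price></book>"
            + "<book id=\"2\"><name>The Little Prince</name><price>19</price></book>"
            + "</bookstore>";

    // Parses an in-memory XML string into a DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Random access: jump to the index-th <name>, then walk to its parent and sibling.
    static String describe(Document doc, int index) {
        Element name = (Element) doc.getElementsByTagName("name").item(index);
        Element book = (Element) name.getParentNode();   // parent node
        Node price = name.getNextSibling();              // sibling node (<price>)
        return book.getAttribute("id") + ": " + name.getTextContent()
                + " costs " + price.getTextContent();
    }

    // The tree stays in memory after parsing, so it can be changed in place.
    static void setPrice(Document doc, int index, String newPrice) {
        doc.getElementsByTagName("price").item(index).setTextContent(newPrice);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(describe(doc, 1)); // 2: The Little Prince costs 19
        setPrice(doc, 1, "25");
        System.out.println(describe(doc, 1)); // 2: The Little Prince costs 25
    }
}
```

The same navigation on an indented file would need to skip whitespace text nodes between elements, which is why the sample here is kept on one line.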

Parsing code, DomParseXML.java:

package org.w3c.dom;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.SAXException;

/**
 * Parses the XML file with DOM, using only the class library provided by the JDK.
 */
public class DomParseXML {

    public static void main(String[] args) throws Exception {
        String fileName = "src/main/resources/books.xml";
        Document document = getDocument(fileName);
        List<Book> books = getBooks(document);
        for (Book book : books) {
            System.out.println(book);
        }
    }

    public static Document getDocument(String fileName)
            throws ParserConfigurationException, SAXException, IOException {
        // The factory can be used to set many XML parsing parameters
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parse the content at the given URI as an XML document and return the Document
        return builder.parse(fileName);
    }

    public static List<Book> getBooks(Document document) throws Exception {
        // All nodes with the tag "book", in document order
        NodeList bookNodes = document.getElementsByTagName("book");
        // Holds the parsed Book objects
        List<Book> books = new ArrayList<Book>();
        for (int i = 0; i < bookNodes.getLength(); i++) {
            // The i-th book node
            Node node = bookNodes.item(i);
            // All attributes of the i-th book
            NamedNodeMap namedNodeMap = node.getAttributes();
            // Value of the attribute known to be named "id"
            String id = namedNodeMap.getNamedItem("id").getTextContent();
            Book book = new Book();
            book.setId(Integer.parseInt(id));
            // Child nodes of the book node; in an indented file these also
            // include whitespace-only text nodes between the elements
            NodeList childNodes = node.getChildNodes();
            // Text of each child element of book, in document order
            List<String> contents = new ArrayList<>();
            // Keep element children only instead of assuming they sit at odd
            // indices, so the code works whether or not the file is indented
            for (int j = 0; j < childNodes.getLength(); j++) {
                Node childNode = childNodes.item(j);
                if (childNode.getNodeType() == Node.ELEMENT_NODE) {
                    contents.add(childNode.getTextContent());
                }
            }
            // Copy the values into the object in order: name, author, year, price
            book.setName(contents.get(0));
            book.setAuthor(contents.get(1));
            book.setYear(Integer.parseInt(contents.get(2)));
            book.setPrice(Double.parseDouble(contents.get(3)));
            // Add the parsed book to the result list
            books.add(book);
        }
        return books;
    }
}
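
When the XML file is indented, the child list of a book element contains whitespace text nodes as well as element nodes, which is easy to trip over. The following is a minimal sketch of that behavior, assuming the default (non-validating) JDK parser; the class name DomWhitespaceSketch, its helper methods, and the sample string are hypothetical and not part of the article's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomWhitespaceSketch {

    // Hypothetical indented sample: line breaks and spaces sit between the tags.
    static final String XML =
            "<book id=\"1\">\n  <name>Harry Potter</name>\n  <year>1997</year>\n</book>";

    // Parses the string and returns the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Counts every child node, whitespace text nodes included.
    static int countAllChildren(Element element) {
        return element.getChildNodes().getLength();
    }

    // Returns the names of element children only, skipping text nodes.
    static List<String> childElementNames(Element element) {
        List<String> names = new ArrayList<>();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element book = parseRoot(XML);
        System.out.println("all child nodes: " + countAllChildren(book));
        System.out.println("element children: " + childElementNames(book));
    }
}
```

Filtering on Node.ELEMENT_NODE keeps the result the same whether the file is indented or written on a single line, whereas index-based access depends on the exact layout of the file.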

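4. SAX

SAX stands for Simple API for XML.
Unlike DOM,
SAX gives sequential access to the document,
which makes it a fast way to read XML data.
As a SAX parser works through an XML document,
it fires a series of events
and invokes the matching event handler methods.
The application reads the document through these handlers,
which is why the SAX mechanism is also called event-driven.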

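Advantages:

1. The event-driven model keeps memory consumption low.
2. It fits cases where the application only needs to read the data in the XML file.

Disadvantages:

1. The code is more cumbersome: a custom event handler class has to be written and used together with the SAX parser.
2. It is hard to access several different parts of the XML file at the same time.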
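
The event-driven, sequential behavior described above can be made visible by recording the callbacks in the order the parser fires them. This is a minimal sketch rather than the article's handler: the class name SaxEventTraceSketch, the trace method, and the sample string are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTraceSketch {

    // Hypothetical one-book sample.
    static final String XML = "<book id=\"1\"><name>Harry Potter</name></book>";

    // Parses the string and returns the events in the order they were fired.
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Record non-blank text only; the parser may deliver text in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace(XML)) {
            System.out.println(event);
        }
    }
}
```

The trace begins with startDocument, nests the start and end events in document order, and finishes with endDocument. The handler never holds a tree, so anything the application needs later has to be saved by the handler as the events arrive.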

Parsing code, SAXParseHandler.java:

package javax.xml.parsers;

import java.util.ArrayList;
import java.util.List;

import org.w3c.dom.Book;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/**
 * The handler needed when parsing the XML file with SAX.
 */
public class SAXParseHandler extends DefaultHandler {

    // The books parsed so far
    private List<Book> books;
    // The book currently being parsed
    private Book book;
    // Text of the current element; the parser may deliver it in several
    // chunks, so it is accumulated rather than overwritten
    private final StringBuilder content = new StringBuilder();

    /**
     * Called when parsing of the XML file starts.
     */
    @Override
    public void startDocument() throws SAXException {
        System.out.println("Started parsing the XML file");
        books = new ArrayList<Book>();
    }

    /**
     * Called when parsing of the XML file is complete.
     */
    @Override
    public void endDocument() throws SAXException {
        System.out.println("Finished parsing the XML file");
    }

    /**
     * Called when the parser reaches the start of an element.
     */
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        // Discard text collected before this element started
        content.setLength(0);
        // For a book element, read its id attribute
        if (qName.equals("book")) {
            book = new Book();
            String id = attributes.getValue("id");
            book.setId(Integer.parseInt(id));
        }
    }

    /**
     * Called when the parser reaches the end of an element.
     *
     * @param qName the element name
     */
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String text = content.toString();
        if (qName.equals("name")) {
            book.setName(text);
        } else if (qName.equals("author")) {
            book.setAuthor(text);
        } else if (qName.equals("year")) {
            book.setYear(Integer.parseInt(text));
        } else if (qName.equals("price")) {
            book.setPrice(Double.parseDouble(text));
        } else if (qName.equals("book")) {
            // The current book is complete: add it to the list and clear the
            // reference so the next book starts fresh
            books.add(book);
            book = null;
        }
    }

    /**
     * Receives the text content of elements.
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Collect the text so that endElement can assign it to the
        // matching Book field based on the element name
        content.append(ch, start, length);
    }

    public List<Book> getBooks() {
        return books;
    }
}

Parsing code, SaxParseXML.java:

package javax.xml.parsers;

import java.util.List;

import org.w3c.dom.Book;

/**
 * Parses the XML file with SAX, using only the class library provided by the JDK.
 */
public class SaxParseXML {

    public static void main(String[] args) throws Exception {
        String fileName = "src/main/resources/books.xml";
        List<Book> books = getBooks(fileName);
        for (Book book : books) {
            System.out.println(book);
        }
    }

    public static List<Book> getBooks(String fileName) throws Exception {
        // SAXParserFactory can be used to set XML parsing parameters
        SAXParserFactory sParserFactory = SAXParserFactory.newInstance();
        SAXParser parser = sParserFactory.newSAXParser();
        SAXParseHandler handler = new SAXParseHandler();
        // SAXParser works together with SAXParseHandler while parsing the file
        parser.parse(fileName, handler);
        // Fetch the result from the custom handler
        return handler.getBooks();
    }
}

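5. JDOM

JDOM is added as a third-party jar.
It simplifies working with XML and is easy to use.
Usage resembles the DOM approach above,
and JDOM is often described as faster than the plain DOM API.

Characteristics:

1. It uses concrete classes only, not interfaces.
2. Its API makes heavy use of the Java Collections classes.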

Adding the jar (Maven dependency):

<dependency>
    <groupId>org.jdom</groupId>
    <artifactId>jdom2</artifactId>
    <version>2.0.6</version>
</dependency>

Parsing code, JdomParseXML.java:

package org.jdom2.input;

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.jdom2.Attribute;
import org.jdom2.Document;
import org.jdom2.Element;
import org.w3c.dom.Book;

/**
 * Parses the XML file with JDOM.
 */
public class JdomParseXML {

    public static void main(String[] args) throws Exception {
        String fileName = "src/main/resources/books.xml";
        List<Book> books = getBooks(fileName);
        for (Book book : books) {
            System.out.println(book);
        }
    }

    public static List<Book> getBooks(String fileName) throws Exception {
        SAXBuilder saxBuilder = new SAXBuilder();
        Document document;
        // Close the stream once the document has been built
        try (InputStream in = new FileInputStream(fileName)) {
            document = saxBuilder.build(in);
        }
        // The root element, bookstore
        Element bookstore = document.getRootElement();
        // The child elements of the root, returned as a list
        List<Element> bookElements = bookstore.getChildren();
        List<Book> books = new ArrayList<Book>();
        for (Element bookElement : bookElements) {
            // Read the id attribute of the book
            Book book = getBookByElement(bookElement);
            // Read the name and the other child elements of the book
            generateBookDetail(bookElement, book);
            books.add(book);
        }
        return books;
    }

    public static Book getBookByElement(Element bookElement) {
        Book book = new Book();
        // Walk the attributes of bookElement and pick up the value of id
        List<Attribute> bookAttributes = bookElement.getAttributes();
        for (Attribute attribute : bookAttributes) {
            if (attribute.getName().equals("id")) {
                String id = attribute.getValue();
                book.setId(Integer.parseInt(id));
            }
        }
        return book;
    }

    public static void generateBookDetail(Element bookElement, Book book) {
        // Read the values of the elements nested inside book
        List<Element> children = bookElement.getChildren();
        for (Element child : children) {
            String nodeName = child.getName();
            String nodeValue = child.getValue();
            if (nodeName.equals("name")) {
                book.setName(nodeValue);
            } else if (nodeName.equals("author")) {
                book.setAuthor(nodeValue);
            } else if (nodeName.equals("year")) {
                book.setYear(Integer.parseInt(nodeValue));
            } else if (nodeName.equals("price")) {
                book.setPrice(Double.parseDouble(nodeValue));
            }
        }
    }
}

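6. DOM4J

DOM4J started as a fork of JDOM
and adds further improvements and features.
At the time of writing, the latest DOM4J release dates from April 2020,
while the latest JDOM release dates from February 2015.

DOM4J is also added as a third-party jar.
It reads documents through a SAX-based reader,
and it is generally reported to perform better than JDOM.

Characteristics:

1. It is a fork of JDOM that incorporates many features beyond basic XML document representation.
2. It is built on interfaces and abstract base classes.
3. It offers good performance, flexibility, and a rich feature set, and it is very easy to use.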

Adding the jar (Maven dependency):

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.3</version>
</dependency>

Parsing code, Dom4jParseXML.java:

package org.dom4j.io;

import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.w3c.dom.Book;

/**
 * Parses the XML file with DOM4J.
 */
public class Dom4jParseXML {

    public static void main(String[] args) throws Exception {
        String fileName = "src/main/resources/books.xml";
        List<Book> books = getBooks(fileName);
        for (Book book : books) {
            System.out.println(book);
        }
    }

    public static List<Book> getBooks(String fileName) throws Exception {
        SAXReader reader = new SAXReader();
        File file = new File(fileName);
        Document document = reader.read(file);
        // The root element, bookstore
        Element bookstore = document.getRootElement();
        Iterator<Element> bookElements = bookstore.elementIterator();
        List<Book> books = new ArrayList<Book>();
        while (bookElements.hasNext()) {
            Element bookElement = bookElements.next();
            // Read the id attribute of the book
            Book book = getBookByElement(bookElement);
            // Read the name and the other child elements of the book
            generateBookDetail(bookElement, book);
            books.add(book);
        }
        return books;
    }

    public static Book getBookByElement(Element bookElement) {
        Book book = new Book();
        // Walk the attributes of bookElement and pick up the value of id
        List<Attribute> attributes = bookElement.attributes();
        for (Attribute attribute : attributes) {
            if (attribute.getName().equals("id")) {
                String id = attribute.getValue();
                book.setId(Integer.parseInt(id));
            }
        }
        return book;
    }

    public static void generateBookDetail(Element bookElement, Book book) {
        // Read the values of the elements nested inside book
        Iterator<Element> details = bookElement.elementIterator();
        while (details.hasNext()) {
            Element child = details.next();
            String nodeName = child.getName();
            String nodeValue = child.getStringValue();
            if (nodeName.equals("name")) {
                book.setName(nodeValue);
            } else if (nodeName.equals("author")) {
                book.setAuthor(nodeValue);
            } else if (nodeName.equals("year")) {
                book.setYear(Integer.parseInt(nodeValue));
            } else if (nodeName.equals("price")) {
                book.setPrice(Double.parseDouble(nodeValue));
            }
        }
    }
}

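7. Summary

DOM4J performs best; even Sun's JAXM uses DOM4J.
Many open-source projects rely heavily on DOM4J;
the well-known Hibernate, for example, also uses DOM4J to read its XML configuration files.
If portability is not a concern, use DOM4J.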

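SAX also performs well, which follows from its event-driven way of parsing.
A SAX parser inspects the XML stream as it arrives
without loading the whole document into memory,
although while the stream is being read
parts of the document are held in memory temporarily.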

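DOM and JDOM did poorly in performance testing
and ran out of memory on a 10 MB document.
For small documents, DOM and JDOM are still worth considering.
Beyond performance, DOM remains a very good choice for other reasons.
DOM implementations are widely used across programming languages,
and DOM is the basis of many other XML-related standards.
Because it is an official W3C Recommendation,
some types of projects may require it,
for example when working with the DOM in JavaScript.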
