Reposted from: http://blog.csdn.net/zjf280441589/article/details/50613881

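There are two XML parsing techniques: DOM and SAX.

  • DOM 
    Builds a tree in memory that mirrors the hierarchy of the XML document; tags, attributes, and text are each wrapped as node objects in that tree.

    • Pros: create, read, update, and delete operations are all easy to implement.
    • Cons: a very large XML file may exhaust memory.
  • SAX 
    Uses an event-driven model that parses while reading: the document is read from top to bottom, and whenever the parser reaches an element it calls the corresponding handler method.

    • Pros: does not exhaust memory, because the document is never held in full.
    • Cons: querying is less convenient, and create, update, and delete operations are not possible.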
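To make the contrast concrete, here is a minimal sketch that counts the same tag once with DOM and once with SAX, using only the JDK's built-in JAXP classes. The class name DomVsSaxSketch and the inlined sample XML are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSaxSketch {

    // Hypothetical sample document, inlined so the sketch is self-contained.
    static final String XML =
            "<beans><bean id=\"id1\"><property name=\"isUsed\" value=\"true\"/></bean>"
          + "<bean id=\"id2\"><property name=\"refBean\" ref=\"id1\"/></bean></beans>";

    // DOM: the whole tree is built in memory first, then queried.
    static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName(tag).getLength();
    }

    // SAX: no tree is kept; we count while the parser streams through the input.
    static int countWithSax(String xml, final String tag) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if (qName.equals(tag)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DOM: " + countWithDom(XML, "property"));
        System.out.println("SAX: " + countWithSax(XML, "property"));
    }
}
```

Both methods return the same count; the difference is that the DOM version holds every node of the document in memory, while the SAX version keeps only a counter.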

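Various companies and organizations provide parsers for the DOM and SAX approaches:

  • JAXP from Sun
  • dom4j from the dom4j project (the most commonly used, for example in Spring)
  • JDOM from the JDOM project 
    For the history of these three parsers, see "java解析xml文件四种方式" (four ways to parse XML files in Java).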

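Parsing with JAXP

JAXP is part of Java SE. Its javax.xml.parsers package provides the following classes for DOM and SAX:

  • DOM

    • DocumentBuilder
    • DocumentBuilderFactory
  • SAX 
    • SAXParser
    • SAXParserFactory

The sample XML is shown below; the following sections use JAXP to query, add, update, and delete nodes in it.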

  • config.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE beans SYSTEM "constraint.dtd">
<beans>
    <bean id="id1" class="com.fq.domain.Bean">
        <property name="isUsed" value="true"/>
    </bean>
    <bean id="id2" class="com.fq.domain.ComplexBean">
        <property name="refBean" ref="id1"/>
    </bean>
</beans>
  • constraint.dtd
<!ELEMENT beans (bean*)>
<!ELEMENT bean (property*)>
<!ATTLIST bean
        id CDATA #REQUIRED
        class CDATA #REQUIRED>
<!ELEMENT property EMPTY>
<!ATTLIST property
        name CDATA #REQUIRED
        value CDATA #IMPLIED
        ref CDATA #IMPLIED>

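JAXP-DOM

import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/**
 * @author jifang
 * @since 16/1/13 11:24 PM.
 */
public class XmlRead {

    @Test
    public void client() throws ParserConfigurationException, IOException, SAXException {
        // Create a DOM parser
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Parse the XML file
        Document document = builder.parse(ClassLoader.getSystemResourceAsStream("config.xml"));
        // ...
    }
}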

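DocumentBuilder's parse(String/File/InputSource/InputStream param) method parses an XML file into a Document object that represents the whole document. 
Document (in the org.w3c.dom package) is an interface whose parent interface is Node; other sub-interfaces of Node include Element, Attr, and Text.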

  • Node
Common Node methods | Description
Node appendChild(Node newChild) | Adds the node newChild to the end of the list of children of this node.
Node removeChild(Node oldChild) | Removes the child node indicated by oldChild from the list of children, and returns it.
NodeList getChildNodes() | A NodeList that contains all children of this node.
NamedNodeMap getAttributes() | A NamedNodeMap containing the attributes of this node (if it is an Element) or null otherwise.
String getTextContent() | Returns the text content of this node and its descendants.
  • Document
Common Document methods | Description
NodeList getElementsByTagName(String tagname) | Returns a NodeList of all the Elements with the given tag name contained in the document, in document order.
Element createElement(String tagName) | Creates an element of the type specified.
Text createTextNode(String data) | Creates a Text node given the specified string.
Attr createAttribute(String name) | Creates an Attr of the given name.
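The methods in the two tables can be exercised without any file by building a small document in memory. The following is only a sketch: the class name NodeApiSketch and the element contents are made up for illustration, while every DOM call is part of the standard org.w3c.dom API.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NodeApiSketch {

    // Builds <beans><bean id="id1">hello</bean></beans> purely in memory.
    static Document build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element beans = doc.createElement("beans");
        doc.appendChild(beans); // becomes the root element

        Element bean = doc.createElement("bean");
        Attr id = doc.createAttribute("id");
        id.setValue("id1");
        bean.setAttributeNode(id);
        bean.appendChild(doc.createTextNode("hello"));
        beans.appendChild(bean);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        Node bean = doc.getElementsByTagName("bean").item(0);
        System.out.println("id   = " + bean.getAttributes().getNamedItem("id").getNodeValue());
        System.out.println("text = " + bean.getTextContent());
        // getAttributes() is only non-null for Element nodes.
        System.out.println("document attributes: " + doc.getAttributes());

        // removeChild returns the node that was detached.
        Node removed = bean.getParentNode().removeChild(bean);
        System.out.println("removed same node: " + (removed == bean));
        System.out.println("children left: " + doc.getDocumentElement().getChildNodes().getLength());
    }
}
```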

DOM query

  • Read all attributes on the <bean/> tags
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.junit.Before;
import org.junit.Test;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XmlRead {

    private Document document;

    @Before
    public void setUp() throws ParserConfigurationException, IOException, SAXException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(ClassLoader.getSystemResourceAsStream("config.xml"));
    }

    @Test
    public void client() {
        NodeList beans = document.getElementsByTagName("bean");
        for (int i = 0; i < beans.getLength(); ++i) {
            NamedNodeMap attributes = beans.item(i).getAttributes();
            scanNameNodeMap(attributes);
        }
    }

    // Print every attribute in the map as "name -> value"
    private void scanNameNodeMap(NamedNodeMap attributes) {
        for (int i = 0; i < attributes.getLength(); ++i) {
            Attr attribute = (Attr) attributes.item(i);
            System.out.printf("%s -> %s%n", attribute.getName(), attribute.getValue());
            // System.out.println(attribute.getNodeName() + " -> " + attribute.getTextContent());
        }
    }
}
  • Print the name of every tag in the XML file
@Test
public void client() {
    list(document, 0);
}

// Depth-first traversal; each element name is indented by its depth
private void list(Node node, int depth) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        for (int i = 0; i < depth; ++i) {
            System.out.print("\t");
        }
        System.out.println("<" + node.getNodeName() + ">");
    }
    NodeList childNodes = node.getChildNodes();
    for (int i = 0; i < childNodes.getLength(); ++i) {
        list(childNodes.item(i), depth + 1);
    }
}

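DOM: adding a node

  • Add a <property/> tag under the first <bean/> tag, so that the result looks like this:
<bean id="id1" class="com.fq.domain.Bean">
    <property name="isUsed" value="true"/>
    <property name="name" value="simple-bean">newly added</property>
</bean>
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

/**
 * @author jifang
 * @since 16/1/17 5:56 PM.
 */
public class XmlAppend {

    // Writes the document back to disk
    private Transformer transformer;

    // The XML document
    private Document document;

    @Before
    public void setUp() throws ParserConfigurationException, IOException, SAXException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(ClassLoader.getSystemResourceAsStream("config.xml"));
    }

    @Test
    public void client() {
        // Get the first bean tag
        Node firstBean = document.getElementsByTagName("bean").item(0);

        // Create a property tag
        Element property = document.createElement("property");

        // Add attributes to the property tag
        // property.setAttribute("name", "name");
        // property.setAttribute("value", "simple-bean");
        Attr name = document.createAttribute("name");
        name.setValue("name");
        property.setAttributeNode(name);
        Attr value = document.createAttribute("value");
        value.setValue("simple-bean");
        property.setAttributeNode(value);

        // Add text content to the property tag
        // property.setTextContent("newly added");
        property.appendChild(document.createTextNode("newly added"));

        // Append the property tag to the bean tag
        firstBean.appendChild(property);
    }

    @After
    public void tearDown() throws TransformerException {
        transformer = TransformerFactory.newInstance().newTransformer();
        // Write the document back to the XML file
        transformer.transform(new DOMSource(document), new StreamResult("src/main/resources/config.xml"));
    }
}

Note: the in-memory DOM must be written back to the XML file for the change to take effect.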
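The effect of that note can be seen without touching a file by serializing the DOM to a String instead. This is a sketch under assumptions: the class name DomWriteBackSketch and the sample XML are invented for illustration, and a StringWriter stands in for the file that the Transformer writes to above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomWriteBackSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the in-memory tree; this is the "write back" step.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String source = "<beans><bean id=\"id1\"/></beans>";
        Document doc = parse(source);

        Element property = doc.createElement("property");
        property.setAttribute("name", "name");
        doc.getElementsByTagName("bean").item(0).appendChild(property);

        // The original text is unchanged; only the serialized tree has the new tag.
        System.out.println("source:     " + source);
        System.out.println("serialized: " + serialize(doc));
    }
}
```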


DOM: updating a node

  • Change the <property/> that was just added to the following
<property name="name" value="new-simple-bean">simple-bean was newly added</property>
@Test
public void client() {
    NodeList properties = document.getElementsByTagName("property");
    for (int i = 0; i < properties.getLength(); ++i) {
        Element property = (Element) properties.item(i);
        // Only the first property whose value is "simple-bean" is updated
        if (property.getAttribute("value").equals("simple-bean")) {
            property.setAttribute("value", "new-simple-bean");
            property.setTextContent("simple-bean was newly added");
            break;
        }
    }
}

DOM: deleting a node

Delete the <property/> tag that was just updated

@Test
public void client() {
    NodeList properties = document.getElementsByTagName("property");
    for (int i = 0; i < properties.getLength(); ++i) {
        Element property = (Element) properties.item(i);
        if (property.getAttribute("value").equals("new-simple-bean")) {
            // A node is removed through its parent
            property.getParentNode().removeChild(property);
            break;
        }
    }
}
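The loop above stops after the first match. If every match has to be removed, one point needs care: the NodeList returned by getElementsByTagName is live, so it shrinks as nodes are removed. A minimal sketch of one way to handle that is to walk the list backwards; the class name DomRemoveAllSketch and the method removeByValue are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomRemoveAllSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Removes every <property> whose "value" attribute equals the given value
    // and returns how many were removed.
    static int removeByValue(Document doc, String value) {
        NodeList properties = doc.getElementsByTagName("property");
        int removed = 0;
        // Walk backwards: the list is live, so removing item i shifts the
        // items after it, but never the ones before it.
        for (int i = properties.getLength() - 1; i >= 0; --i) {
            Element property = (Element) properties.item(i);
            if (value.equals(property.getAttribute("value"))) {
                property.getParentNode().removeChild(property);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<bean><property value=\"x\"/><property value=\"x\"/>"
                + "<property value=\"y\"/></bean>");
        System.out.println("removed:   " + removeByValue(doc, "x"));
        System.out.println("remaining: " + doc.getElementsByTagName("property").getLength());
    }
}
```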

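JAXP-SAX

A SAXParser instance is obtained from the newSAXParser() method of a SAXParserFactory instance. Its parse(String uri, DefaultHandler dh) method, which parses the XML file, returns nothing; compared with the DOM method it takes one extra argument, an event handler of type DefaultHandler:

  • when a start tag is parsed, DefaultHandler's startElement() method is called automatically;
  • when tag content (text) is parsed, DefaultHandler's characters() method is called automatically;
  • when an end tag is parsed, DefaultHandler's endElement() method is called automatically.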
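The order in which these three callbacks fire can be made visible by recording them in a list. This is only an illustrative sketch; the class name SaxEventOrderSketch and the "start:/text:/end:" labels are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventOrderSketch {

    // Parses the given XML and returns the callbacks in the order they fired.
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        log.add("start:" + qName);
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        log.add("text:" + new String(ch, start, length));
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        log.add("end:" + qName);
                    }
                });
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<a><b>hi</b></a>")) {
            System.out.println(event);
        }
    }
}
```

The events arrive strictly in document order, which is why a SAX handler has to keep its own state (for example, "am I inside a <property/> right now?") between callbacks.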

SAX query

  • Print the whole XML document
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.junit.Test;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/**
 * @author jifang
 * @since 16/1/17 9:16 PM.
 */
public class SaxRead {

    @Test
    public void client() throws ParserConfigurationException, IOException, SAXException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(ClassLoader.getSystemResourceAsStream("config.xml"), new SaxHandler());
    }

    private class SaxHandler extends DefaultHandler {

        // Print the start tag together with its attributes
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            System.out.print("<" + qName);
            for (int i = 0; i < attributes.getLength(); ++i) {
                String attrName = attributes.getQName(i);
                String attrValue = attributes.getValue(i);
                System.out.print(" " + attrName + "=" + attrValue);
            }
            System.out.print(">");
        }

        // Print the text between tags
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            System.out.print(new String(ch, start, length));
        }

        // Print the end tag
        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            System.out.print("</" + qName + ">");
        }
    }
}
  • A handler that prints the content of every property tag
private class SaxHandler extends DefaultHandler {

    // Guard the isProperty flag with a mutex
    private boolean isProperty = false;
    private Lock mutex = new ReentrantLock();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if (qName.equals("property")) {
            mutex.lock();
            isProperty = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // The flag can only be true after the lock has been taken
        if (isProperty) {
            System.out.println(new String(ch, start, length));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equals("property")) {
            try {
                isProperty = false;
            } finally {
                mutex.unlock();
            }
        }
    }
}

Note: SAX cannot add, update, or delete nodes.
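One detail worth knowing when reading text with SAX: the parser is allowed to deliver the text of a single element in several characters() calls, so printing inside characters() may show fragments. A common alternative, sketched below, is to buffer the text and emit it in endElement(). The class name SaxTextSketch and the sample input are assumptions; no lock is used here because the callbacks are invoked one after another by the thread that called parse().

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextSketch {

    // Returns the complete text of every <property> element, in document order.
    static List<String> propertyTexts(String xml) throws Exception {
        final List<String> texts = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a <property> element.
            private StringBuilder buffer;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("property".equals(qName)) {
                    buffer = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (buffer != null) {
                    buffer.append(ch, start, length); // may be called more than once
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("property".equals(qName)) {
                    texts.add(buffer.toString());
                    buffer = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return texts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(propertyTexts(
                "<bean><property>a &amp; b</property><property/>ignored</bean>"));
    }
}
```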


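Parsing with dom4j

dom4j is a fork of JDOM: it split off from the original JDOM project and offers a parser that is more capable and performs better than JDOM (for example, it supports XPath). 
To use dom4j, add the following dependency to the pom:

<dependency>
    <groupId>dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>1.6.1</version>
</dependency>

The sample XML is shown below; the following sections use dom4j to query, add, update, and delete nodes in it: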

  • config.xml
<?xml version="1.0" encoding="utf-8"?>
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.fq.me/context"
       xsi:schemaLocation="http://www.fq.me/context http://www.fq.me/context/context.xsd">
    <bean id="id1" class="com.fq.benz">
        <property name="name" value="benz"/>
    </bean>
    <bean id="id2" class="com.fq.domain.Bean">
        <property name="isUsed" value="true"/>
        <property name="complexBean" ref="id1"/>
    </bean>
</beans>
  • context.xsd
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.fq.me/context"
        elementFormDefault="qualified">
    <element name="beans">
        <complexType>
            <sequence>
                <element name="bean" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="property" maxOccurs="unbounded">
                                <complexType>
                                    <attribute name="name" type="string" use="required"/>
                                    <attribute name="value" type="string" use="optional"/>
                                    <attribute name="ref" type="string" use="optional"/>
                                </complexType>
                            </element>
                        </sequence>
                        <attribute name="id" type="string" use="required"/>
                        <attribute name="class" type="string" use="required"/>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;
import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/18 4:02 PM.
 */
public class Dom4jRead {

    @Test
    public void client() throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(ClassLoader.getSystemResource("config.xml"));
        // ...
    }
}

As with JAXP, Document is an interface (in the org.dom4j package). It extends Branch, which in turn extends Node; other sub-interfaces of Node include Element, Attribute, Text, and CDATA.

  • Node
Common Node methods | Description
Element getParent() | Returns the parent Element if this node supports the parent relationship, or null if it is the root element or does not support the parent relationship.
  • Document
Common Document methods | Description
Element getRootElement() | Returns the root Element for this document.
  • Element
Common Element methods | Description
void add(Attribute/Text param) | Adds the given Attribute/Text to this element.
Element addAttribute(String name, String value) | Adds the attribute value of the given local name.
Attribute attribute(int index) | Returns the attribute at the specified index.
Attribute attribute(String name) | Returns the attribute with the given name.
Element element(String name) | Returns the first element for the given local name and any namespace.
Iterator elementIterator() | Returns an iterator over all of this element's child elements.
Iterator elementIterator(String name) | Returns an iterator over the elements contained in this element which match the given local name and any namespace.
List elements() | Returns the elements contained in this element.
List elements(String name) | Returns the elements contained in this element with the given local name and any namespace.
  • Branch
Common Branch methods | Description
Element addElement(String name) | Adds a new Element node with the given name to this branch and returns a reference to the new node.
boolean remove(Node node) | Removes the given Node if the node is an immediate child of this branch.

dom4j query

  • Print all attribute information:
import java.util.Iterator;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.junit.Before;
import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/18 4:02 PM.
 */
public class Dom4jRead {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = new SAXReader().read(ClassLoader.getSystemResource("config.xml"));
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();
        // Iterate over every child element of the root
        for (Iterator iterator = beans.elementIterator(); iterator.hasNext(); ) {
            Element bean = (Element) iterator.next();
            String id = bean.attributeValue("id");
            String clazz = bean.attributeValue("class");
            System.out.println("id: " + id + ", class: " + clazz);
            scanProperties(bean.elements());
        }
    }

    // Print name plus whichever of value/ref is present on each property
    public void scanProperties(List<? extends Element> properties) {
        for (Element property : properties) {
            System.out.print("name: " + property.attributeValue("name"));
            Attribute value = property.attribute("value");
            if (value != null) {
                System.out.println("," + value.getName() + ": " + value.getValue());
            }
            Attribute ref = property.attribute("ref");
            if (ref != null) {
                System.out.println("," + ref.getName() + ": " + ref.getValue());
            }
        }
    }
}

dom4j: adding a node

Append a <property/> tag at the end of the first <bean/> tag

<bean id="id1" class="com.fq.benz">
    <property name="name" value="benz"/>
    <property name="refBean" ref="id2">newly added tag</property>
</bean>
/**
 * @author jifang
 * @since 16/1/19 9:50 AM.
 */
public class Dom4jAppend {

    //...

    @Test
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        // addElement creates the child and appends it in one step
        Element property = firstBean.addElement("property");
        property.addAttribute("name", "refBean");
        property.addAttribute("ref", "id2");
        property.setText("newly added tag");
    }

    @After
    public void tearDown() throws IOException {
        // Write the document back to the XML file
        OutputFormat format = OutputFormat.createPrettyPrint();
        XMLWriter writer = new XMLWriter(new FileOutputStream("src/main/resources/config.xml"), format);
        try {
            writer.write(document);
        } finally {
            // Close the writer so the output is flushed and the file is released
            writer.close();
        }
    }
}

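The code that reads and writes the XML document can be wrapped in a small utility class, which makes later calls more convenient:

import java.io.FileOutputStream;
import java.io.IOException;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

/**
 * @author jifang
 * @since 16/1/19 2:12 PM.
 */
public class XmlUtils {

    // Load an XML document from the classpath
    public static Document getXmlDocument(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config));
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

    // Pretty-print the document to the given path
    public static void writeXmlDocument(String path, Document document) {
        try {
            XMLWriter writer = new XMLWriter(new FileOutputStream(path), OutputFormat.createPrettyPrint());
            try {
                writer.write(document);
            } finally {
                // Close the writer so the output is flushed and the file is released
                writer.close();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
  • Insert a <property/> tag after the first <property/> of the first <bean/>
<bean id="id1" class="com.fq.benz">
    <property name="name" value="benz"/>
    <property name="rate" value="3.14"/>
    <property name="refBean" ref="id2">newly added tag</property>
</bean>
public class Dom4jAppend {

    private Document document;

    @Before
    public void setUp() {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        List<Element> properties = firstBean.elements();

        // Create the element in the same namespace as its future parent
        //Element property = DocumentHelper
        //        .createElement(QName.get("property", firstBean.getNamespaceURI()));
        Element property = DocumentFactory.getInstance().createElement("property", firstBean.getNamespaceURI());
        property.addAttribute("name", "rate");
        property.addAttribute("value", "3.14");
        // Inserting at index 1 places it right after the first property
        properties.add(1, property);
    }

    @After
    public void tearDown() {
        XmlUtils.writeXmlDocument("src/main/resources/config.xml", document);
    }
}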

dom4j: updating a node

  • Change the first <property/> of the id1 bean to the following:
<property name="name" value="翡青"/>
@Test
public void client() {
    Element beans = document.getRootElement();
    Element firstBean = beans.element("bean");
    // element(name) returns the first child element with that name
    Element property = firstBean.element("property");
    // Overwrite the existing value attribute
    property.attribute("value").setValue("翡青");
}

dom4j: deleting a node

  • Delete the node that was just updated
@Test
@SuppressWarnings("unchecked")
public void delete() {
    List<Element> beans = document.getRootElement().elements("bean");
    for (Element bean : beans) {
        if (bean.attributeValue("id").equals("id1")) {
            List<Element> properties = bean.elements("property");
            for (Element property : properties) {
                if (property.attributeValue("name").equals("name")) {
                    // Perform the deletion through the parent element
                    property.getParent().remove(property);
                    break;
                }
            }
            break;
        }
    }
}

dom4j example

In the article "Java 反射" (Java reflection) we implemented an object pool that loads beans from a JSON configuration file; we can now add XML-based configuration to it (using the same XML file as above):

import java.lang.reflect.Field;
import java.util.List;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

/**
 * @author jifang
 * @since 16/1/18 9:18 PM.
 */
public class XmlParse {

    // ObjectPool, ObjectPoolBuilder, CommonConstant and SimpleValueSetUtils
    // belong to the object-pool project referenced in the note below
    private static final ObjectPool POOL = ObjectPoolBuilder.init(null);

    public static Element parseBeans(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config)).getRootElement();
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

    public static void processObject(Element bean, List<? extends Element> properties)
            throws ClassNotFoundException, IllegalAccessException, InstantiationException, NoSuchFieldException {
        Class<?> clazz = Class.forName(bean.attributeValue(CommonConstant.CLASS));
        Object targetObject = clazz.newInstance();
        for (Element property : properties) {
            String fieldName = property.attributeValue(CommonConstant.NAME);
            Field field = clazz.getDeclaredField(fieldName);
            field.setAccessible(true);
            if (property.attributeValue(CommonConstant.VALUE) != null) {
                // The property has a value attribute: set a simple value
                SimpleValueSetUtils.setSimpleValue(field, targetObject, property.attributeValue(CommonConstant.VALUE));
            } else if (property.attributeValue(CommonConstant.REF) != null) {
                // The property has a ref attribute: look the object up in the pool
                String refId = property.attributeValue(CommonConstant.REF);
                Object object = POOL.getObject(refId);
                field.set(targetObject, object);
            } else {
                throw new RuntimeException("neither value nor ref");
            }
        }
        POOL.putObject(bean.attributeValue(CommonConstant.ID), targetObject);
    }
}

Note: the code above is only the XML-parsing part of the object-pool project; for the full project see git@git.oschina.net:feiqing/commons-frame.git


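XPath

XPath is a language for finding information in an XML document; it can be used to navigate the elements and attributes of an XML document.

Expression | Description
/ | Selects from the root node (/beans: matches the root <beans/>; /beans/bean: matches each <bean/> under <beans/>)
// | Searches the whole document regardless of position (//property: matches every <property/> in the document)
* | Matches any element node (/*: matches the root element; //*: matches every element)
@ | Matches attributes (e.g. //@name: matches every name attribute)
[position] | Position predicate (e.g. //property[1]: matches each <property/> that is the first <property/> child of its parent; //property[last()]: matches each <property/> that is the last <property/> child of its parent)
[@attr] | Attribute predicate (e.g. //bean[@id]: matches every <bean/> that has an id attribute; //bean[@id='id1']: matches every <bean/> whose id attribute equals 'id1')

Predicate: a predicate is used to find a specific node, or a node that contains a specific value.

For the full XPath syntax, see the W3School XPath tutorial.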
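The expressions in the table can be tried out with the XPath engine that ships with the JDK (javax.xml.xpath), before bringing in dom4j. The sketch below is illustrative only: the class name XPathTableSketch is made up, and the sample XML is a copy of the bean configuration with the namespace declarations left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTableSketch {

    // Sample document without any namespace, assumed for illustration.
    static final String XML = "<beans>"
            + "<bean id=\"id1\" class=\"com.fq.benz\">"
            + "<property name=\"name\" value=\"benz\"/></bean>"
            + "<bean id=\"id2\" class=\"com.fq.domain.Bean\">"
            + "<property name=\"isUsed\" value=\"true\"/>"
            + "<property name=\"complexBean\" ref=\"id1\"/></bean>"
            + "</beans>";

    // Returns how many nodes the expression selects in the sample document.
    static int count(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String[] expressions = {
            "/beans/bean", "//property", "/*", "//*", "//@name",
            "//property[1]", "(//property)[1]", "//property[last()]",
            "//bean[@id]", "//bean[@id='id1']"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + count(expression) + " node(s)");
        }
    }
}
```

Comparing //property[1] with (//property)[1] shows the difference between "first among its siblings" and "first in the whole document".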


dom4j support for XPath

Out of the box, dom4j cannot evaluate XPath expressions; the following dependency has to be added to the pom:

<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>

dom4j's Node interface provides the following methods for XPath:

Method
List selectNodes(String xpathExpression)
List selectNodes(String xpathExpression, String comparisonXPathExpression)
List selectNodes(String xpathExpression, String comparisonXPathExpression, boolean removeDuplicates)
Object selectObject(String xpathExpression)
Node selectSingleNode(String xpathExpression)

Querying with XPath

  • Read the attribute values on every bean tag
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.junit.Before;
import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/20 9:28 AM.
 */
public class XPathRead {

    private Document document;

    @Before
    public void setUp() {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        // Select every bean element anywhere in the document
        List<Element> beans = document.selectNodes("//bean");
        for (Element bean : beans) {
            System.out.println("id: " + bean.attributeValue("id") +
                    ", class: " + bean.attributeValue("class"));
        }
    }
}
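One thing worth checking before relying on //bean: the config.xml used in the dom4j sections declares a default namespace (http://www.fq.me/context), and in XPath 1.0 an unprefixed name such as bean only selects elements that are in no namespace. The sketch below demonstrates this rule with the JDK's own javax.xml.xpath engine, not with dom4j/jaxen, so whether the dom4j query above is affected should be verified separately. The class name XPathNamespaceSketch and the prefix c are arbitrary choices for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNamespaceSketch {

    static final String NS = "http://www.fq.me/context";

    static final String XML =
            "<beans xmlns=\"" + NS + "\"><bean id=\"id1\"/><bean id=\"id2\"/></beans>";

    // Returns how many nodes the expression selects in the namespaced document.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for namespace-correct XPath
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Map the (arbitrary) prefix "c" to the document's default namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "c".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return null; // not needed for evaluating expressions
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("//bean                     -> " + count("//bean"));
        System.out.println("//c:bean                   -> " + count("//c:bean"));
        System.out.println("//*[local-name()='bean']   -> " + count("//*[local-name()='bean']"));
    }
}
```

If a dom4j query on a namespaced document unexpectedly returns nothing, binding a prefix to the namespace or matching on local-name() are the two usual ways out.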

Updating with XPath

  • Delete the <bean/> whose id is "id2"
@Test
public void client() {
    // Locate the node with XPath, then remove it through its parent
    Node bean = document.selectSingleNode("//bean[@id=\"id2\"]");
    bean.getParent().remove(bean);
}
