
XML is a form of semi-structured data that is organized as a tree. Semi-structured data is helpful when you serialize program data to save it in a file or ship it across a network. XML defines a standardized document format that is easy to read and interpret. XML stands for eXtensible Markup Language.

XML consists of two basic elements: text and tags. Text is a sequence of characters. A start tag consists of a less-than sign, an alphanumeric label and a greater-than sign. An end tag is the same as a start tag except that it has a slash just before the label. A start tag and its end tag must have the same label.

For example:

<school><standard>6</standard></school>

The above is valid XML because every start tag has a matching end tag.


<school><standard>6</standard> 7

The above is invalid XML because the end tag for school is missing.


<school><standard>8 </school></standard>

The above XML is also invalid because the standard tag, which is the child, must be closed first, and only then can the parent tag school be closed.
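The three cases above can also be checked programmatically. The following is a minimal sketch, assuming the scala-xml module is on the classpath; the object name WellFormedCheck and the helper isWellFormed are hypothetical names chosen for this illustration. It relies on XML.loadString throwing an exception when the input is not well-formed.

```scala
import scala.util.Try
import scala.xml.XML

object WellFormedCheck {
  // Returns true when the parser accepts the text as well-formed XML.
  // XML.loadString throws (a SAXParseException) on malformed input,
  // which Try turns into a Failure.
  def isWellFormed(text: String): Boolean =
    Try(XML.loadString(text)).isSuccess

  def main(args: Array[String]): Unit = {
    // Matching start and end tags
    println(isWellFormed("<school><standard>6</standard></school>"))
    // The school element is never closed
    println(isWellFormed("<school><standard>6</standard> 7"))
    // Parent closed before child
    println(isWellFormed("<school><standard>8 </school></standard>"))
  }
}
```

Only the first input should be accepted; the other two fail for the reasons described above.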


Since tags have to be matched, XML is structured as nested elements. A start tag and its end tag form a pair of matching elements, and elements can be nested within each other. In the above example, standard is the nested element.

There is a shorthand notation in which a single tag ending with a slash stands for a start tag immediately followed by its end tag. Such a tag indicates an empty element.

For instance, in the XML below, standard is an empty element:

<school> <standard /> </school>

Start tags can have attributes. An attribute is a name-value pair with an equals sign in the middle. The attribute value is surrounded by double quotes or single quotes.

For instance:


<standard section ="A" strings = "true"></standard>

Now that we have a brief knowledge of XML, let’s look over different things we can do in Scala for XML processing.


Scala XML Literals

Type a start tag and then continue writing the XML content. The XML contents are read until the end tag is seen.


For example, open the Scala REPL and enter:

<a>Scala is a functional Programming language</a>

A Scala expression can be evaluated inside an element by enclosing it in curly braces. For example:

<a> {"hi"+",Reena"} </a>

Output: res1: scala.xml.Elem = <a> hi,Reena </a>


A brace escape can include arbitrary Scala content, including further XML literals. For example:

val marks = 78
<a> { if (marks < 80) <marks> {marks} </marks> else xml.NodeSeq.Empty } </a>

Output: res3: scala.xml.Elem = <a> <marks> 78 </marks> </a>


The code inside the curly braces evaluates to an XML node or a sequence of XML nodes. In the above example, if marks is less than 80, a <marks> element is added to the <a> element; otherwise nothing is added.
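A brace escape that yields a whole sequence of nodes might look like the sketch below. It assumes the scala-xml module is available; the object BraceEscapeSketch, the method lowMarks and the threshold parameter are hypothetical names used only for illustration.

```scala
import scala.xml.Elem

object BraceEscapeSketch {
  // Wraps every mark below the threshold in its own <marks> element.
  // The brace escape evaluates to a Seq of elements, and each one
  // becomes a child of <a>.
  def lowMarks(all: Seq[Int], threshold: Int): Elem =
    <a>{all.filter(_ < threshold).map(m => <marks>{m}</marks>)}</a>

  def main(args: Array[String]): Unit = {
    println(lowMarks(Seq(78, 92, 65), 80)) // two <marks> children
    println(lowMarks(Seq(92), 80))         // no children at all
  }
}
```

When no mark is below the threshold the sequence is empty, which plays the same role as xml.NodeSeq.Empty in the example above.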


When the expression inside the braces evaluates to a Scala value that is not an XML node, the value is converted to a string and inserted as text. For example:


<a> {9+40} </a>

Output: res4: scala.xml.Elem = <a> 49 </a>


The <, >, and & characters in the text will be escaped if you print the node.


<a> {"</a>Hello Scala<a>"} </a>

Output: res5: scala.xml.Elem = <a> &lt;/a&gt;Hello Scala&lt;a&gt; </a>
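The difference between the printed form of a node and its text content can be seen in the following sketch, which assumes the scala-xml module is on the classpath. The object name EscapeSketch is hypothetical; Utility.escape is the scala-xml helper that applies the same escaping to a plain string.

```scala
import scala.xml.Utility

object EscapeSketch {
  val raw = "</a>Hello Scala<a>"

  // The string is inserted as text, not parsed as markup.
  val node = <a>{raw}</a>

  def main(args: Array[String]): Unit = {
    println(node)                // serialized form: <, > are escaped
    println(node.text)           // text content: the string as supplied
    println(Utility.escape(raw)) // the same escaping on a plain string
  }
}
```

Because the escaping happens on output, text supplied through a brace escape cannot accidentally close or open elements in the serialized document.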


Serialization in Scala

Serialization converts an internal data structure to XML so that the data can be stored, transmitted or reused. A straightforward way to do this is to give the class a toXML method that builds the XML with XML literals and brace escapes.

For example, first we define a Student class and create an instance of it.


scala> abstract class Student {
  val name: String
  val id: Int
  val marks: Int
  override def toString = name
  def toXML =
    <student><name>{name}</name><id>{id}</id><marks>{marks}</marks></student>
}

scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}

scala> stud.toXML
res7: scala.xml.Elem = <student><name>Rob</name><id>12</id><marks>90</marks></student>
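The same idea extends to collections. The sketch below is one possible variation, not the approach used above: it assumes the scala-xml module is available, swaps the abstract class for a case class, and introduces a hypothetical <students> root element and allToXML helper purely for illustration.

```scala
import scala.xml.Elem

object SerializeSketch {
  // A case-class variant of Student (an assumption of this sketch).
  final case class Student(name: String, id: Int, marks: Int) {
    def toXML: Elem =
      <student><name>{name}</name><id>{id}</id><marks>{marks}</marks></student>
  }

  // Serializes each student and wraps the results in a <students> root.
  def allToXML(students: Seq[Student]): Elem =
    <students>{students.map(_.toXML)}</students>

  def main(args: Array[String]): Unit =
    println(allToXML(Seq(Student("Rob", 12, 90), Student("Adam", 13, 65))))
}
```

Each element of the sequence is serialized by its own toXML, so the collection-level method stays a one-liner.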


Scala XML Parsing

The XML classes provide many methods. Let us look at some of the most useful ones: extracting text, sub-elements and attributes.

Extracting Text
The text method on an XML node retrieves the text within that node. For example:

scala> <a>Scala is a <p>programming</p> language </a>.text
Output: res8: String = "Scala is a programming language "

Here the tags are excluded from the output.


Extracting sub-elements


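Sub-elements are extracted by calling \ or \\ followed by the tag name. The \ method looks only at the direct children of a node, while \\ searches at any depth and also matches the node itself. For example: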

scala> <school><standard><section>C</section></standard></school> \\ "section"
Output: res21: scala.xml.NodeSeq = NodeSeq(<section>C</section>)

scala> <school><standard><section>C</section></standard></school> \\ "school"
Output: res22: scala.xml.NodeSeq = NodeSeq(<school><standard><section>C</section></standard></school>)
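To make the difference between the two operators concrete, here is a small sketch, assuming the scala-xml module is on the classpath. The object name ChildVsDescendant and the value names are hypothetical.

```scala
object ChildVsDescendant {
  val school = <school><standard><section>C</section></standard></school>

  // \ inspects direct children only; section is a grandchild, so this is empty.
  val direct = school \ "section"

  // \\ searches the whole subtree and finds the nested section.
  val deep = school \\ "section"

  // Chaining \ walks the tree one level at a time.
  val stepwise = school \ "standard" \ "section"

  def main(args: Array[String]): Unit = {
    println(direct.isEmpty)
    println(deep.text)
    println(stepwise.text)
  }
}
```

Chaining \ is usually the safer choice when the document structure is known, since \\ also picks up same-named elements elsewhere in the subtree.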


Scala Extracting XML attributes

Tag attributes are extracted using the same \ and \\ methods, with an at sign (@) before the attribute name. For example:

scala> val adam = <student name="Adam" id="12" marks="65" />
Output: adam: scala.xml.Elem = <student name="Adam" id="12" marks="65"/>

scala> adam \\ "@name"
Output: res3: scala.xml.NodeSeq = NodeSeq(Adam)

scala> adam \\ "@id"
Output: res5: scala.xml.NodeSeq = NodeSeq(12)
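Attribute lookups return a NodeSeq, so a typed value usually needs one more step. The following sketch assumes the scala-xml module is available; the object name AttributeSketch and the grade attribute (which is deliberately absent from the element) are hypothetical and only illustrate the missing-attribute case.

```scala
object AttributeSketch {
  val adam = <student name="Adam" id="12" marks="65"/>

  // .text turns the matched attribute value into a String.
  val name: String = (adam \ "@name").text
  val id: Int = (adam \ "@id").text.toInt

  // A missing attribute yields an empty NodeSeq, so .text is "".
  val grade: String = (adam \ "@grade").text

  // headOption makes the missing case explicit instead of failing in toInt.
  val marks: Option[Int] = (adam \ "@marks").headOption.map(_.text.toInt)
  val bonus: Option[Int] = (adam \ "@bonus").headOption.map(_.text.toInt)

  def main(args: Array[String]): Unit =
    println(s"$name $id $marks $bonus")
}
```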

Scala De-serialization example

De-serialization converts the XML back to the internal data structure for the program to use.

We reuse the Student class defined in the serialization section and add a fromXML method that does the reverse of toXML.


scala> def fromXML(node: scala.xml.Node): Student =
  new Student {
    val name = (node \ "name").text
    val id = (node \ "id").text.toInt
    val marks = (node \ "marks").text.toInt
  }

Output: fromXML: (node: scala.xml.Node)Student

Now create the stud instance as in the serialization section:

scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}

Now invoke the toXML method:

scala> val st = stud.toXML
st: scala.xml.Elem = <student><name>Rob</name><id>12</id><marks>90</marks></student>

Call the fromXML method on the resulting node:

scala> fromXML(st)
Output: res17: Student = Rob
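A round trip is a convenient way to check that toXML and fromXML agree with each other. The sketch below assumes the scala-xml module is on the classpath and, unlike the abstract class above, uses a case class so that two students can be compared with ==; the object name RoundTripSketch is hypothetical.

```scala
import scala.xml.{Elem, Node}

object RoundTripSketch {
  // Case-class variant of Student (an assumption of this sketch),
  // chosen because it provides structural equality.
  final case class Student(name: String, id: Int, marks: Int) {
    def toXML: Elem =
      <student><name>{name}</name><id>{id}</id><marks>{marks}</marks></student>
  }

  // Reads the three child elements back into a Student.
  def fromXML(node: Node): Student =
    Student(
      (node \ "name").text,
      (node \ "id").text.toInt,
      (node \ "marks").text.toInt
    )

  def main(args: Array[String]): Unit = {
    val rob = Student("Rob", 12, 90)
    println(fromXML(rob.toXML) == rob)
  }
}
```

Note that toInt throws a NumberFormatException if the id or marks element is missing or not numeric, so real input may need validation first.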

Scala XML Saving into file and Loading from file

The XML.saveFull command writes an XML node to a file. (In current scala-xml releases the equivalent method is XML.save, which takes the same arguments.) The first argument is the name of the file to which the node is saved, the second is the node, the third is the character encoding, the fourth says whether to write an XML declaration at the top that includes the character encoding, and the fifth is the document type, which may be null.

For example:

scala> xml.XML.saveFull("stud.xml", st, "UTF-8", true, null)

We are using the st node created above in the de-serialization section.


Now open the stud.xml file, which contains the following:

<?xml version='1.0' encoding='UTF-8'?>
<student><name>Rob</name><id>12</id><marks>90</marks></student>

To load the file, we can use the load method:

scala> val s1 = xml.XML.load("stud.xml")
s1: scala.xml.Elem =
<student><name>Rob</name><id>12</id><marks>90</marks></student>
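Saving and loading can be combined into one self-contained check. This is a sketch under a few assumptions: a current scala-xml release, where the save method is named XML.save; XML.loadFile as the loader, since the argument here is a file name; and a temporary file instead of stud.xml so that nothing is left behind. The object name SaveLoadSketch and the helper roundTrip are hypothetical.

```scala
import java.nio.file.Files
import scala.xml.{Elem, Node, XML}

object SaveLoadSketch {
  // Writes the node to a temporary file with an XML declaration,
  // reads it back, and removes the file again.
  def roundTrip(node: Node): Elem = {
    val path = Files.createTempFile("stud", ".xml")
    try {
      // Arguments: file name, node, encoding, write declaration, doctype
      XML.save(path.toString, node, "UTF-8", true, null)
      XML.loadFile(path.toString)
    } finally {
      Files.deleteIfExists(path)
    }
  }

  def main(args: Array[String]): Unit = {
    val st = <student><name>Rob</name><id>12</id><marks>90</marks></student>
    val loaded = roundTrip(st)
    println((loaded \ "name").text)
  }
}
```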

That’s all for XML processing in Scala programming. We will look into more Scala features in coming posts.




