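Anatomy of the Excel XML Spreadsheet File Format
This article looks at one particular Excel spreadsheet file format, the XML Spreadsheet format (XML Spreadsheet 2003), which differs from the other Excel formats. It is essentially an XML file, as opening it in a text editor such as UltraEdit shows. Because it is plain text, such a file can contain any character, including an unescaped &. Since & is a special character in XML, a file that contains it unescaped is not well-formed, which is why tools such as Aspose fail to read it.
Conversely, this gives us a quick way to generate an Excel workbook with multiple worksheets: treat the task as ordinary XML processing. A sample file: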
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Author>Gary Lim</Author>
  </DocumentProperties>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
      <Alignment ss:Vertical="Bottom"/>
      <Borders/>
      <Font/>
      <Interior/>
      <NumberFormat/>
      <Protection/>
    </Style>
    <Style ss:ID="s27" ss:Name="Hyperlink">
      <Font ss:Color="#0000FF" ss:Underline="Single"/>
    </Style>
    <Style ss:ID="s24">
      <Font x:Family="Swiss" ss:Bold="1"/>
    </Style>
    <Style ss:ID="s25">
      <Font x:Family="Swiss" ss:Italic="1"/>
    </Style>
    <Style ss:ID="s26">
      <Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
    </Style>
    <Style ss:ID="my24">
      <Font x:Family="Swiss" ss:Size="12"/>
    </Style>
    <Style ss:ID="my28">
      <Alignment ss:Horizontal="Left" ss:Vertical="Center"/>
      <Font x:Family="Swiss" ss:Size="12" ss:Bold="1"/>
    </Style>
    <Style ss:ID="my32">
      <Alignment ss:Vertical="Center"/>
      <Font x:Family="Swiss" ss:Size="12"/>
    </Style>
  </Styles>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
    <Selected/>
  </WorksheetOptions>
  <Worksheet ss:Name="Info">
    <Table>
      <Column ss:AutoFitWidth="0" ss:Width="123"/>
      <Column ss:AutoFitWidth="0" ss:Width="196.5"/>
      <Row ss:AutoFitHeight="0" ss:Height="39.75">
        <Cell ss:MergeAcross="1" ss:StyleID="my28"><Data ss:Type="String">Report Information</Data></Cell>
      </Row>
      <Row ss:Height="15">
        <Cell ss:MergeAcross="1" ss:StyleID="my32"/>
      </Row>
      <Row ss:Height="15">
        <Cell ss:StyleID="my24"><Data ss:Type="String">MoAddress</Data></Cell>
        <Cell ss:StyleID="my24"><Data ss:Type="String">SN=Eri3G,RNC=BORNC01</Data></Cell>
      </Row>
      <Row ss:Height="15">
        <Cell ss:StyleID="my24"><Data ss:Type="String">Data Date</Data></Cell>
        <Cell ss:StyleID="my24"><Data ss:Type="String">6/23/2013 3:25:00 AM</Data></Cell>
      </Row>
      <Row ss:Height="15">
        <Cell ss:StyleID="my24"><Data ss:Type="String">Report Created Date</Data></Cell>
        <Cell ss:StyleID="my24"><Data ss:Type="String">6/23/2013 4:04:01 AM</Data></Cell>
      </Row>
      <Row ss:Height="15">
        <Cell ss:StyleID="my24"><Data ss:Type="String">VendorName</Data></Cell>
        <Cell ss:StyleID="my24"><Data ss:Type="String">Ericsson</Data></Cell>
      </Row>
      <Row ss:Height="15">
        <Cell ss:StyleID="my24"><Data ss:Type="String">Data Version</Data></Cell>
        <Cell ss:StyleID="my24"><Data ss:Type="String">W10.1</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
  <Worksheet ss:Name="NodeB_Info">
    <Table>
      <Column ss:AutoFitWidth="0" ss:Width="300"/>
      <Row ss:AutoFitHeight="1" ss:Height="16">
        <Cell ss:StyleID="s24"><Data ss:Type="String">MoAddress</Data></Cell>
        <Cell ss:StyleID="s24"><Data ss:Type="String">NodeB Label</Data></Cell>
        <Cell ss:StyleID="s24"><Data ss:Type="String">NodeB Id</Data></Cell>
        <Cell ss:StyleID="s24"><Data ss:Type="String">Last Updated Date</Data></Cell>
      </Row>
      <Row ss:AutoFitHeight="1">
        <Cell><Data ss:Type="String">SN=Eri3G,RNC=BORNC01,NodeB=BA3036W</Data></Cell>
        <Cell><Data ss:Type="String">BA3036W</Data></Cell>
        <Cell><Data ss:Type="String">BA3036W</Data></Cell>
        <Cell><Data ss:Type="String">6/22/2013 5:01:05 AM</Data></Cell>
      </Row>
      <Row ss:AutoFitHeight="1">
        <Cell><Data ss:Type="String">SN=Eri3G,RNC=BORNC01,NodeB=BA3039W</Data></Cell>
        <Cell><Data ss:Type="String">BA3039W</Data></Cell>
        <Cell><Data ss:Type="String">BA3039W</Data></Cell>
        <Cell><Data ss:Type="String">6/22/2013 5:01:05 AM</Data></Cell>
      </Row>
      <Row ss:AutoFitHeight="1">
        <Cell><Data ss:Type="String">SN=Eri3G,RNC=BORNC01,NodeB=BA3040W</Data></Cell>
        <Cell><Data ss:Type="String">BA3040W</Data></Cell>
        <Cell><Data ss:Type="String">BA3040W</Data></Cell>
        <Cell><Data ss:Type="String">6/22/2013 5:01:05 AM</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
The listing above can be saved as a .txt file or as an .xml file. Once the extension is changed to .xls, Excel opens it as a workbook containing two worksheets.
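The generation idea can be sketched with the standard XmlWriter class. This is a minimal sketch, not the author's code: the class name SpreadsheetXmlWriter, the Write method, and the output file name report.xls are made up for illustration, and the output leaves out Styles, Column widths, and the other optional sections of the sample, so it has not been checked against every Excel version. Every cell is written as a String.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of the original article.
public static class SpreadsheetXmlWriter
{
    // Namespace URI copied from the sample listing.
    private const string Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Each entry is one worksheet: Key = sheet name, Value = rows of cell texts.
    public static void Write(string path, IEnumerable<KeyValuePair<string, string[][]>> sheets)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        using (XmlWriter w = XmlWriter.Create(path, settings))
        {
            w.WriteStartDocument();
            w.WriteProcessingInstruction("mso-application", "progid=\"Excel.Sheet\"");
            w.WriteStartElement("", "Workbook", Ss);          // elements use the default namespace
            w.WriteAttributeString("xmlns", "ss", null, Ss);  // attributes need the ss: prefix
            foreach (KeyValuePair<string, string[][]> sheet in sheets)
            {
                w.WriteStartElement("", "Worksheet", Ss);
                w.WriteAttributeString("ss", "Name", Ss, sheet.Key);
                w.WriteStartElement("", "Table", Ss);
                foreach (string[] row in sheet.Value)
                {
                    w.WriteStartElement("", "Row", Ss);
                    foreach (string cell in row)
                    {
                        w.WriteStartElement("", "Cell", Ss);
                        w.WriteStartElement("", "Data", Ss);
                        w.WriteAttributeString("ss", "Type", Ss, "String");
                        w.WriteString(cell);   // the writer escapes &, < and > itself
                        w.WriteEndElement();   // Data
                        w.WriteEndElement();   // Cell
                    }
                    w.WriteEndElement();       // Row
                }
                w.WriteEndElement();           // Table
                w.WriteEndElement();           // Worksheet
            }
            w.WriteEndElement();               // Workbook
            w.WriteEndDocument();
        }
    }

    public static void Main()
    {
        var sheets = new List<KeyValuePair<string, string[][]>>();
        sheets.Add(new KeyValuePair<string, string[][]>("Info",
            new[] { new[] { "VendorName", "Ericsson" } }));
        sheets.Add(new KeyValuePair<string, string[][]>("NodeB_Info",
            new[] { new[] { "MoAddress", "NodeB Label" },
                    new[] { "SN=Eri3G,RNC=BORNC01,NodeB=BA3036W", "BA3036W" } }));
        Write("report.xls", sheets);
    }
}
```

Because the text goes through WriteString, an ampersand in a cell value ends up escaped in the file, so a workbook produced this way stays well-formed.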
Reading the file:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public class TableReader
{
    private List<System.Data.DataTable> _vlstDtDestination = new List<System.Data.DataTable>();
    private TableParser _vTableParser = null;

    // One DataTable per parsed worksheet, named after the worksheet.
    public List<System.Data.DataTable> DataTableList
    {
        get { return _vlstDtDestination; }
    }

    public void Read(System.IO.StringReader strReader)
    {
        System.Xml.XmlReaderSettings xmlReaderSettings = CustomXmlReaderSetting.Create();
        using (System.Xml.XmlReader xmlReader = System.Xml.XmlReader.Create(strReader, xmlReaderSettings))
        {
            // ReadOuterXml() and Skip() already move the reader past the element,
            // so Read() is only called when the current node was not consumed.
            // Calling Read() unconditionally would skip a Worksheet that directly
            // follows another one with no whitespace in between.
            while (!xmlReader.EOF)
            {
                if (xmlReader.NodeType == System.Xml.XmlNodeType.Element && xmlReader.Name == "Worksheet")
                {
                    string currentTableName = xmlReader.GetAttribute("ss:Name");
                    if (currentTableName == "Info")
                    {
                        // The "Info" sheet is deliberately not parsed.
                        xmlReader.Skip();
                        continue;
                    }
                    string currentNode = xmlReader.ReadOuterXml();
                    Parse(new KeyValuePair<string, string>(currentTableName, currentNode));
                }
                else
                {
                    xmlReader.Read();
                }
            }
        }
    }

    #region Private's
    private void Parse(KeyValuePair<string, string> source)
    {
        if (_vTableParser == null)
        {
            _vTableParser = new TableParser(_vlstDtDestination);
        }
        _vTableParser.Parse(source);
    }
    #endregion
}
public class TableParser
{
    private List<System.Data.DataTable> _vlstDtDestination = null;

    public TableParser(List<System.Data.DataTable> dtList)
    {
        _vlstDtDestination = dtList;
    }

    // source.Key is the worksheet name, source.Value the outer XML of that <Worksheet>.
    public void Parse(KeyValuePair<string, string> source)
    {
        // A worksheet without a name cannot be mapped to a DataTable.
        if (string.IsNullOrWhiteSpace(source.Key))
        {
            return;
        }
        // Reuse the table if this worksheet name was seen before.
        System.Data.DataTable currentDt = null;
        foreach (System.Data.DataTable dt in _vlstDtDestination)
        {
            if (dt.TableName == source.Key)
            {
                currentDt = dt;
                break;
            }
        }
        if (currentDt == null)
        {
            currentDt = new System.Data.DataTable(source.Key);
            _vlstDtDestination.Add(currentDt);
        }
        if (string.IsNullOrWhiteSpace(source.Value))
        {
            return;
        }
        System.Xml.XmlReaderSettings xmlReaderSettings = CustomXmlReaderSetting.Create();
        using (System.IO.StringReader reader = new System.IO.StringReader(source.Value))
        using (System.Xml.XmlReader xmlReader = System.Xml.XmlReader.Create(reader, xmlReaderSettings))
        {
            System.Data.DataRow currentDr = null;
            bool isValued = false;       // true once at least one data cell has been read
            bool isConstructed = false;  // true once the first (header) row has defined the columns
            int index = 0;               // column index of the next cell in the current row
            while (xmlReader.Read())
            {
                switch (xmlReader.NodeType)
                {
                    case System.Xml.XmlNodeType.Element:
                        if (xmlReader.Name == "Row")
                        {
                            // A new row starts: store the previous data row, if any.
                            if (isValued)
                            {
                                currentDt.Rows.Add(currentDr);
                            }
                            currentDr = currentDt.NewRow();
                            index = 0;
                        }
                        if (xmlReader.Name == "Data")
                        {
                            if (!isConstructed)
                            {
                                // Header row: each cell text becomes a column name.
                                currentDt.Columns.Add(xmlReader.ReadString(), typeof(System.String));
                            }
                            else
                            {
                                currentDr[index++] = xmlReader.ReadString();
                                isValued = true;
                            }
                        }
                        break;
                    case System.Xml.XmlNodeType.EndElement:
                        // The end of the first row closes the header.
                        if (xmlReader.Name == "Row" && !isConstructed)
                        {
                            isConstructed = true;
                        }
                        break;
                }
            }
            // Store the last data row.
            if (isValued)
            {
                currentDt.Rows.Add(currentDr);
            }
        }
    }
}
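public class CustomXmlReaderSetting
{
    public static System.Xml.XmlReaderSettings Create()
    {
        System.Xml.XmlReaderSettings xmlReaderSettings = new System.Xml.XmlReaderSettings();
        // DtdProcessing.Parse replaces the obsolete ProhibitDtd = false (.NET 4.0 and later).
        xmlReaderSettings.DtdProcessing = System.Xml.DtdProcessing.Parse;
        // Do not reject characters outside the legal XML character range.
        xmlReaderSettings.CheckCharacters = false;
        return xmlReaderSettings;
    }
}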
public class StreamFilter
{
    // Loads the file and escapes raw ampersands so that the XML parser accepts it.
    public static System.IO.StringReader Filter(string path)
    {
        StringBuilder builder = new StringBuilder();
        using (System.IO.FileStream fileStream = new System.IO.FileStream(
            path,
            System.IO.FileMode.Open,
            System.IO.FileAccess.Read,
            System.IO.FileShare.ReadWrite
        ))
        using (System.IO.StreamReader streamReader = new System.IO.StreamReader(fileStream))
        {
            string strLine;
            // Loop on the ReadLine() result: testing EndOfStream after the read
            // would drop the last line of the file.
            while ((strLine = streamReader.ReadLine()) != null)
            {
                // Note: this also re-escapes an ampersand that already starts
                // an entity such as &amp; or &lt;.
                strLine = strLine.Replace("&", "&amp;");
                builder.AppendFormat("{0}\n", strLine);
            }
        }
        return new System.IO.StringReader(builder.ToString());
    }
}
public class NameHandler
{
    // Turns an arbitrary name into an identifier made of letters, digits and single underscores.
    public static string ConvertName(string name)
    {
        string result = name.Trim();
        result = Regex.Replace(result, "[^a-zA-Z\\d]", "_");
        result = Regex.Replace(result, "_{2,}", "_");
        if (result.StartsWith("_")) result = result.Substring(1, result.Length - 1);
        if (result.EndsWith("_")) result = result.Substring(0, result.Length - 1);
        return result;
    }
}
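StreamFilter escapes every ampersand, so a file that mixes raw ampersands with entities that are already escaped gets those entities escaped twice. A possible refinement, sketched below, escapes only ampersands that do not start one of the five predefined XML entities or a numeric character reference. The class name AmpersandEscaper is an assumption made for this example and is not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original article.
public static class AmpersandEscaper
{
    // Matches an ampersand that is NOT followed by a predefined entity name
    // or a decimal/hexadecimal character reference ending in ';'.
    private static readonly Regex BareAmpersand = new Regex(
        @"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    public static string Escape(string text)
    {
        return BareAmpersand.Replace(text, "&amp;");
    }

    public static void Main()
    {
        // The raw ampersand in "R&D" is escaped; "&amp;" and "&#38;" stay as they are.
        Console.WriteLine(Escape("R&D &amp; QA &#38; done"));
        // prints: R&amp;D &amp; QA &#38; done
    }
}
```

Inside StreamFilter.Filter, the call strLine.Replace(...) could be swapped for AmpersandEscaper.Escape(strLine) if the input files are known to mix both forms.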
The XML itself is broken up into several major sections:
Workbook: Root node of the XML, parent to all the other sections.
DocumentProperties: Most of the information one sees under File -> Properties is defined here.
Styles: Formatting information defined here is available to rows, columns and specific cells in the worksheets.
Worksheet(s): As many worksheets as you want are defined here.
Worksheets consist of two major sections:
Table: This is where all visible data in the spreadsheet is stored.
WorksheetOptions: Global options for the worksheet itself.
Finally, the Table section defines two components:
Column
Row / Cell
With this information in hand, one can plan out how to create an XML document that will load into Microsoft Excel and look, feel and operate like a document created by a sophisticated, loving user.
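The nesting described above (Workbook, Worksheet, Table, Row, Cell, Data) can be walked with LINQ to XML. The sketch below is illustrative only: the class name StructureDump and its Describe method are invented for this example, and the inline document is a cut-down version of the sample listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original article.
public static class StructureDump
{
    private static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Returns one line per worksheet and one line per row.
    public static List<string> Describe(string xml)
    {
        List<string> lines = new List<string>();
        XDocument doc = XDocument.Parse(xml);
        foreach (XElement sheet in doc.Root.Elements(Ss + "Worksheet"))
        {
            lines.Add("Worksheet: " + (string)sheet.Attribute(Ss + "Name"));
            XElement table = sheet.Element(Ss + "Table");
            if (table == null) continue;   // a worksheet without data
            foreach (XElement row in table.Elements(Ss + "Row"))
            {
                // An empty <Cell/> has no <Data> child and yields an empty string.
                string[] cells = row.Elements(Ss + "Cell")
                    .Select(c => (string)c.Element(Ss + "Data") ?? "")
                    .ToArray();
                lines.Add("  Row: " + string.Join(" | ", cells));
            }
        }
        return lines;
    }

    public static void Main()
    {
        string xml =
            "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"" +
            " xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">" +
            "<Worksheet ss:Name=\"Info\"><Table>" +
            "<Row><Cell><Data ss:Type=\"String\">VendorName</Data></Cell>" +
            "<Cell><Data ss:Type=\"String\">Ericsson</Data></Cell></Row>" +
            "</Table></Worksheet></Workbook>";
        foreach (string line in Describe(xml))
        {
            Console.WriteLine(line);
        }
        // prints:
        // Worksheet: Info
        //   Row: VendorName | Ericsson
    }
}
```

Element and attribute names are looked up by namespace rather than by the literal "ss:" prefix, so the walk works whichever prefix a given file happens to use.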
Original source: http://www.6excel.com/doc/20035 (please credit the source when reposting)
Reposted from: https://my.oschina.net/yidongkaifa/blog/322655