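1. Development environment: JDK 7, POI 3.13

As shown in the figure:

The jar highlighted in red has to be downloaded separately. It provides the interface for processing XML with the SAX mechanism, which POI implements (that is how I remember it; treat this as a rough note).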

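2. Background and requirements:

The front end uploads an Excel file in xlsx format (more than 100,000 rows, 20 columns per row). The file is saved on the server and then displayed page by page in the browser. Rows that violate the rules are highlighted in red or yellow, and clicking the process button inserts all the data into the database.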

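3. Overall considerations:

The DOM-style XSSFWorkbook does not work here. In my own tests, with 20 columns per row, it ran out of memory before reaching 10,000 rows, so I switched to POI's SAX mode. The module described above was built with Spring MVC 4 + jQuery EasyUI 1.4.1 + Hibernate 4 + MySQL 5 + JDK 7. I cannot post all of it, so for future reference I am recording only the key part: parsing with POI's SAX mode.

Simplified requirement: parse an xlsx Excel file stored on the server (containing a large amount of data) so that other code can use it. One point to stress: because the data set is large, we do not read all of it into memory at once. We read it page by page. Paging here means the same thing as client-side paging, except that a page is not displayed on the client but held in memory for other purposes, such as inserting into the DB page by page or validating page by page.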
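The page-by-page idea can be sketched as follows. This is a minimal illustration, assuming a fixed page size and a total row count that is already known; it only computes the 1-based, inclusive row windows that a reader would be asked for. The names `PageRanges` and `ranges` are hypothetical and do not appear in the original code.

```java
import java.util.ArrayList;
import java.util.List;

public class PageRanges {

    /** Returns 1-based inclusive {startRow, endRow} pairs that together cover totalRows. */
    public static List<int[]> ranges(int totalRows, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<int[]> pages = new ArrayList<int[]>();
        for (int start = 1; start <= totalRows; start += pageSize) {
            // The last page may be shorter than pageSize
            int end = Math.min(start + pageSize - 1, totalRows);
            pages.add(new int[] { start, end });
        }
        return pages;
    }

    public static void main(String[] args) {
        // prints 1-10, 11-20, 21-25 on separate lines
        for (int[] page : ranges(25, 10)) {
            System.out.println(page[0] + "-" + page[1]);
        }
    }
}
```

Each pair can then be passed as the start and end row of one read, so that only one page of rows is held in memory at a time.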

4. Code:

4.1. First, the example from the official POI site, which is where I started:

http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api

Here is a copy of the code as a backup:

/* ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
==================================================================== */
package org.apache.poi.xssf.eventusermodel.examples;

import java.io.InputStream;
import java.util.Iterator;

import org.apache.poi.xssf.eventusermodel.XLSX2CSV;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * XSSF and SAX (Event API) basic example.
 * See {@link XLSX2CSV} for a fuller example of doing
 *  XSLX processing with the XSSF Event code.
 */
public class FromHowTo {
    public void processFirstSheet(String filename) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();

        XMLReader parser = fetchSheetParser(sst);

        // To look up the Sheet Name / Sheet Order / rID,
        //  you need to process the core Workbook stream.
        // Normally it's of the form rId# or rSheet#
        InputStream sheet2 = r.getSheet("rId2");
        InputSource sheetSource = new InputSource(sheet2);
        parser.parse(sheetSource);
        sheet2.close();
    }

    public void processAllSheets(String filename) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();

        XMLReader parser = fetchSheetParser(sst);

        Iterator<InputStream> sheets = r.getSheetsData();
        while(sheets.hasNext()) {
            System.out.println("Processing new sheet:\n");
            InputStream sheet = sheets.next();
            InputSource sheetSource = new InputSource(sheet);
            parser.parse(sheetSource);
            sheet.close();
            System.out.println("");
        }
    }

    public XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
        XMLReader parser =
            XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
        ContentHandler handler = new SheetHandler(sst);
        parser.setContentHandler(handler);
        return parser;
    }

    /**
     * See org.xml.sax.helpers.DefaultHandler javadocs
     */
    private static class SheetHandler extends DefaultHandler {
        private SharedStringsTable sst;
        private String lastContents;
        private boolean nextIsString;

        private SheetHandler(SharedStringsTable sst) {
            this.sst = sst;
        }

        public void startElement(String uri, String localName, String name,
                Attributes attributes) throws SAXException {
            // c => cell
            if(name.equals("c")) {
                // Print the cell reference
                System.out.print(attributes.getValue("r") + " - ");
                // Figure out if the value is an index in the SST
                String cellType = attributes.getValue("t");
                if(cellType != null && cellType.equals("s")) {
                    nextIsString = true;
                } else {
                    nextIsString = false;
                }
            }
            // Clear contents cache
            lastContents = "";
        }

        public void endElement(String uri, String localName, String name)
                throws SAXException {
            // Process the last contents as required.
            // Do now, as characters() may be called more than once
            if(nextIsString) {
                int idx = Integer.parseInt(lastContents);
                lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
                nextIsString = false;
            }

            // v => contents of a cell
            // Output after we've seen the string contents
            if(name.equals("v")) {
                System.out.println(lastContents);
            }
        }

        public void characters(char[] ch, int start, int length)
                throws SAXException {
            lastContents += new String(ch, start, length);
        }
    }

    public static void main(String[] args) throws Exception {
        FromHowTo howto = new FromHowTo();
        howto.processFirstSheet(args[0]);
        howto.processAllSheets(args[0]);
    }
}
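To make the event flow easier to follow without a real workbook, here is a minimal sketch that feeds a hand-written fragment of sheet XML to the JDK's built-in SAX parser. It is an illustration only: the shared strings table is replaced by a plain `List<String>` stand-in, the class name `SheetXmlDemo` is hypothetical, and real sheet XML carries namespaces and more attributes than shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SheetXmlDemo {

    /** Parses simplified sheet XML and returns "ref=value" for every cell that has a <v> element. */
    public static List<String> parse(String xml, final List<String> sharedStrings) throws Exception {
        final List<String> out = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String ref;
            private boolean isSharedString;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("c".equals(qName)) {
                    ref = attrs.getValue("r");
                    // t="s" means the <v> content is an index into the shared strings
                    isSharedString = "s".equals(attrs.getValue("t"));
                }
                text.setLength(0); // clear the text buffer for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one element, so append
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("v".equals(qName)) {
                    String value = text.toString();
                    if (isSharedString) {
                        value = sharedStrings.get(Integer.parseInt(value));
                    }
                    out.add(ref + "=" + value);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // B1 and C1 are blank, so they simply do not appear in the XML
        String xml = "<sheetData><row r=\"1\">"
                + "<c r=\"A1\"><v>123</v></c>"
                + "<c r=\"D1\" t=\"s\"><v>0</v></c>"
                + "</row></sheetData>";
        System.out.println(parse(xml, Arrays.asList("qwe"))); // prints [A1=123, D1=qwe]
    }
}
```

Because blank cells produce no `c` element at all, the handler never sees them; it only receives events for A1 and D1.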

---------------------------------------- lazy divider ----------------------------------------

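I wrote two modes.

Mode 1 filters out blank cells. It takes only small changes to the official example, but it has a drawback: SAX parsing silently skips blank cells and moves on to store the next cell that actually has a value. For example, if the four cells A1, B1, C1, D1 in one row hold [123], [blank], [blank], [qwe], the row is stored as {123, qwe}. My requirements do not allow that, so I wrote a second mode.

Mode 2 stores blank cells as null. For the same four cells A1, B1, C1, D1 holding [123], [blank], [blank], [qwe], the row is stored as {123, null, null, qwe}.

Note: a blank cell here means only a cell that has never been edited. A cell whose value is one or more spaces is not a blank cell.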
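One way to get the mode 2 result is to convert each cell reference into a column index and pad the gaps with null. The sketch below illustrates that idea under stated assumptions and is not the code used in this post: `CellGapFiller`, `columnIndex` and `fillGaps` are hypothetical names, and the row is assumed to arrive as two parallel lists that hold only its non-blank cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CellGapFiller {

    /** Converts the letters of a cell reference to a 0-based column index ("A1" -> 0, "AA3" -> 26). */
    public static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = cellRef.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // the row digits start here
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        return col - 1;
    }

    /** Rebuilds a row from its non-blank cells, inserting null for every skipped column. */
    public static List<String> fillGaps(List<String> refs, List<String> values) {
        List<String> row = new ArrayList<String>();
        for (int i = 0; i < refs.size(); i++) {
            int col = columnIndex(refs.get(i));
            while (row.size() < col) {
                row.add(null); // blank cell between two stored cells
            }
            row.add(values.get(i));
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> refs = Arrays.asList("A1", "D1");
        List<String> values = Arrays.asList("123", "qwe");
        System.out.println(fillGaps(refs, values)); // prints [123, null, null, qwe]
    }
}
```

Working with absolute column indexes also covers blank cells at the start of a row and references with more than one letter. Blank cells after the last stored cell would still need a known column count to pad.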

4.2 Mode 1 code:

package office;

/* ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
==================================================================== */

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

import org.apache.commons.lang.StringUtils;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * XSSF and SAX (Event API) basic example.
 * See {@link XLSX2CSV} for a fuller example of doing
 *  XSLX processing with the XSSF Event code.
 */
public class MyExcel2007ForPaging {
    public List<List<String>> dataList = new ArrayList<List<String>>();
    public final int startRow;
    public final int endRow;
    private int currentRow = 0;
    private final String filename;
    private List<String> rowData;

    public MyExcel2007ForPaging(String filename, int startRow, int endRow) throws Exception {
        if (StringUtils.isBlank(filename)) throw new Exception("File name must not be blank");
        this.filename = filename;
        this.startRow = startRow;
        this.endRow = endRow;
        processFirstSheet();
    }

    /**
     * Reads the first sheet.
     * @throws Exception
     */
    private void processFirstSheet() throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();

        XMLReader parser = fetchSheetParser(sst);

        // To look up the Sheet Name / Sheet Order / rID,
        //  you need to process the core Workbook stream.
        // Normally it's of the form rId# or rSheet#
        InputStream sheet1 = r.getSheet("rId1");
        InputSource sheetSource = new InputSource(sheet1);
        parser.parse(sheetSource);
        sheet1.close();
    }

    private XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
        XMLReader parser =
            XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
        ContentHandler handler = new PagingHandler(sst);
        parser.setContentHandler(handler);
        return parser;
    }

    /**
     * See org.xml.sax.helpers.DefaultHandler javadocs
     */
    private class PagingHandler extends DefaultHandler {
        private SharedStringsTable sst;
        private String lastContents;
        private boolean nextIsString;

        private PagingHandler(SharedStringsTable sst) {
            this.sst = sst;
        }

        /**
         * Called at the start of each element (each cell).
         */
        @Override
        public void startElement(String uri, String localName, String name,
                Attributes attributes) throws SAXException {
            // c => cell
            if (name.equals("c")) {
                // Print the cell reference
                // System.out.print(attributes.getValue("r") + " - ");
                String index = attributes.getValue("r");
                // A reference in column A marks the start of a new row
                if (Pattern.compile("^A[0-9]+$").matcher(index).find()) {
                    // Store the previous row
                    if (rowData != null && isAccess() && !rowData.isEmpty()) {
                        dataList.add(rowData);
                    }
                    rowData = new ArrayList<String>(); // a new row starts with an empty list
                    currentRow++; // advance the current row
                    System.out.println(currentRow);
                }
                if (isAccess()) {
                    // Figure out if the value is an index in the SST
                    String cellType = attributes.getValue("t");
                    if (cellType != null && cellType.equals("s")) {
                        nextIsString = true;
                    } else {
                        nextIsString = false;
                    }
                }
            }
            // Clear contents cache
            lastContents = "";
        }

        /**
         * Called at the end of each element (each cell).
         */
        @Override
        public void endElement(String uri, String localName, String name)
                throws SAXException {
            if (isAccess()) {
                // Process the last contents as required.
                // Do now, as characters() may be called more than once
                if (nextIsString) {
                    int idx = Integer.parseInt(lastContents);
                    lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
                    nextIsString = false;
                }

                // v => contents of a cell
                // Output after we've seen the string contents
                if (name.equals("v")) {
                    // System.out.println(lastContents);
                    rowData.add(lastContents);
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            if (isAccess()) {
                lastContents += new String(ch, start, length);
            }
        }

        /**
         * If the document ends while the current row is still inside the
         * requested range, store that row.
         * (When the last requested row is also the last row of the document,
         * it would otherwise never be stored: there is no following row, so
         * startElement() is not triggered again. Specifying a maximum column
         * would also work, but I avoided that because it does not extend well.)
         */
        @Override
        public void endDocument() throws SAXException {
            if (rowData != null && isAccess() && !rowData.isEmpty()) {
                dataList.add(rowData);
                System.out.println("--end");
            }
        }
    }

    /** Returns true while the current row lies inside [startRow, endRow]. */
    private boolean isAccess() {
        if (currentRow >= startRow && currentRow <= endRow) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        MyExcel2007ForPaging reader = new MyExcel2007ForPaging("E:/weld_small.xlsx", 15, 100);
        System.out.println("\n---" + reader.dataList);
    }
}

4.3 Mode 2 code:

package office;

/* ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
==================================================================== */

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

import org.apache.commons.lang.StringUtils;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * XSSF and SAX (Event API) basic example.
 * See {@link XLSX2CSV} for a fuller example of doing
 *  XSLX processing with the XSSF Event code.
 */
public class MyExcel2007ForPaging_high {
    public List<List<IndexValue>> dataList = new ArrayList<List<IndexValue>>();
    private final int startRow;
    private final int endRow;
    private int currentRow = 0;
    private final String filename;
    private List<IndexValue> rowData;

    public MyExcel2007ForPaging_high(String filename, int startRow, int endRow) throws Exception {
        if (StringUtils.isBlank(filename)) throw new Exception("File name must not be blank");
        this.filename = filename;
        this.startRow = startRow;
        this.endRow = endRow;
        processFirstSheet();
    }

    /**
     * Reads the first sheet.
     * @throws Exception
     */
    private void processFirstSheet() throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();

        XMLReader parser = fetchSheetParser(sst);

        // To look up the Sheet Name / Sheet Order / rID,
        //  you need to process the core Workbook stream.
        // Normally it's of the form rId# or rSheet#
        InputStream sheet1 = r.getSheet("rId1");
        InputSource sheetSource = new InputSource(sheet1);
        parser.parse(sheetSource);
        sheet1.close();
    }

    private XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
        XMLReader parser =
            XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
        ContentHandler handler = new PagingHandler(sst);
        parser.setContentHandler(handler);
        return parser;
    }

    /**
     * See org.xml.sax.helpers.DefaultHandler javadocs
     */
    private class PagingHandler extends DefaultHandler {
        private SharedStringsTable sst;
        private String lastContents;
        private boolean nextIsString;
        private String index = null;

        private PagingHandler(SharedStringsTable sst) {
            this.sst = sst;
        }

        /**
         * Called at the start of each element (each cell).
         */
        @Override
        public void startElement(String uri, String localName, String name,
                Attributes attributes) throws SAXException {
            // c => cell
            if (name.equals("c")) {
                // Print the cell reference
                // System.out.print(attributes.getValue("r") + " - ");
                index = attributes.getValue("r");
                // Debug output
                System.out.println(index);
                if (index.contains("N")) {
                    System.out.println("##" + attributes + "##");
                }
                // A reference in column A marks the start of a new row
                if (Pattern.compile("^A[0-9]+$").matcher(index).find()) {
                    // Store the previous row
                    if (rowData != null && isAccess() && !rowData.isEmpty()) {
                        dataList.add(rowData);
                    }
                    rowData = new ArrayList<IndexValue>(); // a new row starts with an empty list
                    currentRow++; // advance the current row
                    // System.out.println(currentRow);
                }
                if (isAccess()) {
                    // Figure out if the value is an index in the SST
                    String cellType = attributes.getValue("t");
                    if (cellType != null && cellType.equals("s")) {
                        nextIsString = true;
                    } else {
                        nextIsString = false;
                    }
                }
            }
            // Clear contents cache
            lastContents = "";
        }

        /**
         * Called at the end of each element (each cell).
         */
        @Override
        public void endElement(String uri, String localName, String name)
                throws SAXException {
            if (isAccess()) {
                // Process the last contents as required.
                // Do now, as characters() may be called more than once
                if (nextIsString) {
                    int idx = Integer.parseInt(lastContents);
                    lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
                    nextIsString = false;
                }

                // v => contents of a cell
                // Output after we've seen the string contents
                if (name.equals("v")) {
                    // System.out.println(lastContents);
                    rowData.add(new IndexValue(index, lastContents));
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            if (isAccess()) {
                lastContents += new String(ch, start, length);
            }
        }

        /**
         * If the document ends while the current row is still inside the
         * requested range, store that row.
         * (When the last requested row is also the last row of the document,
         * it would otherwise never be stored: there is no following row, so
         * startElement() is not triggered again. Specifying a maximum column
         * would also work, but I avoided that because it does not extend well.)
         */
        @Override
        public void endDocument() throws SAXException {
            if (rowData != null && isAccess() && !rowData.isEmpty()) {
                dataList.add(rowData);
                System.out.println("--end");
            }
        }
    }

    /** Returns true while the current row lies inside [startRow, endRow]. */
    private boolean isAccess() {
        if (currentRow >= startRow && currentRow <= endRow) {
            return true;
        }
        return false;
    }

    /** A cell value together with its cell reference (for example "D1"). */
    private class IndexValue {
        String v_index;
        String v_value;

        public IndexValue(String v_index, String v_value) {
            super();
            this.v_index = v_index;
            this.v_value = v_value;
        }

        @Override
        public String toString() {
            return "IndexValue [v_index=" + v_index + ", v_value=" + v_value + "]";
        }

        /**
         * Column distance from p to this cell. Only handled when both column
         * names have the same length and differ in the last letter only;
         * every other case (for example Z followed by AA) returns -1.
         */
        public int getLevel(IndexValue p) {
            char[] other = p.v_index.replaceAll("[0-9]", "").toCharArray();
            char[] self = this.v_index.replaceAll("[0-9]", "").toCharArray();
            if (other.length != self.length) return -1;
            for (int i = 0; i < other.length; i++) {
                if (i == other.length - 1) {
                    return self[i] - other[i];
                } else {
                    if (self[i] != other[i]) {
                        return -1;
                    }
                }
            }
            return -1;
        }
    }

    /**
     * Returns the real data, with blank cells between stored cells filled in as null.
     * @return one list of cell values per row
     * @throws Exception if two neighbouring cells are too far apart for getLevel()
     */
    public List<List<String>> getMyDataList() throws Exception {
        List<List<String>> myDataList = new ArrayList<List<String>>();
        if (dataList == null || dataList.size() <= 0) return myDataList;
        for (int i = 0; i < dataList.size(); i++) {
            List<IndexValue> i_list = dataList.get(i);
            List<String> tem = new ArrayList<String>();
            int j = 0;
            for (; j < i_list.size() - 1; j++) {
                // Take the current value and store it
                IndexValue current = i_list.get(j);
                tem.add(current.v_value);
                // Look ahead to the next stored cell
                IndexValue next = i_list.get(j + 1);
                // Column distance between the two cells
                int level = next.getLevel(current);
                if (level <= 0) throw new Exception("Beyond the supported range");
                for (int k = 0; k < level - 1; k++) {
                    tem.add(null);
                }
            }
            tem.add(i_list.get(j).v_value);
            myDataList.add(tem);
        }
        return myDataList;
    }

    public static void main(String[] args) throws Exception {
        /*
        System.out.println('O'-'M');
        System.out.println("O12".hashCode()-"M12".hashCode());
        String str = "ggg";
        char[] bm;
        bm = str.toCharArray();
        str = String.valueOf(bm);
        String p1 = "OOM123".replaceAll("[0-9]", "");
        String p2 = "OOO123".replaceAll("[0-9]", "");
        System.out.println(p1.hashCode()-p2.hashCode());
        */
        /*
        List<String> list = new ArrayList<String>();
        list.add("a");
        list.add(null);
        list.add("b");
        list.add("");
        list.add("c");
        list.add(" ");
        System.out.println(list);
        System.out.println(list.get(1));
        System.out.println(null=="null");
        System.out.println("null".equals(null));
        */
        MyExcel2007ForPaging_high reader =
            new MyExcel2007ForPaging_high("E:/Y02U_CWS-920-2006_01-R01.xlsx", 1, 100);
        System.out.println("\n---" + reader.getMyDataList());
    }
}

4.4 I also wrote a helper class that quickly computes the total number of rows in an Excel 2007 file. Backing it up here:

package office;

/* ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
==================================================================== */

import java.io.InputStream;
import java.util.regex.Pattern;

import org.apache.commons.lang.StringUtils;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * XSSF and SAX (Event API) basic example.
 * See {@link XLSX2CSV} for a fuller example of doing
 *  XSLX processing with the XSSF Event code.
 */
public class MyExcel2007ForMaxRow {
    // new add
    public long maxRow = 0; // total number of rows
    private String filename = null;

    public MyExcel2007ForMaxRow(String filename) throws Exception {
        if (StringUtils.isBlank(filename)) throw new Exception("File name must not be blank");
        this.filename = filename;
        processFirstSheet();
    }

    /**
     * Reads the first sheet.
     * @throws Exception
     */
    private void processFirstSheet() throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();

        XMLReader parser = fetchSheetParser(sst);

        // To look up the Sheet Name / Sheet Order / rID,
        //  you need to process the core Workbook stream.
        // Normally it's of the form rId# or rSheet#
        InputStream sheet1 = r.getSheet("rId1");
        InputSource sheetSource = new InputSource(sheet1);
        parser.parse(sheetSource);
        sheet1.close();
    }

    private XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
        XMLReader parser =
            XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
        ContentHandler handler = new MaxRowHandler();
        parser.setContentHandler(handler);
        return parser;
    }

    /**
     * See org.xml.sax.helpers.DefaultHandler javadocs
     */
    private class MaxRowHandler extends DefaultHandler {
        @Override
        public void startElement(String uri, String localName, String name,
                Attributes attributes) throws SAXException {
            // c => cell
            if (name.equals("c")) {
                String index = attributes.getValue("r");
                // Count one row per cell in column A. The pattern is anchored with ^
                // so that references such as AA1 or BA1 are not counted as well.
                if (Pattern.compile("^A[0-9]+$").matcher(index).find()) {
                    maxRow++;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        MyExcel2007ForMaxRow reader = new MyExcel2007ForMaxRow("E:/welding_small.xlsx");
        System.out.println("\n---" + reader.maxRow);
    }
}
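Counting cells in column A misses any row whose A cell is blank. A possible alternative is to count `row` elements instead. The following is a minimal sketch, assuming the sheet XML wraps each row's cells in a `row` element, as SpreadsheetML does. It runs on an in-memory XML string with the JDK's built-in parser rather than on a real workbook, and `RowCounter` is a hypothetical name. It counts the rows present in the XML, which is not necessarily the highest row number in the sheet.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class RowCounter {

    /** Counts the row elements in a simplified, namespace-free sheet XML string. */
    public static int countRows(String sheetXml) throws Exception {
        final int[] count = { 0 };
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("row".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(sheetXml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // The second row has no cell in column A but is still counted
        String xml = "<sheetData>"
                + "<row r=\"1\"><c r=\"A1\"><v>1</v></c></row>"
                + "<row r=\"2\"><c r=\"B2\"><v>2</v></c></row>"
                + "</sheetData>";
        System.out.println(countRows(xml)); // prints 2
    }
}
```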

May 8, 2019: this code is outdated. I recommend using the latest version.
