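When a downloaded log file (plain text) reaches tens of megabytes, opening it directly in a text editor such as Notepad++ can freeze the editor. So I wrote TXTSpliterEqualBytes.java, a text-splitting tool that divides a file into parts of equal byte count, for example 10 parts (a 50 MB file then yields ten sub-files of 5 MB each).

TXTSpliterEqualBytes can run into a problem, though: from the N-th sub-file onward, the content may show up as garbled text. The cause is that a split at an arbitrary byte offset can land in the middle of a character that occupies more than one byte. So I also wrote TXTSpliterEqualChars.java, which splits by character count instead (for example, a file of 10 million characters yields sub-files of 1 million characters each).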
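To make the failure mode concrete, here is a minimal sketch that cuts a UTF-8 byte array inside a multi-byte character and then moves the cut back to the start of that character. It assumes the input is UTF-8, where continuation bytes always have the bit pattern 10xxxxxx; the class `Utf8Boundary` and the method `alignToCharStart` are hypothetical names used only for illustration and are not part of the two tools in this post.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Boundary {

    /**
     * Moves a cut position backward until it no longer points at a UTF-8
     * continuation byte (bit pattern 10xxxxxx), i.e. until it sits on the
     * first byte of a character. Positions 0 and data.length are returned as-is.
     */
    static int alignToCharStart(byte[] data, int pos) {
        while (pos > 0 && pos < data.length && (data[pos] & 0xC0) == 0x80) {
            pos--;
        }
        return pos;
    }

    public static void main(String[] args) {
        // "ab", two Chinese characters (3 bytes each in UTF-8), then "cd": 10 bytes in total.
        byte[] data = "ab\u4e2d\u6587cd".getBytes(StandardCharsets.UTF_8);

        int naive = 4;                              // falls inside the first 3-byte character
        int safe = alignToCharStart(data, naive);   // moves back to offset 2

        // The naive cut leaves a truncated character, which decodes to U+FFFD.
        System.out.println(new String(data, 0, naive, StandardCharsets.UTF_8));
        // The aligned cut decodes cleanly.
        System.out.println(new String(data, 0, safe, StandardCharsets.UTF_8)); // prints "ab"
    }
}
```

A byte-based splitter could apply the same check to each offset before writing a part, at the cost of parts that differ in size by a few bytes. This only works for self-synchronizing encodings such as UTF-8; for GBK a byte cannot be classified without scanning from a known character start.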

Download: https://download.csdn.net/download/shushanke/86923522
----------------------------------------------------------------

TXTSpliterEqualBytes


import java.io.BufferedReader;
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.CharacterCodingException;
import java.text.DecimalFormat;
/*
Compile and run:
    javac -d . -encoding UTF-8 TXTSpliterEqualBytes.java
    java TXTSpliterEqualBytes

Text splitter (equal byte counts). The output files may be garbled: if a character
occupies more than one byte, its bytes can end up split across two files.
*/
public class TXTSpliterEqualBytes {
    private static final String dirPath = ".";   // current directory
    private static int NUMBER_OF_FILES = 10;     // split into N parts
    private static String absoluteDirPath = "";
    private static String originalFileName = ""; // the original file
    private static DecimalFormat format;
    private static java.util.LinkedHashSet<String> suffixSetOfTXTFile = new java.util.LinkedHashSet<String>();

    static {
        suffixSetOfTXTFile.add(".log");
        suffixSetOfTXTFile.add(".LOG");
        suffixSetOfTXTFile.add(".txt");
        suffixSetOfTXTFile.add(".TXT");
        suffixSetOfTXTFile.add(".text");
        suffixSetOfTXTFile.add(".TEXT");
        // Chosen once from the default NUMBER_OF_FILES (10), so part numbers are padded to two digits.
        if (NUMBER_OF_FILES < 10) {
            format = new DecimalFormat("0");
        } else if (NUMBER_OF_FILES < 100) {
            format = new DecimalFormat("00");
        } else if (NUMBER_OF_FILES < 1000) {
            format = new DecimalFormat("000");
        }
        getabsoluteDirPath(); // resolve the absolute path of the current directory
        findTXTFile();        // find the first text file in the current directory
    }

    private static String getabsoluteDirPath() {
        if ("".equals(absoluteDirPath)) {
            File dir = new File(dirPath);
            absoluteDirPath = dir.getAbsolutePath();
            // The absolute path of "." ends with ".", so drop that last character.
            absoluteDirPath = absoluteDirPath.substring(0, absoluteDirPath.length() - 1);
            if (!absoluteDirPath.endsWith(File.separator)) {
                absoluteDirPath += File.separator;
            }
        }
        return absoluteDirPath;
    }

    private static String findTXTFile() {
        File dir = new File(absoluteDirPath);
        boolean findTXT = false;
        for (File file : dir.listFiles()) {
            if (file.isFile()) {
                String fileName = file.getName();
                int index = fileName.lastIndexOf(".");
                if (index < 1) {
                    continue;
                }
                String suffix = fileName.substring(index, fileName.length());
                if (suffixSetOfTXTFile.contains(suffix)) {
                    originalFileName = fileName;
                    findTXT = true;
                    break;
                }
            }
        }
        if (!findTXT) {
            String tipMsg = "ERROR: put the text file to split " + suffixSetOfTXTFile.toString() + " in the current directory!";
            System.out.println(tipMsg);
            throw new RuntimeException(tipMsg);
        }
        return absoluteDirPath;
    }

    public static void closeCloseable(Closeable closeable) {
        try {
            if (closeable != null) {
                closeable.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static boolean split() {
        boolean success = false;
        if (NUMBER_OF_FILES < 2) {
            System.out.println("The number of output files must be at least 2!");
            return success;
        }
        String filePath = absoluteDirPath + originalFileName; // absolute path of the file
        File originalFile = new File(filePath);
        long sizeTotal = originalFile.length();
        long sizeEach = sizeTotal / NUMBER_OF_FILES;
        long remainder = sizeTotal % NUMBER_OF_FILES;
        long[] sizeArray = new long[NUMBER_OF_FILES];
        for (int i = 0; i < NUMBER_OF_FILES; i++) {
            sizeArray[i] = sizeEach;
        }
        sizeArray[NUMBER_OF_FILES - 1] = sizeEach + remainder; // the last part takes the remainder
        FileChannel inChannel = null;
        try {
            int index = originalFileName.lastIndexOf(".");
            String fileName = originalFileName.substring(0, index);
            String suffix = originalFileName.substring(index, originalFileName.length());
            StringBuilder sb = new StringBuilder();
            inChannel = new FileInputStream(originalFile).getChannel();
            long offset = 0;
            for (int i = 0; i < NUMBER_OF_FILES; i++) {
                sb.setLength(0);
                sb.append(fileName).append("_").append(format.format(i + 1)).append(suffix);
                String newFileName = absoluteDirPath + sb.toString();
                long byteNum = sizeArray[i];
                // Map this part's byte range of the source file into a ByteBuffer.
                MappedByteBuffer buffer = inChannel.map(FileChannel.MapMode.READ_ONLY, offset, byteNum);
                offset += byteNum;
                // Open the output file and close it as soon as this part is written,
                // instead of leaving every output channel but the last one open.
                try (FileChannel outChannel = new FileOutputStream(newFileName).getChannel()) {
                    // write() may write fewer bytes than requested, so loop until the buffer is drained.
                    while (buffer.hasRemaining()) {
                        outChannel.write(buffer);
                    }
                }
            } // end of for-loop
            success = true;
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (CharacterCodingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            closeCloseable(inChannel);
        }
        return success;
    }

    public static void main(String... args) throws Exception {
        System.out.println("① Type exit and press Enter to quit.");
        System.out.println("② Type an integer N greater than 1 and press Enter to split the text into N parts.");
        // try-with-resources
        try (BufferedReader bufReader = new BufferedReader(new InputStreamReader(System.in))) {
            String line = null;
            while ((line = bufReader.readLine()) != null) {
                System.out.println("Input received: " + line);
                if (line.equalsIgnoreCase("exit")) {
                    break;
                } else {
                    try {
                        int count = Integer.parseInt(line);
                        if (count < 2) {
                            System.out.println("Please enter an integer greater than 1:");
                        } else {
                            NUMBER_OF_FILES = count;
                            System.out.println("The text will be split into " + NUMBER_OF_FILES + " parts");
                            long start = System.currentTimeMillis();
                            boolean success = split();
                            long end = System.currentTimeMillis();
                            if (success) {
                                System.out.println("Splitting finished, elapsed (ms)=" + (end - start));
                                break;
                            }
                        }
                    } catch (NumberFormatException e) {
                        System.out.println("Please enter an integer greater than 1:");
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

----------------------------------------------------------------

TXTSpliterEqualChars


import java.io.BufferedReader;
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.text.DecimalFormat;
/*
Compile and run:
    javac -d . -encoding UTF-8 TXTSpliterEqualChars.java
    java TXTSpliterEqualChars

Text splitter (equal character counts).
*/
public class TXTSpliterEqualChars {
    private static final String dirPath = ".";   // current directory
    private static int NUMBER_OF_FILES = 10;     // split into N parts
    private static String absoluteDirPath = "";
    public static Charset CHARSET_UTF8 = Charset.forName("UTF-8"); // UTF-8 charset used to create the decoder/encoder
    public static Charset CHARSET_GBK = Charset.forName("GBK");    // GBK charset, available as an alternative
    private static String originalFileName = ""; // the original file
    private static DecimalFormat format;
    private static java.util.LinkedHashSet<String> suffixSetOfTXTFile = new java.util.LinkedHashSet<String>();

    static {
        suffixSetOfTXTFile.add(".log");
        suffixSetOfTXTFile.add(".LOG");
        suffixSetOfTXTFile.add(".txt");
        suffixSetOfTXTFile.add(".TXT");
        suffixSetOfTXTFile.add(".text");
        suffixSetOfTXTFile.add(".TEXT");
        // Chosen once from the default NUMBER_OF_FILES (10), so part numbers are padded to two digits.
        if (NUMBER_OF_FILES < 10) {
            format = new DecimalFormat("0");
        } else if (NUMBER_OF_FILES < 100) {
            format = new DecimalFormat("00");
        } else if (NUMBER_OF_FILES < 1000) {
            format = new DecimalFormat("000");
        }
        getabsoluteDirPath(); // resolve the absolute path of the current directory
        findTXTFile();        // find the first text file in the current directory
    }

    private static String getabsoluteDirPath() {
        if ("".equals(absoluteDirPath)) {
            File dir = new File(dirPath);
            absoluteDirPath = dir.getAbsolutePath();
            // The absolute path of "." ends with ".", so drop that last character.
            absoluteDirPath = absoluteDirPath.substring(0, absoluteDirPath.length() - 1);
            if (!absoluteDirPath.endsWith(File.separator)) {
                absoluteDirPath += File.separator;
            }
        }
        return absoluteDirPath;
    }

    private static String findTXTFile() {
        File dir = new File(absoluteDirPath);
        boolean findTXT = false;
        for (File file : dir.listFiles()) {
            if (file.isFile()) {
                String fileName = file.getName();
                int index = fileName.lastIndexOf(".");
                if (index < 1) {
                    continue;
                }
                String suffix = fileName.substring(index, fileName.length());
                if (suffixSetOfTXTFile.contains(suffix)) {
                    originalFileName = fileName;
                    findTXT = true;
                    break;
                }
            }
        }
        if (!findTXT) {
            String tipMsg = "ERROR: put the text file to split " + suffixSetOfTXTFile.toString() + " in the current directory!";
            System.out.println(tipMsg);
            throw new RuntimeException(tipMsg);
        }
        return absoluteDirPath;
    }

    public static void closeCloseable(Closeable closeable) {
        try {
            if (closeable != null) {
                closeable.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static boolean split() {
        boolean success = false;
        if (NUMBER_OF_FILES < 2) {
            System.out.println("The number of output files must be at least 2!");
            return success;
        }
        String filePath = absoluteDirPath + originalFileName; // absolute path of the file
        File originalFile = new File(filePath);
        long sizeTotal = originalFile.length();
        FileChannel inChannel = null;
        try {
            int index = originalFileName.lastIndexOf(".");
            String fileName = originalFileName.substring(0, index);
            String suffix = originalFileName.substring(index, originalFileName.length());
            StringBuilder sb = new StringBuilder();
            inChannel = new FileInputStream(originalFile).getChannel();
            MappedByteBuffer byteBuffer = inChannel.map(FileChannel.MapMode.READ_ONLY, 0, sizeTotal);
            // Decode the whole file from bytes to chars (the entire text is held in memory).
            CharsetDecoder decoder = CHARSET_UTF8.newDecoder();
            CharBuffer charBuffer = decoder.decode(byteBuffer);
            int limit = charBuffer.limit();    // number of chars (UTF-16 code units)
            char[] chars = charBuffer.array(); // backing array; may be longer than limit
            // The encoder turns each part back into bytes.
            CharsetEncoder encoder = CHARSET_UTF8.newEncoder();
            long charNumTotal = limit;
            long charNumEach = charNumTotal / NUMBER_OF_FILES;
            long charRemainder = charNumTotal % NUMBER_OF_FILES;
            System.out.println("byteNumTotal=" + sizeTotal);
            System.out.println("charNumTotal=" + charNumTotal + ", charNumEach=" + charNumEach + ", charRemainder=" + charRemainder);
            System.out.println("charBuffer.array().length=" + chars.length);
            int start = 0;
            for (int i = 0; i < NUMBER_OF_FILES; i++) {
                sb.setLength(0);
                sb.append(fileName).append("_").append(format.format(i + 1)).append(suffix);
                String newFileName = absoluteDirPath + sb.toString();
                // Nominal end of this part; the last part takes the remainder.
                int end = (i == NUMBER_OF_FILES - 1) ? limit : (int) ((i + 1) * charNumEach);
                // Never cut a surrogate pair in half: the encoder rejects a lone surrogate.
                if (end > start && end < limit && Character.isHighSurrogate(chars[end - 1])) {
                    end++;
                }
                int charNum = end - start;
                System.out.println("from " + start + " to " + end + ", charNum=" + charNum);
                CharBuffer cBuffer = CharBuffer.wrap(chars, start, charNum);
                start = end;
                // Encode this part's chars back into bytes.
                ByteBuffer bBuffer = encoder.encode(cBuffer);
                // Open the output file and close it as soon as this part is written,
                // instead of leaving every output channel but the last one open.
                try (FileChannel outChannel = new FileOutputStream(newFileName).getChannel()) {
                    // write() may write fewer bytes than requested, so loop until the buffer is drained.
                    while (bBuffer.hasRemaining()) {
                        outChannel.write(bBuffer);
                    }
                }
            } // end of for-loop
            success = true;
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (CharacterCodingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            closeCloseable(inChannel);
        }
        return success;
    }

    public static void main(String... args) throws Exception {
        System.out.println("① Type exit and press Enter to quit.");
        System.out.println("② Type an integer N greater than 1 and press Enter to split the text into N parts.");
        // try-with-resources
        try (BufferedReader bufReader = new BufferedReader(new InputStreamReader(System.in))) {
            String line = null;
            while ((line = bufReader.readLine()) != null) {
                System.out.println("Input received: " + line);
                if (line.equalsIgnoreCase("exit")) {
                    break;
                } else {
                    try {
                        int count = Integer.parseInt(line);
                        if (count < 2) {
                            System.out.println("Please enter an integer greater than 1:");
                        } else {
                            NUMBER_OF_FILES = count;
                            System.out.println("The text will be split into " + NUMBER_OF_FILES + " parts");
                            long start = System.currentTimeMillis();
                            boolean success = split();
                            long end = System.currentTimeMillis();
                            if (success) {
                                System.out.println("Splitting finished, elapsed (ms)=" + (end - start));
                                break;
                            }
                        }
                    } catch (NumberFormatException e) {
                        System.out.println("Please enter an integer greater than 1:");
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
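TXTSpliterEqualChars decodes the whole file into memory before splitting it. As a possible alternative, the sketch below copies a fixed number of characters per part from a `Reader` to a `Writer`, so only one character at a time is held, and it keeps surrogate pairs together. It is an illustration under assumptions, not the tool's actual method: the names `StreamingCharSplit`, `copyPart` and `split` are hypothetical, and the demo works on in-memory strings so that it runs without any files.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

final class StreamingCharSplit {

    /**
     * Copies up to charsPerPart chars from in to out. If the part would end on a
     * high surrogate, one more char is copied so the pair stays in the same part.
     * Returns the number of chars copied, which is 0 once the input is exhausted.
     */
    static long copyPart(Reader in, Writer out, long charsPerPart) throws IOException {
        long copied = 0;
        int c = -1;
        while (copied < charsPerPart && (c = in.read()) != -1) {
            out.write(c);
            copied++;
        }
        if (copied > 0 && c != -1 && Character.isHighSurrogate((char) c)) {
            int low = in.read();
            if (low != -1) {
                out.write(low);
                copied++;
            }
        }
        return copied;
    }

    /** Demo helper: splits a string into parts of about charsPerPart chars each. */
    static List<String> split(String text, long charsPerPart) {
        List<String> parts = new ArrayList<>();
        try (Reader in = new StringReader(text)) {
            while (true) {
                StringWriter out = new StringWriter();
                if (copyPart(in, out, charsPerPart) == 0) {
                    break;
                }
                parts.add(out.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("abcdef", 2));                 // prints [ab, cd, ef]
        // 'a', one emoji stored as a surrogate pair, then "bc": the pair is not cut.
        System.out.println(split("a\uD83D\uDE00bc", 2).size()); // prints 2
    }
}
```

For real files, `copyPart` could be fed a buffered reader and writer opened with an explicit charset, for example via `java.nio.file.Files.newBufferedReader` and `Files.newBufferedWriter`. Note that this variant takes a per-part character count rather than a number of parts; producing exactly N parts would need a first pass to count the characters.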

----------------------------------------------------------------

Runtime environment: JDK 1.7 or 1.8

Windows batch files (*.bat)

TXTSpliterEqualChars.bat, with the following content:
javac -d . -encoding UTF-8 TXTSpliterEqualChars.java

java TXTSpliterEqualChars

:pause

TXTSpliterEqualBytes.bat, with the following content:
javac -d . -encoding UTF-8 TXTSpliterEqualBytes.java

java TXTSpliterEqualBytes

:pause

Put the text file to be split in the same directory, then double-click the batch file.
