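Converting DOCX documents to PDF is a common project requirement. The mainstream approaches fall into two categories. The first drives an office application to do the conversion, for example Microsoft Office, WPS, or LibreOffice. The second reads the document through a language-level Office API (for example Apache POI) and then builds the PDF with a dedicated PDF generation library such as iText. In general, the office-application route preserves styling better but is comparatively slow. Microsoft Office raises licensing issues and should not be used casually (my own company was caught out on this). WPS is widely used at present, but it truncates hyperlinks: any hyperlink longer than 256 characters is cut off. LibreOffice's layout fidelity is less predictable. Reading and generating through POI performs better and suits cases where formatting requirements are not strict. There are also ready-made online and command-line tools, such as docx2pdf and OfficeToPDF.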
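To make the office-application route concrete, here is a minimal sketch that shells out to LibreOffice in headless mode. It assumes LibreOffice is installed and that its `soffice` launcher is on the PATH; the class name, method names, and the two-minute timeout are hypothetical choices for illustration, not part of the implementation described in this post.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class LibreOfficePdfExport {

    // Builds the command line. Assumption: the launcher is reachable as "soffice".
    static List<String> buildCommand(Path input, Path outputDir) {
        return List.of(
                "soffice",
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outputDir.toString(),
                input.toString());
    }

    // LibreOffice keeps the base name of the input and swaps the extension.
    static String pdfName(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return (dot > 0 ? name.substring(0, dot) : name) + ".pdf";
    }

    // Runs the conversion and returns the path where the PDF is expected.
    static Path convert(Path input, Path outputDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outputDir))
                .inheritIO()
                .start();
        // Hypothetical timeout; tune it to the size of the documents involved.
        if (!process.waitFor(2, TimeUnit.MINUTES)) {
            process.destroyForcibly();
            throw new IOException("LibreOffice did not finish within the timeout");
        }
        if (process.exitValue() != 0) {
            throw new IOException("LibreOffice exited with code " + process.exitValue());
        }
        return outputDir.resolve(pdfName(input));
    }
}
```

Calling `LibreOfficePdfExport.convert(Paths.get("report.docx"), Paths.get("out"))` would start one LibreOffice process per document, which is the main reason this route tends to be slower than an in-process library.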

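Below is a Word-to-PDF implementation built on Apache POI (through the xdocreport converter) for .docx files, with docx4j handling .doc files.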

1. Maven dependencies

<dependency>
    <groupId>args4j</groupId>
    <artifactId>args4j</artifactId>
    <version>2.32</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j</artifactId>
    <version>3.2.1</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.pdf</artifactId>
    <version>1.0.6</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.odftoolkit.odfdom.converter.pdf</artifactId>
    <version>1.0.6</version>
</dependency>
<dependency>
    <groupId>com.googlecode.jaxb-namespaceprefixmapper-interfaces</groupId>
    <artifactId>JAXBNamespacePrefixMapper</artifactId>
    <version>2.2.4</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-impl</artifactId>
    <version>2.2.11</version>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-core</artifactId>
    <version>2.2.11</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.xmlbeans/xmlbeans -->
<dependency>
    <groupId>org.apache.xmlbeans</groupId>
    <artifactId>xmlbeans</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.14</version><!--$NO-MVN-MAN-VER$-->
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.14</version><!--$NO-MVN-MAN-VER$-->
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.14</version><!--$NO-MVN-MAN-VER$-->
</dependency>

2. Implementation classes

Converter

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public abstract class Converter {

    private static final String LOADING_FORMAT = "\nLoading stream\n\n";
    private static final String PROCESSING_FORMAT = "Load completed in %1$dms, now converting...\n\n";
    private static final String SAVING_FORMAT = "Conversion took %1$dms.\n\nTotal: %2$dms\n";

    private long startTime;
    private long startOfProcessTime;

    protected InputStream inStream;
    protected OutputStream outStream;

    protected boolean showOutputMessages = false;
    protected boolean closeStreamsWhenComplete = true;

    public Converter(InputStream inStream, OutputStream outStream, boolean showMessages, boolean closeStreamsWhenComplete) {
        this.inStream = inStream;
        this.outStream = outStream;
        this.showOutputMessages = showMessages;
        this.closeStreamsWhenComplete = closeStreamsWhenComplete;
    }

    public abstract void convert() throws Exception;

    private void startTime() {
        startTime = System.currentTimeMillis();
        startOfProcessTime = startTime;
    }

    // Call at the start of convert(): starts the timers
    protected void loading() {
        sendToOutputOrNot(String.format(LOADING_FORMAT));
        startTime();
    }

    // Call once the input is loaded: reports the load time and restarts the step timer
    protected void processing() {
        long currentTime = System.currentTimeMillis();
        long prevProcessTook = currentTime - startOfProcessTime;

        sendToOutputOrNot(String.format(PROCESSING_FORMAT, prevProcessTook));

        startOfProcessTime = System.currentTimeMillis();
    }

    // Call at the end of convert(): optionally closes the streams and reports the timings
    protected void finished() {
        long currentTime = System.currentTimeMillis();
        long timeTaken = currentTime - startTime;
        long prevProcessTook = currentTime - startOfProcessTime;

        startOfProcessTime = System.currentTimeMillis();

        if (closeStreamsWhenComplete) {
            try {
                inStream.close();
                outStream.close();
            } catch (IOException e) {
                // Nothing done
            }
        }

        sendToOutputOrNot(String.format(SAVING_FORMAT, prevProcessTook, timeTaken));
    }

    private void sendToOutputOrNot(String toBePrinted) {
        if (showOutputMessages) {
            actuallySendToOutput(toBePrinted);
        }
    }

    // No-op here; override to print or log the progress messages
    protected void actuallySendToOutput(String toBePrinted) {
    }
}

DocToPDFConverter:

import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.net.URL;

import org.apache.commons.io.IOUtils;
import org.docx4j.Docx4J;
import org.docx4j.convert.in.Doc;
import org.docx4j.convert.out.FOSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.jaxb.Context;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.RFonts;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;

public class DocToPDFConverter extends Converter {

    public DocToPDFConverter(InputStream inStream, OutputStream outStream, boolean showMessages,
            boolean closeStreamsWhenComplete) {
        super(inStream, outStream, showMessages, closeStreamsWhenComplete);
    }

    @Override
    public void convert() throws Exception {
        loading();

        InputStream iStream = inStream;
        try {
            WordprocessingMLPackage wordMLPackage = getMLPackage(iStream);
            Mapper fontMapper = new IdentityPlusMapper();
            String fontFamily = "SimSun";

            // Register simsun.ttf from the classpath so that Chinese text can be rendered
            Resource fontResource = new ClassPathResource("simsun.ttf");
            String path = fontResource.getFile().getAbsolutePath();
            URL fontUrl = new URL("file:" + path);
            PhysicalFonts.addPhysicalFont(fontUrl);

            PhysicalFont simsunFont = PhysicalFonts.get(fontFamily);
            fontMapper.put(fontFamily, simsunFont);

            // Set the document's default font
            RFonts rfonts = Context.getWmlObjectFactory().createRFonts();
            rfonts.setAsciiTheme(null);
            rfonts.setAscii(fontFamily);
            wordMLPackage.getMainDocumentPart().getPropertyResolver().getDocumentDefaultRPr().setRFonts(rfonts);
            wordMLPackage.setFontMapper(fontMapper);

            FOSettings foSettings = Docx4J.createFOSettings();
            foSettings.setWmlPackage(wordMLPackage);
            Docx4J.toFO(foSettings, outStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            IOUtils.closeQuietly(outStream);
        }
        /*
         * InputStream iStream = inStream;
         *
         * String regex = null;
         * // Windows:
         * // String regex = ".*(calibri|camb|cour|arial|symb|times|Times|zapf).*";
         * regex = ".*(calibri|camb|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";
         * // Mac:
         * // String regex = ".*(Courier New|Arial|Times New Roman|Comic Sans|Georgia|Impact|Lucida Console|Lucida Sans Unicode|Palatino Linotype|Tahoma|Trebuchet|Verdana|Symbol|Webdings|Wingdings|MS Sans Serif|MS Serif).*";
         * PhysicalFonts.setRegex(regex);
         * WordprocessingMLPackage wordMLPackage = getMLPackage(iStream);
         * // WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(iStream)
         * FieldUpdater updater = new FieldUpdater(wordMLPackage);
         * updater.update(true);
         * // process
         * processing();
         * // Add font
         *
         * Mapper fontMapper = new IdentityPlusMapper();
         *
         * PhysicalFont font = PhysicalFonts.get("Arial UTF-8 MS");
         * if (font != null) {
         *     fontMapper.put("Times New Roman", font);
         *     fontMapper.put("Arial", font);
         *     fontMapper.put("Calibri", font);
         * }
         * fontMapper.put("Calibri", PhysicalFonts.get("Calibri"));
         * fontMapper.put("Algerian", font);
         * fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
         * fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
         * fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
         * fontMapper.put("Libian SC Regular", PhysicalFonts.get("SimSun"));
         * wordMLPackage.setFontMapper(fontMapper);
         * FOSettings foSettings = Docx4J.createFOSettings();
         * foSettings.setFoDumpFile(new java.io.File("E:/xi.fo"));
         * foSettings.setWmlPackage(wordMLPackage);
         * // Docx4J.toPDF(wordMLPackage, outStream);
         * Docx4J.toFO(foSettings, outStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
         */
        finished();
    }

    protected WordprocessingMLPackage getMLPackage(InputStream iStream) throws Exception {
        // Silence stdout while Doc.convert runs
        PrintStream originalStdout = System.out;
        System.setOut(new PrintStream(new OutputStream() {
            public void write(int b) {
                // DO NOTHING
            }
        }));
        try {
            return Doc.convert(iStream);
        } finally {
            // Restore stdout so that later output is not swallowed
            System.setOut(originalStdout);
        }
    }
}

DocxToPDFConverter:

import java.awt.Color;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.pdf.PdfConverter;
import org.apache.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;

import com.lowagie.text.Font;
import com.lowagie.text.pdf.BaseFont;

import fr.opensagres.xdocreport.itext.extension.font.ITextFontRegistry;

public class DocxToPDFConverter extends Converter {

    public DocxToPDFConverter(InputStream inStream, OutputStream outStream, boolean showMessages,
            boolean closeStreamsWhenComplete) {
        super(inStream, outStream, showMessages, closeStreamsWhenComplete);
    }

    @Override
    public void convert() throws Exception {
        loading();

        PdfOptions options = PdfOptions.create();
        XWPFDocument document = new XWPFDocument(inStream);

        // Support Chinese fonts: render every font request with simsun.ttf from the classpath
        options.fontProvider(new ITextFontRegistry() {
            public Font getFont(String familyName, String encoding, float size, int style, Color color) {
                try {
                    Resource fontResource = new ClassPathResource("simsun.ttf");
                    String path = fontResource.getFile().getAbsolutePath();

                    BaseFont bfChinese = BaseFont.createFont(path, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
                    Font fontChinese = new Font(bfChinese, size, style, color);
                    if (familyName != null)
                        fontChinese.setFamily(familyName);
                    return fontChinese;
                } catch (Throwable e) {
                    // Fall back to the default registry if the font cannot be loaded
                    e.printStackTrace();
                    return ITextFontRegistry.getRegistry().getFont(familyName, encoding, size, style, color);
                }
            }
        });

        processing();
        PdfConverter.getInstance().convert(document, outStream, options);

        finished();
    }
}

Calling code (the main logic)

// Fragment from a web request handler: request, attachmentEntity and OSSClientUtil
// belong to the surrounding project; path, inputStream and fileInputStream are declared elsewhere.
Converter converter;

path = request.getSession().getServletContext().getRealPath("").replaceAll("\\\\", "/") + "/flyingsauser/preview.pdf";
File file = new File(path);
OutputStream outputStream = new FileOutputStream(file);
String url = attachmentEntity.getUrl();
inputStream = OSSClientUtil.getFileObject(url);

if (!file.exists()) {
    file.createNewFile();
}

// Pick the converter from the file extension
if (url.endsWith(".docx")) {
    converter = new DocxToPDFConverter(inputStream, outputStream, true, true);
    converter.convert();
    fileInputStream = new FileInputStream(file);
} else if (url.endsWith(".doc")) {
    converter = new DocToPDFConverter(inputStream, outputStream, true, true);
    converter.convert();
    fileInputStream = new FileInputStream(file);
}
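The calling code picks a converter with `url.endsWith(...)`, which is case-sensitive and would miss names such as `REPORT.DOCX` or URLs that carry a query string. The following is a small sketch of a more tolerant check; `WordFormats`, `WordFormat`, and `detect` are hypothetical names introduced for illustration and are not part of the original project.

```java
import java.util.Locale;

final class WordFormats {

    enum WordFormat { DOC, DOCX, UNSUPPORTED }

    // Classifies a file name or URL by extension, ignoring case,
    // query strings ("?...") and fragments ("#...").
    static WordFormat detect(String url) {
        if (url == null) {
            return WordFormat.UNSUPPORTED;
        }
        String path = url;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        int fragment = path.indexOf('#');
        if (fragment >= 0) {
            path = path.substring(0, fragment);
        }
        String lower = path.toLowerCase(Locale.ROOT);
        // Check ".docx" first for readability; ".doc" cannot match a ".docx" suffix anyway.
        if (lower.endsWith(".docx")) {
            return WordFormat.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return WordFormat.DOC;
        }
        return WordFormat.UNSUPPORTED;
    }
}
```

With a helper like this, the branch above could switch on `WordFormats.detect(url)` and handle the `UNSUPPORTED` case explicitly instead of silently doing nothing.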

That completes the Word-to-PDF implementation. It includes support for Chinese text, which requires adding simsun.ttf to the classpath.
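One caveat about the font lookup: both converters call `getFile()` on a `ClassPathResource`, and Spring's `Resource.getFile()` only works when the resource is a real file on disk. When the application is packaged as a JAR, it throws a `FileNotFoundException`. A possible workaround, sketched below with plain JDK calls, is to copy the font to a temporary file and pass that path on. `FontFiles` and its method names are hypothetical helpers, not part of the original project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class FontFiles {

    // Copies the stream into a temporary file and returns its path.
    // The stream is closed afterwards; the file is removed when the JVM exits.
    static Path copyToTempFile(InputStream fontStream, String suffix) throws IOException {
        Path temp = Files.createTempFile("font-", suffix);
        temp.toFile().deleteOnExit();
        try (InputStream in = fontStream) {
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp;
    }

    // Looks the font up on the classpath (works inside a JAR as well)
    // and materialises it as a regular file.
    static Path extractClasspathFont(String resourceName) throws IOException {
        InputStream in = FontFiles.class.getClassLoader().getResourceAsStream(resourceName);
        if (in == null) {
            throw new IOException("Font not found on classpath: " + resourceName);
        }
        return copyToTempFile(in, ".ttf");
    }
}
```

The path returned by `FontFiles.extractClasspathFont("simsun.ttf")` could then replace `fontResource.getFile().getAbsolutePath()` in the converters. Extracting the font once and caching the path would avoid repeating the copy on every font request.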

The implementation is based on the code in the following GitHub project:

https://github.com/yeokm1/docs-to-pdf-converter
