pdf2htmlEX

Download a prebuilt executable for Windows:
http://soft.rubypdf.com/software/pdf2htmlex-windows-version

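Usage:

  1. Put the PDF file you want to convert into the folder where pdf2htmlEX was extracted.

  2. Open a Command Prompt and change to that folder (d:\pdfex in this example):

d:
cd d:\pdfex

  3. Run pdf2htmlex from the command line to do the conversion:
pdf2htmlex --zoom 1.8 abc.pdf

  4. When it finishes, an HTML file with the same name as the PDF is generated in the same folder.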
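
The steps above can also be scripted. The following is a minimal Python sketch, not part of pdf2htmlEX itself: the helper names `build_command` and `convert` are made up for illustration, and it assumes the pdf2htmlEX executable can be found on PATH (or in the current folder) and that the output goes next to the PDF with an .html extension, as described above.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, zoom=1.8, exe="pdf2htmlEX"):
    """Return the argument list for the same call as in the steps above."""
    return [exe, "--zoom", str(zoom), str(pdf_path)]


def convert(pdf_path, zoom=1.8, exe="pdf2htmlEX"):
    """Run the conversion and return the path where the HTML is expected."""
    pdf = Path(pdf_path)
    if shutil.which(exe) is None:
        # Assumption: the executable is reachable via PATH.
        raise FileNotFoundError(f"{exe} was not found on PATH")
    # Run inside the PDF's folder so the HTML lands next to the PDF.
    subprocess.run(build_command(pdf.name, zoom, exe), cwd=pdf.parent, check=True)
    return pdf.with_suffix(".html")
```

For example, `convert(r"d:\pdfex\abc.pdf")` would run `pdf2htmlEX --zoom 1.8 abc.pdf` inside d:\pdfex.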

Parameter notes
--zoom: zoom ratio. The conversion follows the PDF file's default settings; if the result is uncomfortable to read, adjust --zoom to scale the text.
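
The option reference further down says that --zoom, --fit-width and --fit-height can be combined, that the minimum one is used, and that pages are rendered at 72 DPI when none is given. The Python sketch below illustrates that rule. It is only an illustration: the function name `effective_zoom` is hypothetical, and turning a fit width or height into a ratio by dividing by the page size at 72 DPI is an assumption, since the reference text only states the minimum rule.

```python
def effective_zoom(page_width, page_height, zoom=None, fit_width=None, fit_height=None):
    """Pick the zoom ratio for a page.

    page_width and page_height are the page size in pixels at 72 DPI
    (that is, at zoom 1.0).
    """
    candidates = []
    if zoom:
        candidates.append(zoom)
    if fit_width:
        # Assumption: a maximum width maps to the ratio fit_width / page_width.
        candidates.append(fit_width / page_width)
    if fit_height:
        # Assumption: a maximum height maps to the ratio fit_height / page_height.
        candidates.append(fit_height / page_height)
    # Documented rule: the minimum wins; with no option, render at 72 DPI (ratio 1.0).
    return min(candidates) if candidates else 1.0
```

With a 600 x 800 page, `--zoom 1.8 --fit-width 900` would give 1.5 under these assumptions, because 900 / 600 is smaller than 1.8.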

More options: https://github.com/coolwanglu/pdf2htmlEX/wiki/Command-Line-Options
Project on GitHub: https://github.com/coolwanglu/pdf2htmlEX

OPTIONS

Pages

  -f, --first-page <num> (Default: 1)
      Specify the first page to process

  -l, --last-page <num> (Default: last page)
      Specify the last page to process

Dimensions

  --zoom <ratio>, --fit-width <width>, --fit-height <height>
      --zoom specifies the zoom factor directly; --fit-width/height specifies the maximum width/height of a page, the values are in pixels.
      If multiple values are specified, the minimum one will be used.
      If none is specified, pages will be rendered as 72DPI.

  --use-cropbox <0|1> (Default: 1)
      Use CropBox instead of MediaBox for output.

  --hdpi <dpi>, --vdpi <dpi> (Default: 144)
      Specify the horizontal and vertical DPI for images

Output

  --embed <string>
  --embed-css <0|1> (Default: 1)
  --embed-font <0|1> (Default: 1)
  --embed-image <0|1> (Default: 1)
  --embed-javascript <0|1> (Default: 1)
  --embed-outline <0|1> (Default: 1)
      Specify which elements should be embedded into the output HTML file.
      If switched off, separated files will be generated along with the HTML file for the corresponding elements.
      --embed accepts a string as argument. Each letter of the string must be one of `cCfFiIjJoO`, which corresponds to one of the --embed-*** switches. Lower case letters for 0 and upper case letters for 1. For example, `--embed cFIJo` means to embed everything but CSS files and outlines.

  --split-pages <0|1> (Default: 0)
      If turned on, the content of each page is stored in a separated file.
      This switch is useful if you want pages to be loaded separately & dynamically -- a supporting server might be necessary.
      Also see --page-filename.

  --dest-dir <dir> (Default: .)
      Specify destination folder.

  --css-filename <filename> (Default: <none>)
      Specify the filename of the generated css file, if not embedded.
      If it's empty, the file name will be determined automatically.

  --page-filename <filename> (Default: <none>)
      Specify the filename template for pages when --split-pages is 1.
      A %d placeholder may be included in `filename` to indicate where the page number should be placed. The placeholder supports a limited subset of normal numerical placeholders, including specified width and zero padding.
      If `filename` does not contain a placeholder for the page number, the page number will be inserted directly before the file extension. If the filename does not have an extension, the page number will be placed at the end of the file name.
      If --page-filename is not specified, <input-filename> will be used for the output filename, replacing the extension with .page and adding the page number directly before the extension.

      Examples

      pdf2htmlEX --split-pages 1 foo.pdf
          Yields page files foo1.page, foo2.page, etc.
      pdf2htmlEX --split-pages 1 foo.pdf --page-filename bar.baz
          Yields page files bar1.baz, bar2.baz, etc.
      pdf2htmlEX --split-pages 1 foo.pdf --page-filename page%dbar.baz
          Yields page files page1bar.baz, page2bar.baz, etc.
      pdf2htmlEX --split-pages 1 foo.pdf --page-filename bar%03d.baz
          Yields page files bar001.baz, bar002.baz, etc.

  --outline-filename <filename> (Default: <none>)
      Specify the filename of the generated outline file, if not embedded.
      If it's empty, the file name will be determined automatically.

  --process-nontext <0|1> (Default: 1)
      Whether to process non-text objects (as images)

  --process-outline <0|1> (Default: 1)
      Whether to show outline in the generated HTML

  --printing <0|1> (Default: 1)
      Enable printing support. Disabling this option may reduce the size of CSS.

  --fallback <0|1> (Default: 0)
      Output in fallback mode, for better accuracy and browser compatibility, but the size becomes larger.

  --tmp-file-size-limit <limit> (Default: -1)
      This limits the total size (in KB) of the temporary files which will also limit the total size of the output file. This is an estimate and it will stop after a page, once the total temporary files size is greater than this number.
      -1 means no limit and is the default.

Fonts

  --embed-external-font <0|1> (Default: 1)
      Specify whether the local matched fonts, for fonts not embedded in PDF, should be embedded into HTML.
      If this switch is off, only font names are exported such that web browsers may try to find proper fonts themselves, and that might cause issues about incorrect font metrics.

  --font-format <format> (Default: woff)
      Specify the format of fonts extracted from the PDF file.

  --decompose-ligature <0|1> (Default: 0)
      Decompose ligatures. For example 'fi' -> 'f''i'.

  --auto-hint <0|1> (Default: 0)
      If set to 1, hints will be generated for the fonts using fontforge.
      This may be preceded by --external-hint-tool.

  --external-hint-tool <tool> (Default: <none>)
      If specified, the tool will be called in order to enhanced hinting for fonts, this will precede --auto-hint.
      The tool will be called as '<tool> <in.suffix> <out.suffix>', where suffix will be the same as specified for --font-format.

  --stretch-narrow-glyph <0|1> (Default: 0)
      If set to 1, glyphs narrower than described in PDF will be stretched; otherwise space will be padded to the right of the glyphs

  --squeeze-wide-glyph <0|1> (Default: 1)
      If set to 1, glyphs wider than described in PDF will be squeezed; otherwise it will be truncated.

  --override-fstype <0|1> (Default: 0)
      Clear the fstype bits in TTF/OTF fonts.
      Turn this on if Internet Explorer complains about 'Permission must be Installable' AND you have permission to do so.

  --process-type3 <0|1> (Default: 0)
      If turned on, pdf2htmlEX will try to convert Type 3 fonts such that text can be rendered natively in HTML. Otherwise all text with Type 3 fonts will be rendered as image.
      This feature is highly experimental.

Text

  --heps <len>, --veps <len> (Default: 1)
      Specify the maximum tolerable horizontal/vertical offset (in pixels).
      pdf2htmlEX would try to optimize the generated HTML file moving Text within this distance.

  --space-threshold <ratio> (Default: 0.125)
      pdf2htmlEX would insert a whitespace character ' ' if the distance between two consecutive letters in the same line is wider than ratio * font_size.

  --font-size-multiplier <ratio> (Default: 4.0)
      Many web browsers limit the minimum font size, and many would round the given font size, which results in incorrect rendering.
      Specify a ratio greater than 1 would resolve this issue, however it might freeze some browsers.
      For some versions of Firefox, however, there will be a problem when the font size is too large, in which case a smaller value should be specified here.

  --space-as-offset <0|1> (Default: 0)
      If set to 1, space characters will be treated as offsets, which allows a better optimization.
      For PDF files with bad encodings, turning on this option may cause losing characters.

  --tounicode <-1|0|1> (Default: 0)
      A ToUnicode map may be provided for each font in PDF which indicates the 'meaning' of the characters. However often there is better "ToUnicode" info in Type 0/1 fonts, and sometimes the ToUnicode map provided is wrong.
      If this value is set to 1, the ToUnicode Map is always applied, if provided in PDF, and characters may not render correctly in HTML if there are collisions.
      If set to -1, a customized map is used such that rendering will be correct in HTML (visually the same), but you may not get correct characters by select & copy & paste.
      If set to 0, pdf2htmlEX would try its best to balance the two methods above.

  --optimize-text <0|1> (Default: 0)
      If set to 1, pdf2htmlEX will try to reduce the number of HTML elements used for text. Turn it off if anything goes wrong.

Background Image

  --bg-format <format> (Default: png)
      Specify the background image format. Run `pdf2htmlEX -v` to check all supported formats.

PDF Protection

  -o, --owner-password <password>
      Specify owner password

  -u, --user-password <password>
      Specify user password

  --no-drm <0|1> (Default: 0)
      Override document DRM settings
      Turn this on only when you have permission.

Misc.

  --clean-tmp <0|1> (Default: 1)
      If switched off, intermediate files won't be cleaned in the end.

  --data-dir <dir> (Default: /usr/local/share/pdf2htmlEX)
      Specify the folder holding the manifest and other files (see below for the manifest file)

  --tmp-dir <dir> (Default: /tmp)
      Specify the temporary folder to use for temporary files

  --css-draw <0|1> (Default: 0)
      Experimental and unsupported CSS drawing

  --debug <0|1> (Default: 0)
      Print debug information.

Meta

  -v, --version
      Print copyright and version info

  --help
      Print usage information
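
The --page-filename naming rules listed above are easier to follow with a small model. The Python sketch below re-implements the documented rules for illustration only. It is not pdf2htmlEX code: the function name `page_file_name` is hypothetical, and the exact placeholder subset (here `%d`, `%3d`, `%03d` and similar) is an assumption based on the phrase "specified width and zero padding".

```python
import re
from pathlib import PurePath

# %d with optional zero padding and width, e.g. %d, %3d, %03d (assumed subset).
_PLACEHOLDER = re.compile(r"%0?\d*d")


def page_file_name(page_no, input_pdf, page_filename=None):
    """Model the output file name of one page when --split-pages is 1."""
    if not page_filename:
        # Default: the input file name with its extension replaced by .page.
        page_filename = PurePath(input_pdf).with_suffix(".page").name
    match = _PLACEHOLDER.search(page_filename)
    if match:
        # Fill the placeholder with the page number, honoring width and padding.
        number = match.group(0) % page_no
        return page_filename[:match.start()] + number + page_filename[match.end():]
    stem, dot, ext = page_filename.rpartition(".")
    if not dot:
        # No extension: the page number goes at the end of the name.
        return f"{page_filename}{page_no}"
    # No placeholder: the page number goes directly before the extension.
    return f"{stem}{page_no}.{ext}"
```

Calling it with the templates from the Examples list gives the same names shown there, for instance `page_file_name(1, "foo.pdf", "bar%03d.baz")` gives `bar001.baz`.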
