Displaying Word documents with command-line tools on Linux

·Antiword

·Catdoc

·wvWare

Viewing Word files at the command line

Wednesday March 01, 2006 (09:01 AM GMT)

By: Scott Nesbitt

As a Linux user, you sometimes have to play nicely with users of Windows or Mac OS, such as when they send you Microsoft Word files. When you receive a Word file, you can either follow Richard Stallman's advice and ask the sender for a more open format, or bite the bullet and work with it. Modern Linux word processors can deal with most Word files. But if you don't want to fire up a word processor just to read or print the document, you can turn to the command line. A handful of small but powerful Linux command line utilities make viewing, printing, and even converting Word files to another format a breeze.

Antiword

Antiword is a nifty application that can convert Word documents to plain text, PostScript, and PDF. It can also output DocBook XML, but according to the developer that conversion is still experimental and doesn't always work well.

Antiword is very flexible. It can read and convert files created with Word versions 2.0 to 2003, and you can run it on multiple operating systems, including Linux, Mac OS X, RISC OS, FreeBSD, and OpenVMS. On top of that, you can set the paper size for documents converted to PostScript or PDF, include any text that was removed from the file (but which Word notoriously keeps a record of), and display any hidden text.

For the most part, you'll just want to view a Word document. To do that, you just have to type the following command:

antiword file.doc

The Word document will be converted to text and printed to the screen. If you're running Antiword in a terminal window, you'll have to scroll up to view the full text of the document. To get around this, you can pipe the output from Antiword to the less utility, which will allow you to scroll through the document page by page from the top:

antiword file.doc | less
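The conversion options mentioned earlier can be sketched as a small shell function. This is only a sketch, not something Antiword ships: the name doc2print and its argument order are made up for illustration. It relies on Antiword's -a (PDF) and -p (PostScript) options, each of which takes a paper size such as a4 or letter.

```shell
#!/bin/sh
# doc2print: hypothetical wrapper around Antiword (the function name and
# argument order are illustrative, not something Antiword provides).
# Usage: doc2print pdf|ps PAPERSIZE FILE.doc
doc2print() {
    case "$1" in
        pdf) flag=-a ;;   # antiword -a PAPERSIZE writes PDF
        ps)  flag=-p ;;   # antiword -p PAPERSIZE writes PostScript
        *)   echo "usage: doc2print pdf|ps papersize file.doc" >&2
             return 2 ;;
    esac
    # Write the result next to the source, e.g. report.doc -> report.pdf
    antiword "$flag" "$2" "$3" > "${3%.doc}.$1"
}
```

For example, doc2print pdf a4 report.doc would write report.pdf. To see hidden or removed text while viewing rather than converting, add Antiword's -s or -r option to the plain antiword command.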

Catdoc

Slightly less flexible than Antiword, but still useful, is Catdoc, whose developer explains that "it does same work for .doc files as the Unix cat command for plain ASCII files."

While Antiword tries to retain some of the formatting of a Word file, Catdoc is a quick and dirty tool. It outputs either LaTeX or plain text, and little else. The LaTeX output leaves a lot to be desired -- it does nothing beyond adding the LaTeX formatting for tables or special characters. You'll have to add the LaTeX preamble and any other formatting code yourself.

Catdoc has some rudimentary support for tables. If it's converting a simple table, the output will be passable. If the table is more complex, say with nested elements, it won't be pretty.

To run Catdoc, type the following command:

catdoc filename.doc

You can specify the output format using the -a (text) or -t (LaTeX) option. So, to convert the Word file whitepaper.doc to text, type:

catdoc -a whitepaper.doc

As with Antiword, you can pipe the output from Catdoc to the less utility.
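Because Catdoc's LaTeX output comes without a preamble, one way to get a file that compiles is to wrap the output yourself. The function below is a minimal sketch under that assumption; the name doc2latex and the choice of the article document class are illustrative and not part of Catdoc.

```shell
#!/bin/sh
# doc2latex: hypothetical helper (not part of Catdoc) that wraps the
# output of `catdoc -t` in a minimal LaTeX preamble and closing line.
# Usage: doc2latex FILE.doc > FILE.tex
doc2latex() {
    # %s prints the strings literally, so the backslashes survive
    printf '%s\n' '\documentclass{article}' '\begin{document}'
    # -t asks Catdoc for LaTeX-escaped text
    catdoc -t "$1" || return 1
    printf '%s\n' '\end{document}'
}
```

Running doc2latex whitepaper.doc > whitepaper.tex would give you a starting point; any other formatting code still has to be added by hand.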

wvWare

wvWare is part of wv, a library that enables developers to write software that can read and write Word files. In fact, both AbiWord and KWord use wv for importing Word documents. wvWare can handle documents created with Word versions 6 through 2000. Word 2.0 documents are converted to text only.

Used by itself without any command line options, wvWare will convert a Word document to HTML and display the code on the screen. If you want to write the HTML to a file, use the following command:

wvWare file.doc > file.html

But you're not stuck with HTML. wvWare comes with a set of scripts that can convert Word files to a number of other formats, including plain text, HTML, LaTeX, PDF, PostScript, LaTeX DVI, and WML. These scripts are usually installed in the folder /usr/bin. You can get a list of them by typing ls /usr/bin/wv* at the command line.

If you want to convert a Word document to text, use the following command:

wvText file.doc file.txt

I've never been able to pipe the output to the less utility or a text editor. I've always had to open a file converted with wvWare in an editor or browser.
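A workaround is to let wvText write to a temporary file and open that file in a pager. The following is a rough sketch rather than a feature of wv: the function name wvpage is made up, and it assumes that mktemp is available and that wvText exits with a non-zero status when the conversion fails.

```shell
#!/bin/sh
# wvpage: hypothetical helper (not part of wv) that converts a Word file
# with wvText into a temporary file, pages through it, then cleans up.
# Usage: wvpage FILE.doc
wvpage() {
    tmp=$(mktemp) || return 1
    # $PAGER is left unquoted on purpose so a value like "less -R" works;
    # less is the fallback when PAGER is unset.
    wvText "$1" "$tmp" && ${PAGER:-less} "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

With this in your shell startup file, wvpage file.doc behaves much like the antiword file.doc | less pipeline shown earlier.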

You can also view Word files using wvWare and the text-mode Web browser w3m, a hack that has been described in book form. I've tried this hack with two other text-based browsers as well, but w3m does the best job of the three.

To use this hack, type the following command:

wvWare -x /usr/lib/wv/wvHtml.xml file.doc | w3m -T text/html

You can also encapsulate the above command in a script if you decide to use this hack regularly. wvWare converts the Word file to HTML using the configuration file named wvHtml.xml, then pipes the output to the w3m browser.
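Such a script could look like the sketch below. The function name viewdoc and the WV_HTML_CONFIG override variable are made up for illustration; the default configuration path is the one used in the command above and may differ on your system.

```shell
#!/bin/sh
# viewdoc: hypothetical wrapper for the wvWare-plus-w3m hack.
# WV_HTML_CONFIG is an illustrative override variable, not a wv setting.
# Usage: viewdoc FILE.doc
viewdoc() {
    if [ $# -ne 1 ]; then
        echo "usage: viewdoc file.doc" >&2
        return 2
    fi
    # Convert to HTML with wvHtml.xml, then render the HTML in w3m
    wvWare -x "${WV_HTML_CONFIG:-/usr/lib/wv/wvHtml.xml}" "$1" |
        w3m -T text/html
}
```

Keeping the configuration path in a variable means the function still works if your distribution installs wvHtml.xml somewhere else.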

A gotcha or two

While Antiword, Catdoc, and wvWare do a good job of handling most Word files, you might run into documents that don't want to cooperate. I've found that these utilities sometimes can't process documents saved with Word's Fast Save feature, which saves a file quickly by tacking any changes onto the end of the file. For example, Antiword might display the cryptic message "The Small Block Depot is damaged" when it encounters a Fast Saved file. This doesn't happen with all Fast Saved files, however.
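One way to cope with an uncooperative file is to try more than one tool in turn. The sketch below does that for Antiword and Catdoc. The function name doc2text is made up, and the approach rests on two assumptions that you should check against your installed versions: that a tool exits with a non-zero status when it can't read a file, and that empty output also means failure.

```shell
#!/bin/sh
# doc2text: hypothetical fallback chain (not part of any of the tools).
# Tries each converter in order and prints the first usable result.
# Usage: doc2text FILE.doc
doc2text() {
    for tool in antiword catdoc; do
        # Skip tools that are not installed
        command -v "$tool" >/dev/null 2>&1 || continue
        # Assumption: a failed conversion gives a non-zero exit status
        # or no output at all
        if out=$("$tool" "$1" 2>/dev/null) && [ -n "$out" ]; then
            printf '%s\n' "$out"
            return 0
        fi
    done
    echo "doc2text: no tool could read $1" >&2
    return 1
}
```

wvText is left out of the chain because it writes to a file instead of the screen, but you could add it as a last resort in the same way.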

Also, out of the box, these programs might display garbage characters when converting Word files that use non-Latin character sets or that contain graphics. Check the documentation for the program that you're using for information on how to deal with character sets and graphics.

You don't need a word processor to view Microsoft Word documents on Linux. With the right command line apps, you can view or print those files in a flash with just a few keystrokes.

Scott Nesbitt is a Toronto-based technical writer and journalist who is a big fan of useful little command-line utilities.
