Working with Word documents in Python (python-docx)

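#########################.docx files############################

'''

A .docx file has a lot of structure, which is represented by three different types of object.

At the top level, a Document object represents the whole document.

The Document object holds a list of Paragraph objects, one for each paragraph in the document; a new paragraph begins wherever the Enter key was pressed.

Each Paragraph object holds a list of Run objects. A run is a contiguous stretch of text with the same formatting.

'''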

#########################Reading a Word document########################

import docx

doc=docx.Document(r'C:\Users\shenlu\Desktop\demo.docx')

len(doc.paragraphs)

doc.paragraphs[0].text

doc.paragraphs[1].text

len(doc.paragraphs[1].runs)

doc.paragraphs[1].runs[0].text

doc.paragraphs[1].runs[1].text

doc.paragraphs[1].runs[2].text

doc.paragraphs[1].runs[3].text

########################Getting the full text of a .docx file########################

import docx

def getText(filename):
    """Return the text of every paragraph in the .docx file, one per line."""
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        # To indent each paragraph instead, use: fullText.append(' ' + para.text)
        fullText.append(para.text)
    # To put a blank line between paragraphs, use: return '\n\n'.join(fullText)
    return '\n'.join(fullText)

########################Calling getText() from another script (assumes the function above is saved as readDocx.py)########################

import readDocx

print(readDocx.getText('demo.docx'))

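########################Styling Paragraph and Run objects########################

'''

Word documents have three types of style:

Paragraph styles can be applied to Paragraph objects, and character styles can be applied to Run objects.

Linked styles can be applied to both kinds of object.

The strings for the default Word styles are listed below. They are written as the style names appear in Word, with spaces ('Heading 1', not 'Heading1'), which matches the 'Heading 1' and 'No Spacing' names used later in these notes.

'Normal' 'Body Text' 'Body Text 2' 'Body Text 3' 'Caption' 'Heading 1' 'Heading 2' 'Heading 3' 'Heading 4'

'Heading 5' 'Heading 6' 'Heading 7' 'Heading 8' 'Heading 9' 'Intense Quote' 'List' 'List 2' 'List 3'

'List Bullet' 'List Bullet 2' 'List Bullet 3' 'List Continue' 'List Continue 2' 'List Continue 3' 'List Number' 'List Number 2' 'List Number 3'

'List Paragraph' 'Macro Text' 'No Spacing' 'Quote' 'Subtitle' 'TOC Heading' 'Title'

'''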

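'''

Text attributes of Run objects

Attribute        Description

bold             The text appears in bold

italic           The text appears in italic

underline        The text is underlined

strike           The text appears with strikethrough

double_strike    The text appears with double strikethrough

all_caps         The text appears in capital letters

small_caps       The text appears in capital letters, with lowercase letters two points smaller

shadow           The text appears with a shadow

outline          The text appears outlined rather than solid

rtl              The text is written right to left

imprint          The text appears pressed into the page

emboss           The text appears raised off the page

'''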

########################################################################

import docx

doc=docx.Document(r'C:\Users\shenlu\Desktop\demo.docx')

doc.paragraphs[0].text

doc.paragraphs[0].style

doc.save(r'C:\Users\shenlu\Desktop\demo.docx')

doc.paragraphs[0].style='Heading 1'

doc.paragraphs[1].style    # in the interactive shell this showed: _ParagraphStyle('No Spacing') id: 124515664

doc.paragraphs[1].text

(doc.paragraphs[1].runs[0].text,doc.paragraphs[1].runs[1].text,doc.paragraphs[1].runs[2].text,doc.paragraphs[1].runs[3].text)

doc.paragraphs[1].runs[1].underline=True

doc.paragraphs[1].runs[3].underline=True

doc.save(r'C:\Users\shenlu\Desktop\demo.docx')

#################################Writing a Word document################################

import docx

doc=docx.Document()

doc.add_heading('Header 0',0)

doc.add_heading('Header 1',1)

doc.add_heading('Header 2',2)

doc.add_heading('Header 3',3)

doc.add_heading('Header 4',4)

doc.add_paragraph('Hello world!')

doc.add_picture(r'C:\Users\shenlu\Desktop\DSCN0859.jpg',width=docx.shared.Inches(1),height=docx.shared.Cm(4))

paraObj1=doc.add_paragraph('This is a second paragraph.')

paraObj2=doc.add_paragraph('This is yet another paragraph.')

# The leading space keeps the new run from running into the paragraph's existing text.
paraObj1.add_run(' This text is being added to the second paragraph.')

doc.save(r'C:\Users\shenlu\Desktop\helloworld.docx')

#################################Adding line breaks and page breaks################################

import docx

from docx.enum.text import WD_BREAK

doc=docx.Document()

doc.add_paragraph('This is on the first page!')

#doc.paragraphs[0].runs[0].add_break() ### line break

doc.paragraphs[0].runs[0].add_break(WD_BREAK.PAGE) ### page break

doc.add_paragraph('This is on the second page!')

doc.save(r'C:\Users\shenlu\Desktop\twoPage.docx')

#################################Custom invitations################################

import docx,os

desktop=r'C:\Users\shenlu\Desktop'

# Read the guest list in text mode, one name per line.
with open(os.path.join(desktop,'guests.txt')) as txtcontent:
    lines=txtcontent.readlines()

for line in lines:
    line=line.strip()    # drop the trailing line ending
    if not line:
        continue         # skip blank lines
    print(line)
    # One invitation document per guest, named after the guest.
    doc=docx.Document()
    doc.add_paragraph('It would be a pleasure to have the company of')
    doc.add_paragraph(line)
    doc.add_paragraph('at 11010 Memory Lane on the evening of')
    doc.add_paragraph('April 1st')
    doc.add_paragraph("at 7 o'clock")
    doc.save(os.path.join(desktop,line+'.docx'))

#############################################################################

http://nostarch.com/automatestuff/
