C#: Traversing Files, Reading Word Content, and Building a Smooth Progress Bar with BackgroundWorker
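Let's walk through the implementation step by step.
The first step is to read a given directory and traverse the files in it. This is simple: a foreach loop over the files returned by a DirectoryInfo object is all we need. In this example we define a folder browser dialog. When the user clicks the Select button, the dialog opens so the user can choose the directory to traverse, and the chosen path is stored in a text box. Clicking the Go button starts the traversal of all files in that folder. The program outputs the names of the Word files in the directory, one per line, without their extensions.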
    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;
    using System.IO;

    namespace ListFileName
    {
        public partial class ListFileName : Form
        {
            protected DirectoryInfo dirFolder = null;

            public ListFileName()
            {
                InitializeComponent();
            }

            private void SelectPath()
            {
                if (folderBrowserDialog.ShowDialog() == DialogResult.OK)
                {
                    this.tbPath.Text = folderBrowserDialog.SelectedPath;
                }
            }

            // Double-click the path text box to open the folder browser dialog.
            private void tbPath_DoubleClick(object sender, EventArgs e)
            {
                SelectPath();
            }

            // Open the folder browser dialog.
            private void btSelect_Click(object sender, EventArgs e)
            {
                SelectPath();
            }

            // Start the run.
            private void btGo_Click(object sender, EventArgs e)
            {
                string folderPath = this.tbPath.Text.Trim();

                // Validate first, so that an early return cannot leave the controls disabled.
                if (folderPath.Length == 0 || !Directory.Exists(folderPath))
                {
                    MessageBox.Show("Please select a valid folder.", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
                    return;
                }

                this.btGo.Enabled = this.btSelect.Enabled = this.tbPath.Enabled = false;
                this.tbOutput.Text = "";
                dirFolder = new DirectoryInfo(folderPath);

                // Traverse the files in the folder.
                foreach (FileInfo file in dirFolder.GetFiles())
                {
                    // Keep Word files and skip Word's "~$" temporary files.
                    if (file.Extension.IndexOf("doc") > 0 && file.Name.IndexOf("~$") < 0)
                    {
                        this.tbOutput.Text += file.Name.Substring(0, file.Name.LastIndexOf('.')).Trim() + "\r\n";
                    }
                }

                // Drop the trailing line break.
                if (this.tbOutput.Text.Length > 2)
                {
                    this.tbOutput.Text = this.tbOutput.Text.Substring(0, this.tbOutput.Text.Length - 2);
                }

                // Re-enable the controls now that the run has finished.
                this.btGo.Enabled = this.btSelect.Enabled = this.tbPath.Enabled = true;
            }
        }
    }
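The extension test in the listing above, file.Extension.IndexOf("doc") > 0, is case-sensitive and accepts any extension that contains "doc" after its first character. A stricter filter can be sketched as follows. The class and method names (WordFileFilter, IsWordDocument, ListDocumentNames) are hypothetical and not part of the original program, and limiting the match to .doc and .docx is an assumption about which files are wanted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original program.
public static class WordFileFilter
{
    // True for *.doc / *.docx names (any letter case) that are not
    // Word's "~$" temporary lock files.
    public static bool IsWordDocument(string fileName)
    {
        string ext = Path.GetExtension(fileName);
        bool isWord = ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
                   || ext.Equals(".docx", StringComparison.OrdinalIgnoreCase);
        bool isTemp = Path.GetFileName(fileName).StartsWith("~$", StringComparison.Ordinal);
        return isWord && !isTemp;
    }

    // Returns the names of the Word documents, without extensions,
    // in the order they were supplied.
    public static List<string> ListDocumentNames(IEnumerable<string> fileNames)
    {
        var names = new List<string>();
        foreach (string fileName in fileNames)
        {
            if (IsWordDocument(fileName))
            {
                names.Add(Path.GetFileNameWithoutExtension(fileName).Trim());
            }
        }
        return names;
    }
}
```

Inside the form, the loop body could then call WordFileFilter.IsWordDocument(file.Name) in place of the two IndexOf tests. Because the helper works on plain strings, it can be checked without touching the file system.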
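The next step is to add the ability to read Word files. This requires a COM component, so add a reference to the Word COM component to the project.
The program reads only from the first three paragraphs of each Word file (the code takes the second and third). Here is the code.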
    using Word = Microsoft.Office.Interop.Word;

    protected Word.Application app = null;
    protected Word.Document doc = null;

    private string ReadTextFromWord(object fileName)
    {
        string sRst = string.Empty;
        object isReadOnly = true;
        object unknow = Type.Missing;

        if (app != null)
        {
            // Keep the Word window hidden. This must come after the null check.
            app.Visible = false;

            try
            {
                // Open the Word document in read-only mode.
                doc = app.Documents.Open(ref fileName,
                    ref unknow, ref isReadOnly, ref unknow, ref unknow, ref unknow,
                    ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
                    ref unknow, ref unknow, ref unknow, ref unknow, ref unknow);

                // Read the second paragraph: the product's English name.
                sRst = (doc.Paragraphs.Count > 1 ? doc.Paragraphs[2].Range.Text.Trim() : "") + "\t";
                // Read the third paragraph: the product's Chinese name.
                sRst += (doc.Paragraphs.Count > 2 ? doc.Paragraphs[3].Range.Text.Trim() : "") + "\t";
            }
            catch (Exception)
            {
                // Deliberately ignored: a file that cannot be read yields whatever was collected so far.
            }
            finally
            {
                // Close the document only if it was actually opened.
                if (doc != null)
                {
                    doc.Close(ref unknow, ref unknow, ref unknow);
                    doc = null;
                }
            }
        }

        return sRst;
    }
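Note that a Word document can be opened in read-only mode. It opens slightly faster this way, and closing it does not bring up the save-changes prompt. Whenever you open a Word document, always close it when you are done, and make sure the Word process is shut down as well when the program exits.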
    // Quit the Word process when the main form is closed.
    private void ListFileName_FormClosed(object sender, FormClosedEventArgs e)
    {
        if (app != null)
        {
            object unknow = Type.Missing;
            object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
            app.Quit(ref saveChanges, ref unknow, ref unknow);
        }
    }
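Finally, let's look at how to implement the smooth progress bar. This relies on the BackgroundWorker object, which lets the program run work on a separate thread. We can keep the UI apart from the background thread, so a long-running task does not leave the main window waiting: the user can still interact with the controls on the form, and the program feels smooth. BackgroundWorker defines properties and events for synchronizing information between the threads, for example how the background thread tells the UI thread to update the progress bar value while it is running.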
    protected BackgroundWorker worker = new BackgroundWorker();

    worker.WorkerReportsProgress = true;

    worker.DoWork += new DoWorkEventHandler(worker_DoWork);
    worker.ProgressChanged += new ProgressChangedEventHandler(worker_ProgressChanged);
    worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
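worker_DoWork calls the method that does the actual work. worker_ProgressChanged updates the progress bar value in real time. worker_RunWorkerCompleted handles whatever needs to happen once the background thread has finished, such as updating the UI and setting the progress bar to 100%. The complete program follows: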
    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;
    using System.IO;
    using Word = Microsoft.Office.Interop.Word;

    namespace ListFileName
    {
        public partial class ListFileName : Form
        {
            protected BackgroundWorker worker = new BackgroundWorker();
            protected Word.Application app = null;
            protected Word.Document doc = null;

            protected DirectoryInfo dirFolder = null;
            protected int iFileCount = 0;
            protected string output = string.Empty;

            public ListFileName()
            {
                InitializeComponent();
                Application.EnableVisualStyles();

                worker.WorkerReportsProgress = true;

                worker.DoWork += new DoWorkEventHandler(worker_DoWork);
                worker.ProgressChanged += new ProgressChangedEventHandler(worker_ProgressChanged);
                worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
            }

            #region Private methods
            private void SelectPath()
            {
                if (folderBrowserDialog.ShowDialog() == DialogResult.OK)
                {
                    this.tbPath.Text = folderBrowserDialog.SelectedPath;
                }
            }

            private string ReadTextFromWord(object fileName)
            {
                string sRst = string.Empty;
                object isReadOnly = true;
                object unknow = Type.Missing;

                if (app != null)
                {
                    // Keep the Word window hidden. This must come after the null check.
                    app.Visible = false;

                    try
                    {
                        // Open the Word document in read-only mode.
                        doc = app.Documents.Open(ref fileName,
                            ref unknow, ref isReadOnly, ref unknow, ref unknow, ref unknow,
                            ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
                            ref unknow, ref unknow, ref unknow, ref unknow, ref unknow);

                        // Read the second paragraph: the product's English name.
                        sRst = (doc.Paragraphs.Count > 1 ? doc.Paragraphs[2].Range.Text.Trim() : "") + "\t";
                        // Read the third paragraph: the product's Chinese name.
                        sRst += (doc.Paragraphs.Count > 2 ? doc.Paragraphs[3].Range.Text.Trim() : "") + "\t";
                    }
                    catch (Exception)
                    {
                        // Deliberately ignored: a file that cannot be read yields whatever was collected so far.
                    }
                    finally
                    {
                        // Close the document only if it was actually opened.
                        if (doc != null)
                        {
                            doc.Close(ref unknow, ref unknow, ref unknow);
                            doc = null;
                        }
                    }
                }

                return sRst;
            }

            // Runs on the background thread.
            private int DoWorkAsync(int start, BackgroundWorker worker, DoWorkEventArgs e)
            {
                // Number of files handled so far (a count, not a percentage).
                int percentComplete = 0;

                // Traverse the files in the folder.
                foreach (FileInfo file in dirFolder.GetFiles())
                {
                    start++;
                    percentComplete++;
                    // Keep Word files and skip Word's "~$" temporary files.
                    if (file.Extension.IndexOf("doc") > 0 && file.Name.IndexOf("~$") < 0)
                    {
                        output += file.Name.Substring(0, file.Name.LastIndexOf('.')).Trim();
                        // Read from the Word file if the option is checked.
                        if (this.chkReadWord.Checked)
                        {
                            output += "\t" + ReadTextFromWord(file.FullName);
                        }
                        output += "\r\n";
                    }

                    // Update the progress bar value. The bar's Maximum equals the file count,
                    // so the count is reported directly (a modulo here would wrap the last file to 0).
                    worker.ReportProgress(percentComplete);
                }

                // Drop the trailing line break.
                if (output.Length > 2)
                {
                    output = output.Substring(0, output.Length - 2);
                }

                return start;
            }
            #endregion

            #region Events

            void worker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
            {
                this.tbPath.Enabled = this.btSelect.Enabled = this.btGo.Enabled = true;
                progressBar.Value = iFileCount;

                if (e.Error != null)
                {
                    this.lblHint.Text = e.Error.Message;
                }
                else
                {
                    this.lblHint.Text = "Complete!";
                    MessageBox.Show("Complete!", "Ok", MessageBoxButtons.OK, MessageBoxIcon.Information);
                }
            }

            // Update the progress bar value.
            void worker_ProgressChanged(object sender, ProgressChangedEventArgs e)
            {
                progressBar.Value = e.ProgressPercentage;
                this.tbOutput.Text = output;
            }

            void worker_DoWork(object sender, DoWorkEventArgs e)
            {
                int start = (int)e.Argument;
                e.Result = DoWorkAsync(start, (BackgroundWorker)sender, e);
            }

            // Double-click the path text box to open the folder browser dialog.
            private void tbPath_DoubleClick(object sender, EventArgs e)
            {
                SelectPath();
            }

            // Open the folder browser dialog.
            private void btSelect_Click(object sender, EventArgs e)
            {
                SelectPath();
            }

            // Start the run.
            private void btGo_Click(object sender, EventArgs e)
            {
                string folderPath = this.tbPath.Text.Trim();

                // Validate first, so that an early return cannot leave the controls disabled.
                if (folderPath.Length == 0 || !Directory.Exists(folderPath))
                {
                    MessageBox.Show("Please select a valid folder.", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
                    return;
                }

                this.btGo.Enabled = this.btSelect.Enabled = this.tbPath.Enabled = false;

                if (app == null && chkReadWord.Checked)
                {
                    app = new Microsoft.Office.Interop.Word.Application();
                }
                this.tbOutput.Text = output = "";

                dirFolder = new DirectoryInfo(folderPath);
                this.progressBar.Maximum = iFileCount = dirFolder.GetFiles().Count();
                this.progressBar.Value = 0;
                this.lblHint.Text = "Please wait, processing...";
                this.lblHint.Visible = progressBar.Visible = true;
                worker.RunWorkerAsync(0);
            }

            // Ctrl+A selects all text in the output text box.
            private void tbOutput_KeyDown(object sender, KeyEventArgs e)
            {
                if (e.Control && e.KeyCode == Keys.A)
                {
                    this.tbOutput.SelectAll();
                }
            }

            // Quit the Word process when the main form is closed.
            private void ListFileName_FormClosed(object sender, FormClosedEventArgs e)
            {
                if (app != null)
                {
                    object unknow = Type.Missing;
                    object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
                    app.Quit(ref saveChanges, ref unknow, ref unknow);
                }
            }
            #endregion
        }
    }
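The listing above sets progressBar.Maximum to the number of files and reports a running file count. An alternative, sketched below, is to leave the progress bar at its default range of 0 to 100 and report a true percentage, which is what the ProgressPercentage property name suggests. The names ProgressSketch, ToPercent, ProcessAll and Demo are hypothetical and do not appear in the original program, and the callback-based shape is only one possible way to keep the loop independent of the form.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original program.
public static class ProgressSketch
{
    // Converts "done out of total" into a whole-number percentage in [0, 100].
    public static int ToPercent(int done, int total)
    {
        if (total <= 0) return 100;   // nothing to do counts as finished
        if (done <= 0) return 0;
        if (done >= total) return 100;
        return (int)((long)done * 100 / total);
    }

    // Processes every item and reports the percentage after each one.
    // The work and the reporting are passed in as callbacks, so this loop
    // does not need to know about the form or the BackgroundWorker.
    public static void ProcessAll(IList<string> items,
                                  Action<string> processItem,
                                  Action<int> reportProgress)
    {
        for (int i = 0; i < items.Count; i++)
        {
            processItem(items[i]);
            reportProgress(ToPercent(i + 1, items.Count));
        }
    }

    // Small console demonstration with four made-up file names.
    public static void Demo()
    {
        var files = new List<string> { "a.doc", "b.docx", "c.doc", "d.docx" };
        ProcessAll(files, name => { }, percent => Console.WriteLine(percent + "%"));
        // Prints 25%, 50%, 75%, 100% on separate lines.
    }
}
```

In the form, a DoWork handler could pass worker.ReportProgress as the reportProgress callback, and worker_ProgressChanged could keep assigning e.ProgressPercentage to progressBar.Value without changing progressBar.Maximum.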