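Chinese text processing in Python: a simple text-processing method
This article walks through a simple text-processing example in Python, shared here for reference. Details follow:
Because of multithreading, the timestamps printed by a C++ project came out in inconsistent order, which made them hard to tabulate in Excel, so I wrote a short Python script to sort this out. It touches on the following:
1. Reading a txt file and handling UTF-8
2. Basic string operations
3. Basic dict operations
4. Basic list (array) operations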
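Before the full script, points 1 and 2 can be illustrated in isolation. This is a minimal Python 3 sketch, assuming the log is UTF-8 with a leading byte order mark and that the fields on each line are tab-separated; the sample bytes and the helper name split_event are made up for illustration.

```python
import io

# Hypothetical sample: a UTF-8 BOM followed by one tab-separated log line.
raw = b"\xef\xbb\xbfenter OpenNextImage at\t5124\r\n"

def split_event(line):
    """Split 'name<TAB>...<TAB>time' into (name, time); illustrative helper."""
    line = line.strip()
    name = line[:line.find("\t")]            # text before the first tab
    time = int(line[line.rfind("\t") + 1:])  # digits after the last tab
    return name, time

# "utf-8-sig" strips the BOM, so the first line needs no manual slicing.
with io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8-sig") as f:
    first = f.readline()

print(split_event(first))  # ('enter OpenNextImage at', 5124)
```

Taking the name from the first tab and the time from the last tab means a line still parses if more than one tab sits between the two fields.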
#!/usr/bin/env python3
# Average the per-image timing intervals recorded in a tab-separated log.
# Ported to Python 3: print() calls, values() instead of iteritems(), and
# the "utf-8-sig" codec instead of slicing the BOM off the first line.

str_separator = "=================================================================================="

# Event names, in the order they are expected for one image.
timePointName = ["enter OpenNextImage at",                 # 0
                 "enter OpenImage at",                     # 1
                 "In OpenImage send On_ImageRefresh at",   # 2
                 "leave OpenImage at",                     # 3
                 "leave OpenNextImage at",                 # 4
                 "enter LoadImage at",                     # 5
                 "decode began at",                        # 6
                 "enter DrawClient at",                    # 7
                 "leave DrawClient at",                    # 8
                 "decode end at",                          # 9
                 "in LoadImage send On_ImageRefresh at",   # 10
                 "leave loadImage at",                     # 11
                 "second enter DrawClient at",             # 12
                 "second leave DrawClient at"]             # 13

imageTimeSta = {}   # image path -> {event name: timestamp}
dic = {}            # events of the record currently being read
path = ""

# A raw string keeps the backslash in the Windows path literal.
with open(r"F:\log.txt", "r", encoding="utf-8-sig") as fobj:
    for line in fobj:
        line = line.strip()
        if line == str_separator:
            # End of a record: keep it only if an image path was seen.
            if path != "":
                imageTimeSta[path] = dic
            dic = {}
            path = ""
            continue
        tabIndex = line.find('\t')
        if tabIndex == -1:
            # A line without a tab is the image path.
            path = line
            print(path)
            continue
        tabLastIndex = line.rfind('\t')
        name = line[0:tabIndex]
        time = int(line[tabLastIndex + 1:])
        # A repeated event (the second DrawClient pass) gets a "second " prefix.
        if name in dic:
            dic["second " + name] = time
        else:
            dic[name] = time

itemNumber = len(imageTimeSta)
avgTotal = 0        # 13 - 0
avgFirstDraw = 0    # 8 - 2
avgLoadImage = 0    # 11 - 5
avgSecondDraw = 0   # 13 - 10
for dic in imageTimeSta.values():
    avgTotal += dic[timePointName[13]] - dic[timePointName[0]]
    avgFirstDraw += dic[timePointName[8]] - dic[timePointName[2]]
    avgLoadImage += dic[timePointName[11]] - dic[timePointName[5]]
    avgSecondDraw += dic[timePointName[13]] - dic[timePointName[10]]

print('avgTotal', avgTotal / float(itemNumber))
print('avgFirstDraw', avgFirstDraw / float(itemNumber))
print('avgLoadImage', avgLoadImage / float(itemNumber))
print('avgSecondDraw', avgSecondDraw / float(itemNumber))
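The record-splitting and duplicate-key logic can be tried without a file. The following Python 3 sketch feeds a few in-memory lines through the same steps; the function name parse_records, the shortened separator, and the sample path are assumptions for illustration and are not part of the script above.

```python
def parse_records(lines, separator="===="):
    """Group tab-separated event lines into {path: {event: time}}."""
    records, events, path = {}, {}, ""
    for line in lines:
        line = line.strip()
        if line == separator:
            if path:                      # records without a path are dropped
                records[path] = events
            events, path = {}, ""
            continue
        if "\t" not in line:              # a line without a tab is the path
            path = line
            continue
        name, _, value = line.partition("\t")
        time = int(value.rsplit("\t", 1)[-1])
        # A repeated event name is stored under a "second " prefix.
        key = "second " + name if name in events else name
        events[key] = time
    return records

sample = [
    "enter DrawClient at\t5140",
    r"D:\pics\2.JPG",                     # hypothetical path
    "leave DrawClient at\t5155",
    "enter DrawClient at\t5280",
    "leave DrawClient at\t5327",
    "====",
    "enter DrawClient at\t12077",         # no path: this record is skipped
    "====",
]
print(parse_records(sample))
```

Only one record survives here, keyed by the path, with the second pair of DrawClient events stored as "second enter DrawClient at" and "second leave DrawClient at".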
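The log.txt file looks like this (the script expects a tab between the event name and the timestamp on each line):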
enter OpenNextImage at 5124
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\2.JPG
enter OpenImage at 5124
In OpenImage send On_ImageRefresh at 5124
enter LoadImage at 5124
leave OpenImage at 5124
leave OpenNextImage at 5124
decode began at 5124
enter DrawClient at 5140
leave DrawClient at 5155
decode end at 5265
in LoadImage send On_ImageRefresh at 5265
leave loadImage at 5265
enter DrawClient at 5280
leave DrawClient at 5327
==================================================================================
enter OpenNextImage at 6280
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\3.JPG
enter OpenImage at 6280
In OpenImage send On_ImageRefresh at 6280
enter LoadImage at 6280
leave OpenImage at 6296
leave OpenNextImage at 6296
decode began at 6296
enter DrawClient at 6296
leave DrawClient at 6312
decode end at 6437
in LoadImage send On_ImageRefresh at 6437
enter DrawClient at 6437
leave loadImage at 6452
leave DrawClient at 6499
==================================================================================
enter OpenNextImage at 7265
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\4.JPG
enter OpenImage at 7265
In OpenImage send On_ImageRefresh at 7265
leave OpenImage at 7265
leave OpenNextImage at 7265
enter LoadImage at 7265
decode began at 7265
enter DrawClient at 7265
leave DrawClient at 7296
decode end at 7421
in LoadImage send On_ImageRefresh at 7421
enter DrawClient at 7421
leave loadImage at 7437
leave DrawClient at 7483
==================================================================================
enter OpenNextImage at 8062
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\5.JPG
enter OpenImage at 8062
In OpenImage send On_ImageRefresh at 8062
leave OpenImage at 8062
leave OpenNextImage at 8062
enter LoadImage at 8062
decode began at 8062
enter DrawClient at 8062
leave DrawClient at 8077
decode end at 8202
in LoadImage send On_ImageRefresh at 8202
enter DrawClient at 8202
leave DrawClient at 8265
leave loadImage at 8280
==================================================================================
enter OpenNextImage at 8811
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\6.JPG
enter OpenImage at 8811
In OpenImage send On_ImageRefresh at 8811
leave OpenImage at 8811
leave OpenNextImage at 8811
enter LoadImage at 8811
decode began at 8811
enter DrawClient at 8811
leave DrawClient at 8843
decode end at 8968
in LoadImage send On_ImageRefresh at 8968
leave loadImage at 8968
enter DrawClient at 8968
leave DrawClient at 9030
==================================================================================
enter OpenNextImage at 9515
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\7.JPG
enter OpenImage at 9515
In OpenImage send On_ImageRefresh at 9515
leave OpenImage at 9515
leave OpenNextImage at 9515
enter LoadImage at 9515
decode began at 9530
enter DrawClient at 9530
leave DrawClient at 9546
decode end at 9671
in LoadImage send On_ImageRefresh at 9671
enter DrawClient at 9671
leave loadImage at 9671
leave DrawClient at 9733
==================================================================================
enter OpenNextImage at 10171
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\8.JPG
enter OpenImage at 10171
In OpenImage send On_ImageRefresh at 10171
leave OpenImage at 10171
leave OpenNextImage at 10171
enter LoadImage at 10171
decode began at 10186
enter DrawClient at 10186
leave DrawClient at 10202
decode end at 10311
in LoadImage send On_ImageRefresh at 10311
leave loadImage at 10311
enter DrawClient at 10311
leave DrawClient at 10374
==================================================================================
enter OpenNextImage at 10811
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\9.JPG
enter OpenImage at 10811
In OpenImage send On_ImageRefresh at 10811
enter LoadImage at 10811
leave OpenImage at 10811
leave OpenNextImage at 10811
enter DrawClient at 10811
decode began at 10811
leave DrawClient at 10843
decode end at 10952
in LoadImage send On_ImageRefresh at 10952
leave loadImage at 10952
enter DrawClient at 10952
leave DrawClient at 11030
==================================================================================
enter OpenNextImage at 11452
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\10.JPG
enter OpenImage at 11452
In OpenImage send On_ImageRefresh at 11452
leave OpenImage at 11452
leave OpenNextImage at 11452
enter LoadImage at 11452
decode began at 11452
enter DrawClient at 11468
leave DrawClient at 11483
decode end at 11593
in LoadImage send On_ImageRefresh at 11593
enter DrawClient at 11593
leave loadImage at 11608
leave DrawClient at 11655
==================================================================================
enter OpenNextImage at 12077
enter DrawClient at 12077
leave DrawClient at 12108
==================================================================================
enter OpenNextImage at 13124
D:\pics\测试图片\解码性能对比用图\jpeg\较小图\1.jpg
enter OpenImage at 13124
In OpenImage send On_ImageRefresh at 13124
leave OpenImage at 13124
leave OpenNextImage at 13124
enter LoadImage at 13124
decode began at 13124
enter DrawClient at 13139
leave DrawClient at 13155
decode end at 13358
in LoadImage send On_ImageRefresh at 13358
leave loadImage at 13358
enter DrawClient at 13358
leave DrawClient at 13405
==================================================================================
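As a worked check, the four intervals for the first record above (2.JPG) can be computed directly from its timestamps. The Python 3 sketch below hard-codes that record's values as a dict; the variable names are illustrative only.

```python
# Timestamps of the first record (2.JPG), copied from the log above.
# The second DrawClient pass carries the "second " prefix the script adds.
rec = {
    "enter OpenNextImage at": 5124,
    "In OpenImage send On_ImageRefresh at": 5124,
    "enter LoadImage at": 5124,
    "leave DrawClient at": 5155,
    "in LoadImage send On_ImageRefresh at": 5265,
    "leave loadImage at": 5265,
    "second leave DrawClient at": 5327,
}

total = rec["second leave DrawClient at"] - rec["enter OpenNextImage at"]
first_draw = rec["leave DrawClient at"] - rec["In OpenImage send On_ImageRefresh at"]
load_image = rec["leave loadImage at"] - rec["enter LoadImage at"]
second_draw = rec["second leave DrawClient at"] - rec["in LoadImage send On_ImageRefresh at"]

print(total, first_draw, load_image, second_draw)  # 203 31 141 62
```

The script sums these per-record differences over every record that has an image path and divides by the record count to get the averages.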
I hope this article is helpful for your Python programming.
This article was originally published on php中文网; please credit the source when reposting.