http://blog.csdn.net/hk_jh/article/details/8961449

Topic: wxPython

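pytesser is a Python wrapper module for Tesseract, Google's open-source OCR engine. Importing it in Python lets you convert the text in an image into a string.

pytesser calls tesseract: your Python code calls the pytesser module, and pytesser in turn runs tesseract to recognize the text in the image.

The steps below walk through the whole process:

1. First, download pytesser from code.google.com: https://code.google.com/p/pytesser/downloads/detail?name=pytesser_v0.0.1.zip

No installation is needed: unzip it into the \Lib\site-packages\ folder of your Python installation and use it directly.

The pytesser archive includes tesseract.exe, the English language data (by default only English is recognized), and some sample images, so it works as soon as it is unzipped.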

You can test it with the following code in the interactive interpreter:

>>> from pytesser import *
>>> image = Image.open('fnord.tif')  # Open image object using PIL
>>> print image_to_string(image)     # Run tesseract.exe on image
fnord
>>> print image_file_to_string('fnord.tif')
fnord

The same test as a script, using one of the bundled sample images (uncomment a different line to try another image):

from pytesser import *
#im = Image.open('fnord.tif')
#im = Image.open('phototest.tif')
#im = Image.open('eurotext.tif')
im = Image.open('fonts_test.png')
text = image_to_string(im)
print text

Note: this module requires the PIL library.

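2. Improving a low recognition rate

Enhancing the image's contrast, or converting it to black and white, can improve the recognition rate considerably:

import ImageEnhance
# image1 is a PIL image opened earlier, e.g. image1 = Image.open('fonts_test.png')
enhancer = ImageEnhance.Contrast(image1)
image2 = enhancer.enhance(4)

Then call image_to_string on image2 to run the recognition.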
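The two suggestions above (boosting contrast and converting to black and white) can be combined into one preprocessing helper. This is only a sketch: the function name `preprocess_for_ocr` and the `threshold` value of 128 are assumptions for illustration, and it uses the `from PIL import ...` import style of Pillow rather than the bare `import Image` of the old PIL that pytesser itself relies on.

```python
from PIL import Image, ImageEnhance


def preprocess_for_ocr(image, contrast=4.0, threshold=128):
    """Return a high-contrast, black-and-white copy of a PIL image."""
    gray = image.convert("L")  # 8-bit grayscale
    # Same contrast factor (4) as in the snippet above.
    boosted = ImageEnhance.Contrast(gray).enhance(contrast)
    # Map every pixel to pure black or pure white around the threshold.
    return boosted.point(lambda p: 255 if p >= threshold else 0)


if __name__ == "__main__":
    # Stand-in for a scanned image; replace with Image.open('fonts_test.png').
    sample = Image.new("RGB", (2, 2), (200, 200, 200))
    print(preprocess_for_ocr(sample).getpixel((0, 0)))
```

The result can then be passed to image_to_string in place of the original image. The best threshold depends on the image, so it is worth trying a few values.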

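3. Recognizing other languages

tesseract is a command-line program with the following usage:

tesseract  imagename outbase [-l  lang]  [-psm N]  [configfile...]

imagename is the name of the input image.

outbase is the base name of the output text file; the result is written to outbase.txt.

-l  lang  sets the language to recognize; the default is English.

For details, see http://tesseract-ocr.googlecode.com/svn-history/r725/trunk/doc/tesseract.1.html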
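To make the usage line concrete, here is a minimal sketch of building that command line from Python and reading the result back. The helper names `build_tesseract_args` and `run_tesseract` are made up for this example and are not part of pytesser. The `-psm` spelling follows the usage line quoted above; Tesseract 4 and later spell it `--psm`, so adjust it to your version.

```python
import io
import subprocess


def build_tesseract_args(imagename, outbase, lang="", psm="", exe="tesseract"):
    """Build the list for: tesseract imagename outbase [-l lang] [-psm N]."""
    args = [exe, imagename, outbase]
    if lang:
        args += ["-l", lang]
    if str(psm) != "":  # psm=0 is a valid mode, so compare against ""
        args += ["-psm", str(psm)]
    return args


def run_tesseract(imagename, outbase, lang="", psm=""):
    """Run tesseract, then return the text it wrote to outbase.txt."""
    retcode = subprocess.call(build_tesseract_args(imagename, outbase, lang, psm))
    if retcode != 0:
        raise RuntimeError("tesseract exited with code %d" % retcode)
    with io.open(outbase + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

For example, `run_tesseract('fonts_test.png', 'temp', lang='eng')` would run `tesseract fonts_test.png temp -l eng` and return the contents of temp.txt, assuming tesseract is on the PATH.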

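Follow these steps to recognize other languages:

(1) Download the data package for the language you need:

https://code.google.com/p/tesseract-ocr/downloads/list

Put the language pack in pytesser's tessdata folder.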

Next, modify the parameters in pytesser.py. Here is an example:

"""OCR in Python using the Tesseract engine from Google
http://code.google.com/p/pytesser/
by Michael J.T. O'Kelly
V 0.0.2, 5/26/08"""

import Image
import subprocess
import os
import StringIO

import util
import errors

tesseract_exe_name = 'dlltest' # Name of executable to be called at command line
scratch_image_name = "temp.bmp" # This file must be .bmp or other Tesseract-compatible format
scratch_text_name_root = "temp" # Leave out the .txt extension
_cleanup_scratch_flag = True  # Temporary files cleaned up after OCR operation
_language = "" # Tesseract uses English if language is not given
_pagesegmode = "" # Tesseract uses fully automatic page segmentation if psm is not given (psm is available in v3.01)

_working_dir = os.getcwd()

def call_tesseract(input_filename, output_filename, language, pagesegmode):
    """Calls external tesseract.exe on input file (restrictions on types),
    outputting output_filename+'txt'"""
    current_dir = os.getcwd()
    error_stream = StringIO.StringIO()
    try:
        os.chdir(_working_dir)
        args = [tesseract_exe_name, input_filename, output_filename]
        if len(language) > 0:
            args.append("-l")
            args.append(language)
        if len(str(pagesegmode)) > 0:
            args.append("-psm")
            args.append(str(pagesegmode))
        try:
            proc = subprocess.Popen(args)
        except (TypeError, AttributeError):
            proc = subprocess.Popen(args, shell=True)
        retcode = proc.wait()
        if retcode!=0:
            error_text = error_stream.getvalue()
            errors.check_for_errors(error_stream_text = error_text)
    finally:  # Guarantee that we return to the original directory
        error_stream.close()
        os.chdir(current_dir)

def image_to_string(im, lang = _language, psm = _pagesegmode, cleanup = _cleanup_scratch_flag):
    """Converts im to file, applies tesseract, and fetches resulting text.
    If cleanup=True, delete scratch files after operation."""
    try:
        util.image_to_scratch(im, scratch_image_name)
        call_tesseract(scratch_image_name, scratch_text_name_root, lang, psm)
        result = util.retrieve_result(scratch_text_name_root)
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return result

def image_file_to_string(filename, lang = _language, psm = _pagesegmode, cleanup = _cleanup_scratch_flag, graceful_errors=True):
    """Applies tesseract to filename; or, if image is incompatible and graceful_errors=True,
    converts to compatible format and then applies tesseract.  Fetches resulting text.
    If cleanup=True, delete scratch files after operation. Parameter lang specifies used language.
    If lang is empty, English is used.
    Page segmentation mode parameter psm is available in Tesseract 3.01.
    psm values are:
    0 = Orientation and script detection (OSD) only.
    1 = Automatic page segmentation with OSD.
    2 = Automatic page segmentation, but no OSD, or OCR
    3 = Fully automatic page segmentation, but no OSD. (Default)
    4 = Assume a single column of text of variable sizes.
    5 = Assume a single uniform block of vertically aligned text.
    6 = Assume a single uniform block of text.
    7 = Treat the image as a single text line.
    8 = Treat the image as a single word.
    9 = Treat the image as a single word in a circle.
    10 = Treat the image as a single character."""
    try:
        try:
            call_tesseract(filename, scratch_text_name_root, lang, psm)
            result = util.retrieve_result(scratch_text_name_root)
        except errors.Tesser_General_Exception:
            if graceful_errors:
                im = Image.open(filename)
                # Pass lang and psm through; the original call, image_to_string(im, cleanup),
                # put cleanup in the lang position.
                result = image_to_string(im, lang, psm, cleanup)
            else:
                raise
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return result

if __name__=='__main__':
    im = Image.open('phototest.tif')
    text = image_to_string(im, cleanup=False)
    print text
    text = image_to_string(im, psm=2, cleanup=False)
    print text
    try:
        text = image_file_to_string('fnord.tif', graceful_errors=False)
    except errors.Tesser_General_Exception, value:
        print "fnord.tif is incompatible filetype.  Try graceful_errors=True"
        #print value
    text = image_file_to_string('fnord.tif', graceful_errors=True, cleanup=False)
    print "fnord.tif contents:", text
    text = image_file_to_string('fonts_test.png', graceful_errors=True)
    print text
    text = image_file_to_string('fonts_test.png', lang="eng", psm=4, graceful_errors=True)
    print text

That version is the one provided in the source repository. If you only need to recognize other languages, adding a single language parameter is enough. Here is my version:

"""OCR in Python using the Tesseract engine from Google
http://code.google.com/p/pytesser/
by Michael J.T. O'Kelly
V 0.0.1, 3/10/07"""

import Image
import subprocess

import util
import errors

tesseract_exe_name = 'tesseract' # Name of executable to be called at command line
scratch_image_name = "temp.bmp" # This file must be .bmp or other Tesseract-compatible format
scratch_text_name_root = "temp" # Leave out the .txt extension
cleanup_scratch_flag = True  # Temporary files cleaned up after OCR operation

def call_tesseract(input_filename, output_filename, language):
    """Calls external tesseract.exe on input file (restrictions on types),
    outputting output_filename+'txt'"""
    args = [tesseract_exe_name, input_filename, output_filename, "-l", language]
    proc = subprocess.Popen(args)
    retcode = proc.wait()
    if retcode!=0:
        errors.check_for_errors()

def image_to_string(im, cleanup = cleanup_scratch_flag, language = "eng"):
    """Converts im to file, applies tesseract, and fetches resulting text.
    If cleanup=True, delete scratch files after operation."""
    try:
        util.image_to_scratch(im, scratch_image_name)
        call_tesseract(scratch_image_name, scratch_text_name_root, language)
        text = util.retrieve_text(scratch_text_name_root)
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return text

def image_file_to_string(filename, cleanup = cleanup_scratch_flag, graceful_errors=True, language = "eng"):
    """Applies tesseract to filename; or, if image is incompatible and graceful_errors=True,
    converts to compatible format and then applies tesseract.  Fetches resulting text.
    If cleanup=True, delete scratch files after operation."""
    try:
        try:
            call_tesseract(filename, scratch_text_name_root, language)
            text = util.retrieve_text(scratch_text_name_root)
        except errors.Tesser_General_Exception:
            if graceful_errors:
                im = Image.open(filename)
                # Pass language through so the fallback does not silently revert to "eng".
                text = image_to_string(im, cleanup, language)
            else:
                raise
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return text

if __name__=='__main__':
    im = Image.open('phototest.tif')
    text = image_to_string(im)
    print text
    try:
        text = image_file_to_string('fnord.tif', graceful_errors=False)
    except errors.Tesser_General_Exception, value:
        print "fnord.tif is incompatible filetype.  Try graceful_errors=True"
        print value
    text = image_file_to_string('fnord.tif', graceful_errors=True)
    print "fnord.tif contents:", text
    text = image_file_to_string('fonts_test.png', graceful_errors=True)
    print text

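When calling image_to_string, just add the appropriate language parameter as the last argument: chi_sim for Simplified Chinese, chi_tra for Traditional Chinese.

In other words, the value is the XXX in the name of the XXX.traineddata file you downloaded. For example, if the Chinese pack you downloaded is chi_sim.traineddata, the parameter is chi_sim: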

text = image_to_string(self.im, language = 'chi_sim')
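Because the language code is simply the file name without the .traineddata extension, the codes that are currently usable can be listed by scanning the tessdata folder. The following is a small sketch; the function name `available_languages` is hypothetical, and the path you pass in is assumed to be pytesser's tessdata folder.

```python
import os


def available_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    suffix = ".traineddata"
    return sorted(
        name[:-len(suffix)]
        for name in os.listdir(tessdata_dir)
        if name.endswith(suffix)
    )
```

If the folder holds eng.traineddata and chi_sim.traineddata, the function returns ['chi_sim', 'eng'], and either value can be passed as the language argument.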

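That completes the image recognition.

One more note: the Chinese text may be recognized correctly but still display as garbled characters. In that case, convert text to the Chinese encoding you are using, for example:

text.decode("utf8")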
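The decoding step can be wrapped in a small helper that tolerates more than one encoding. This is a sketch under assumptions: the name `to_unicode` is made up, and the GBK fallback is a guess about which other encoding might turn up, not something pytesser or tesseract specifies.

```python
def to_unicode(raw, encodings=("utf-8", "gbk")):
    """Decode OCR output bytes, trying each encoding in turn."""
    if not isinstance(raw, bytes):
        return raw  # already decoded text, nothing to do
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing matched: keep what can be read and mark the rest.
    return raw.decode(encodings[0], "replace")
```

Calling `to_unicode(text)` on the string returned by image_to_string gives a Unicode string. Encode it again, for example with `.encode('gbk')`, only when the display or file you write to needs a specific encoding.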
