If you need the source code and dataset, please like, follow, and bookmark, then leave a comment or send a private message~~~

I. Introduction to OCR Text Recognition

Automatically recognizing characters with a computer is an important application area of pattern recognition. In production and everyday life, people have to process large volumes of text, forms, and documents. To reduce this labor and improve processing efficiency, research into character recognition methods began in the 1950s and led to the development of optical character readers.

OCR (Optical Character Recognition) is an important branch of artificial intelligence: it gives computers the function of the human eye, allowing them to read text from images. An image text recognition system generally consists of four stages: image acquisition, text detection, text recognition, and result output.
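As a rough illustration of these four stages, the sketch below uses the keras_ocr package (the same library the training scripts in Section III rely on). It is a minimal example, not this project's own code: the path 'test.jpg' is a placeholder, and the Pipeline downloads pretrained detector and recognizer weights on first use.

import keras_ocr

# 1. Image acquisition: read an image from disk (or a URL).
image = keras_ocr.tools.read('test.jpg')  # placeholder path

# 2/3. Text detection and recognition: Pipeline combines a CRAFT-based
#      detector with a CRNN-style recognizer using pretrained weights.
pipeline = keras_ocr.pipeline.Pipeline()
predictions = pipeline.recognize([image])[0]  # list of (word, box) pairs

# 4. Result output: print each recognized word and its bounding box.
for word, box in predictions:
    print(word, box.tolist())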

II. OCR Text Recognition Project in Practice

1: Dataset Overview

The MSRA-TD500 dataset contains 500 natural-scene images with resolutions ranging from 1296 × 864 to 1920 × 1280. They cover most common scenes, including indoor shopping malls, signs, outdoor streets, and billboards. The text is in both Chinese and English, with varying fonts, sizes, and orientations. Some sample images from the dataset are shown below.
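For readers who want to parse the annotations themselves: each image in MSRA-TD500 is paired with a .gt file that, to the best of my recollection, lists one text line per row as index, difficulty flag, x, y, width, height, and rotation angle in radians. The helper below (load_msra_td500_gt is a name made up for illustration, not part of this project) is a hedged sketch of reading such a file; verify the format against your own copy of the dataset.

import math

def load_msra_td500_gt(gt_path):
    """Parse one MSRA-TD500 .gt file into a list of rotated-box dicts.

    Assumed row format: index difficulty x y width height angle(radians).
    """
    boxes = []
    with open(gt_path, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) < 7:
                continue  # skip malformed rows
            idx, difficult, x, y, w, h, theta = parts[:7]
            boxes.append({
                'index': int(idx),
                'difficult': int(difficult) == 1,
                'x': float(x), 'y': float(y),
                'width': float(w), 'height': float(h),
                'angle_degrees': math.degrees(float(theta)),
            })
    return boxes

# Hypothetical usage: boxes = load_msra_td500_gt('MSRA-TD500/train/IMG_0030.gt')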

The dataset's directory structure is shown below; it is split into a training set and a test set.

2: Project Structure

The overall project structure is shown below. The upper part contains the definitions of the algorithms and models, such as CRAFT and CRNN, and the lower part contains the test code.

CRAFT performs text-line detection as illustrated in the figure below. The complete text region is first fed into the CRAFT text detection network, which produces a character-level text score heatmap (Text Score) and a character-level link score heatmap (Link Score); the position of each text line is then obtained from the connected components.
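If you only want to run the detection stage, keras_ocr exposes the CRAFT model through keras_ocr.detection.Detector, which is also what the background-filtering script in Section III uses. This is a minimal sketch with a placeholder image path, not the project's test code:

import keras_ocr

# Detector wraps a CRAFT model; detect() turns the text/link score heatmaps
# into one quadrilateral box per detected text region.
detector = keras_ocr.detection.Detector()
image = keras_ocr.tools.read('test.jpg')  # placeholder path
boxes = detector.detect(images=[image])[0]  # array of shape (num_regions, 4, 2)
print(f'Detected {len(boxes)} text regions')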

3: Results

Run the code.

The output of a run is shown below; you can feed in different images for testing.
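To try your own images and visualize the result, here is a small hedged sketch using keras_ocr's drawing helper (the image path is a placeholder; this is not the project's test script):

import matplotlib.pyplot as plt
import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()
image = keras_ocr.tools.read('your_image.jpg')  # swap in any test image
predictions = pipeline.recognize([image])[0]

# Overlay the recognized words and their boxes on the image.
fig, ax = plt.subplots(figsize=(10, 10))
keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)
plt.show()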

III. Code

Part of the code is shown below. If you need the complete code and dataset, please like, follow, and bookmark, then leave a comment or send a private message~~~

"""This script demonstrates how to train the model
on the SynthText90 using multiple GPUs."""
# pylint: disable=invalid-name
import datetime
import argparse
import math
import random
import string
import functools
import itertools
import os
import tarfile
import urllib.request

import numpy as np
import cv2
import imgaug
import tqdm
import tensorflow as tf

import keras_ocr


# pylint: disable=redefined-outer-name
def get_filepaths(data_path, split):
    """Get the list of filepaths for a given split (train, val, or test)."""
    with open(os.path.join(data_path, f'mnt/ramdisk/max/90kDICT32px/annotation_{split}.txt'),
              'r') as text_file:
        filepaths = [
            os.path.join(data_path, 'mnt/ramdisk/max/90kDICT32px',
                         line.split(' ')[0][2:]) for line in text_file.readlines()
        ]
    return filepaths


# pylint: disable=redefined-outer-name
def download_extract_and_process_dataset(data_path):
    """Download and extract the synthtext90 dataset."""
    archive_filepath = os.path.join(data_path, 'mjsynth.tar.gz')
    extraction_directory = os.path.join(data_path, 'mnt')
    if not os.path.isfile(archive_filepath) and not os.path.isdir(extraction_directory):
        print('Downloading the dataset.')
        urllib.request.urlretrieve("https://www.robots.ox.ac.uk/~vgg/data/text/mjsynth.tar.gz",
                                   archive_filepath)
    if not os.path.isdir(extraction_directory):
        print('Extracting files.')
        with tarfile.open(os.path.join(data_path, 'mjsynth.tar.gz')) as tfile:
            tfile.extractall(data_path)


def get_image_generator(filepaths, augmenter, width, height):
    """Get an image generator for a list of SynthText90 filepaths."""
    filepaths = filepaths.copy()
    for filepath in itertools.cycle(filepaths):
        text = filepath.split(os.sep)[-1].split('_')[1].lower()
        image = cv2.imread(filepath)
        if image is None:
            print(f'An error occurred reading: {filepath}')
            continue  # skip unreadable files instead of crashing in cvtColor
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = keras_ocr.tools.fit(image,
                                    width=width,
                                    height=height,
                                    cval=np.random.randint(low=0, high=255, size=3).astype('uint8'))
        if augmenter is not None:
            image = augmenter.augment_image(image)
        if filepath == filepaths[-1]:
            random.shuffle(filepaths)
        yield image, text


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('--model_id',
                        default='recognizer',
                        help='The name to use for saving model checkpoints.')
    parser.add_argument('--data_path',
                        default='.',
                        help='The path to the directory containing the dataset and where we will put our logs.')
    parser.add_argument('--logs_path',
                        default='./logs',
                        help=('The path to where logs and checkpoints should be stored. '
                              'If a checkpoint matching "model_id" is found, training will resume from that point.'))
    parser.add_argument('--batch_size',
                        default=16,
                        type=int,  # parse command-line values as integers
                        help='The training batch size to use.')
    parser.add_argument('--no-file-verification', dest='verify_files', action='store_false')
    parser.set_defaults(verify_files=True)
    args = parser.parse_args()
    weights_path = os.path.join(args.logs_path, args.model_id + '.h5')
    csv_path = os.path.join(args.logs_path, args.model_id + '.csv')
    download_extract_and_process_dataset(args.data_path)
    with tf.distribute.MirroredStrategy().scope():
        recognizer = keras_ocr.recognition.Recognizer(alphabet=string.digits + string.ascii_lowercase,
                                                      height=31,
                                                      width=200,
                                                      stn=False,
                                                      optimizer=tf.keras.optimizers.RMSprop(),
                                                      weights=None)
    if os.path.isfile(weights_path):
        print('Loading saved weights and creating new version.')
        # Load the existing checkpoint before switching to a new timestamped path.
        recognizer.model.load_weights(weights_path)
        dt_string = datetime.datetime.now().isoformat()
        weights_path = os.path.join(args.logs_path, args.model_id + '_' + dt_string + '.h5')
        csv_path = os.path.join(args.logs_path, args.model_id + '_' + dt_string + '.csv')
    augmenter = imgaug.augmenters.Sequential([
        imgaug.augmenters.Multiply((0.9, 1.1)),
        imgaug.augmenters.GammaContrast(gamma=(0.5, 3.0)),
        imgaug.augmenters.Invert(0.25, per_channel=0.5)
    ])
    os.makedirs(args.logs_path, exist_ok=True)
    training_filepaths, validation_filepaths = [
        get_filepaths(data_path=args.data_path, split=split) for split in ['train', 'val']
    ]
    if args.verify_files:
        assert all(
            os.path.isfile(filepath)
            for filepath in tqdm.tqdm(training_filepaths + validation_filepaths,
                                      desc='Checking filepaths.')), 'Some files appear to be missing.'
    (training_image_generator, training_steps), (validation_image_generator, validation_steps) = [
        (get_image_generator(filepaths=filepaths,
                             augmenter=augmenter,
                             width=recognizer.model.input_shape[2],
                             height=recognizer.model.input_shape[1]),
         math.ceil(len(filepaths) / args.batch_size))
        for filepaths, augmenter in [(training_filepaths, augmenter), (validation_filepaths, None)]
    ]
    training_generator, validation_generator = [
        tf.data.Dataset.from_generator(
            functools.partial(recognizer.get_batch_generator,
                              image_generator=image_generator,
                              batch_size=args.batch_size),
            output_types=((tf.float32, tf.int64, tf.float64, tf.int64), tf.float64),
            output_shapes=((tf.TensorShape([None, 31, 200, 1]),
                            tf.TensorShape([None, recognizer.training_model.input_shape[1][1]]),
                            tf.TensorShape([None, 1]),
                            tf.TensorShape([None, 1])), tf.TensorShape([None, 1])))
        for image_generator in [training_image_generator, validation_image_generator]
    ]
    callbacks = [
        tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                         min_delta=0,
                                         patience=10,
                                         restore_best_weights=False),
        tf.keras.callbacks.ModelCheckpoint(weights_path, monitor='val_loss', save_best_only=True),
        tf.keras.callbacks.CSVLogger(csv_path)
    ]
    recognizer.training_model.fit(
        x=training_generator,
        steps_per_epoch=training_steps,
        validation_steps=validation_steps,
        validation_data=validation_generator,
        callbacks=callbacks,
        epochs=1000,
    )
"""This script is what was used to generate the
backgrounds.zip and fonts.zip files.
"""
# pylint: disable=invalid-name,redefined-outer-name
import json
import urllib.request
import urllib.parse
import concurrent.futures
import shutil
import zipfile
import glob
import os

import numpy as np
import tqdm
import cv2

import keras_ocr

if __name__ == '__main__':
    fonts_commit = 'a0726002eab4639ee96056a38cd35f6188011a81'
    fonts_sha256 = 'e447d23d24a5bbe8488200a058cd5b75b2acde525421c2e74dbfb90ceafce7bf'
    fonts_source_zip_filepath = keras_ocr.tools.download_and_verify(
        url=f'https://github.com/google/fonts/archive/{fonts_commit}.zip',
        cache_dir='.',
        sha256=fonts_sha256)
    shutil.rmtree('fonts-raw', ignore_errors=True)
    with zipfile.ZipFile(fonts_source_zip_filepath) as zfile:
        zfile.extractall(path='fonts-raw')
    retained_fonts = []
    sha256s = []
    basenames = []
    # The blacklist includes fonts that, at least for the English alphabet, were found
    # to be illegible (e.g., thin fonts) or render in unexpected ways (e.g., mathematics
    # fonts).
    blacklist = [
        'AlmendraDisplay-Regular.ttf', 'RedactedScript-Bold.ttf', 'RedactedScript-Regular.ttf',
        'Sevillana-Regular.ttf', 'Mplus1p-Thin.ttf', 'Stalemate-Regular.ttf', 'jsMath-cmsy10.ttf',
        'Codystar-Regular.ttf', 'AdventPro-Thin.ttf', 'RoundedMplus1c-Thin.ttf',
        'EncodeSans-Thin.ttf', 'AlegreyaSans-ThinItalic.ttf', 'AlegreyaSans-Thin.ttf',
        'FiraSans-Thin.ttf', 'FiraSans-ThinItalic.ttf', 'WorkSans-Thin.ttf',
        'Tomorrow-ThinItalic.ttf', 'Tomorrow-Thin.ttf', 'Italianno-Regular.ttf',
        'IBMPlexSansCondensed-Thin.ttf', 'IBMPlexSansCondensed-ThinItalic.ttf',
        'Lato-ExtraLightItalic.ttf', 'LibreBarcode128Text-Regular.ttf',
        'LibreBarcode39-Regular.ttf', 'LibreBarcode39ExtendedText-Regular.ttf',
        'EncodeSansExpanded-ExtraLight.ttf', 'Exo-Thin.ttf', 'Exo-ThinItalic.ttf',
        'DrSugiyama-Regular.ttf', 'Taviraj-ThinItalic.ttf', 'SixCaps.ttf', 'IBMPlexSans-Thin.ttf',
        'IBMPlexSans-ThinItalic.ttf', 'AdobeBlank-Regular.ttf',
        'FiraSansExtraCondensed-ThinItalic.ttf', 'HeptaSlab[wght].ttf', 'Karla-Italic[wght].ttf',
        'Karla[wght].ttf', 'RalewayDots-Regular.ttf', 'FiraSansCondensed-ThinItalic.ttf',
        'jsMath-cmex10.ttf', 'LibreBarcode39Text-Regular.ttf', 'LibreBarcode39Extended-Regular.ttf',
        'EricaOne-Regular.ttf', 'ArimaMadurai-Thin.ttf', 'IBMPlexSerif-ExtraLight.ttf',
        'IBMPlexSerif-ExtraLightItalic.ttf', 'IBMPlexSerif-ThinItalic.ttf', 'IBMPlexSerif-Thin.ttf',
        'Exo2-Thin.ttf', 'Exo2-ThinItalic.ttf', 'BungeeOutline-Regular.ttf', 'Redacted-Regular.ttf',
        'JosefinSlab-ThinItalic.ttf', 'GothicA1-Thin.ttf', 'Kanit-ThinItalic.ttf', 'Kanit-Thin.ttf',
        'AlegreyaSansSC-ThinItalic.ttf', 'AlegreyaSansSC-Thin.ttf', 'Chathura-Thin.ttf',
        'Blinker-Thin.ttf', 'Italiana-Regular.ttf', 'Miama-Regular.ttf', 'Grenze-ThinItalic.ttf',
        'LeagueScript-Regular.ttf', 'BigShouldersDisplay-Thin.ttf', 'YanoneKaffeesatz[wght].ttf',
        'BungeeHairline-Regular.ttf', 'JosefinSans-Thin.ttf', 'JosefinSans-ThinItalic.ttf',
        'Monofett.ttf', 'Raleway-ThinItalic.ttf', 'Raleway-Thin.ttf', 'JosefinSansStd-Light.ttf',
        'LibreBarcode128-Regular.ttf'
    ]
    for filepath in tqdm.tqdm(sorted(glob.glob('fonts-raw/**/**/**/*.ttf')),
                              desc='Filtering fonts.'):
        sha256 = keras_ocr.tools.sha256sum(filepath)
        basename = os.path.basename(filepath)
        # We check the sha256 and filenames because some of the fonts
        # in the repository are duplicated (see TRIVIA.md).
        if sha256 in sha256s or basename in basenames or basename in blacklist:
            continue
        sha256s.append(sha256)
        basenames.append(basename)
        retained_fonts.append(filepath)
    retained_font_families = set([filepath.split(os.sep)[-2] for filepath in retained_fonts])
    added = []
    with zipfile.ZipFile(file='fonts.zip', mode='w') as zfile:
        for font_family in tqdm.tqdm(retained_font_families, desc='Saving ZIP file.'):
            # We want to keep all the metadata files plus
            # the retained font files. And we don't want
            # to add the same file twice.
            files = [
                input_filepath for input_filepath in glob.glob(f'fonts-raw/**/**/{font_family}/*')
                if input_filepath not in added and
                (input_filepath in retained_fonts or os.path.splitext(input_filepath)[1] != '.ttf')
            ]
            added.extend(files)
            for input_filepath in files:
                zfile.write(filename=input_filepath,
                            arcname=os.path.join(*input_filepath.split(os.sep)[-2:]))
    print('Finished saving fonts file.')
    # pylint: disable=line-too-long
    url = (
        'https://commons.wikimedia.org/w/api.php?action=query&generator=categorymembers&gcmtype=file&format=json'
        '&gcmtitle=Category:Featured_pictures_on_Wikimedia_Commons&prop=imageinfo&gcmlimit=50&iiprop=url&iiurlwidth=1024'
    )
    gcmcontinue = None
    max_responses = 300
    responses = []
    for responseCount in tqdm.tqdm(range(max_responses)):
        current_url = url
        if gcmcontinue is not None:
            current_url += f'&continue=gcmcontinue||&gcmcontinue={gcmcontinue}'
        with urllib.request.urlopen(url=current_url) as response:
            current = json.loads(response.read())
            responses.append(current)
            gcmcontinue = None if 'continue' not in current else current['continue']['gcmcontinue']
        if gcmcontinue is None:
            break
    print('Finished getting list of images.')
    # We want to avoid animated images as well as icon files.
    image_urls = []
    for response in responses:
        image_urls.extend([
            page['imageinfo'][0]['thumburl'] for page in response['query']['pages'].values()
        ])
    image_urls = [url for url in image_urls if url.lower().endswith('.jpg')]
    shutil.rmtree('backgrounds', ignore_errors=True)
    os.makedirs('backgrounds')
    assert len(image_urls) == len(set(image_urls)), 'Duplicates found!'
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = [
            executor.submit(keras_ocr.tools.download_and_verify,
                            url=url,
                            cache_dir='./backgrounds',
                            verbose=False) for url in image_urls
        ]
        for _ in tqdm.tqdm(concurrent.futures.as_completed(futures), total=len(futures)):
            pass
    for filepath in glob.glob('backgrounds/*.JPG'):
        os.rename(filepath, filepath.lower())
    print('Filtering images by aspect ratio and maximum contiguous contour.')
    image_paths = np.array(sorted(glob.glob('backgrounds/*.jpg')))

    def compute_metrics(filepath):
        image = keras_ocr.tools.read(filepath)
        aspect_ratio = image.shape[0] / image.shape[1]
        contour, _ = keras_ocr.tools.get_maximum_uniform_contour(image, fontsize=40)
        area = cv2.contourArea(contour) if contour is not None else 0
        return aspect_ratio, area

    metrics = np.array([compute_metrics(filepath) for filepath in tqdm.tqdm(image_paths)])
    filtered_paths = image_paths[(metrics[:, 0] < 3 / 2) & (metrics[:, 0] > 2 / 3) &
                                 (metrics[:, 1] > 1e6)]
    detector = keras_ocr.detection.Detector()
    paths_with_text = [
        filepath for filepath in tqdm.tqdm(filtered_paths)
        if len(detector.detect(
            images=[keras_ocr.tools.read_and_fit(filepath, width=640, height=640)])[0]) > 0
    ]
    filtered_paths = np.array([path for path in filtered_paths if path not in paths_with_text])
    filtered_basenames = list(map(os.path.basename, filtered_paths))
    basename_to_url = {
        os.path.basename(urllib.parse.urlparse(url).path).lower(): url
        for url in image_urls
    }
    filtered_urls = [basename_to_url[basename.lower()] for basename in filtered_basenames]
    assert len(filtered_urls) == len(filtered_paths)
    removed_paths = [filepath for filepath in image_paths if filepath not in filtered_paths]
    for filepath in removed_paths:
        os.remove(filepath)
    with open('backgrounds/urls.txt', 'w') as f:
        f.write('\n'.join(filtered_urls))
    with zipfile.ZipFile(file='backgrounds.zip', mode='w') as zfile:
        for filepath in tqdm.tqdm(filtered_paths.tolist() + ['backgrounds/urls.txt'],
                                  desc='Saving ZIP file.'):
            zfile.write(filename=filepath, arcname=os.path.basename(filepath.lower()))

Creating this content takes effort; if you found it helpful, please like, follow, and bookmark~~~
