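This list was compiled mainly with reference to [1,2], plus some items I collected later. Since everyone keeps updating their lists, the differences are not large.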


[2015-PAMI-Overview]Text Detection and Recognition in Imagery: A Survey[paper]

[2014-Front.Comput.Sci-Overview]Scene Text Detection and Recognition: Recent Advances and Future Trends[paper]


[2018-arxiv]TextBoxes++: A Single-Shot Oriented Scene Text Detector[paper]

[2018-arxiv]FOTS: Fast Oriented Text Spotting with a Unified Network[paper]

[2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation[paper]

[2017-arXiv]Fused Text Segmentation Networks for Multi-oriented Scene Text Detection[paper]

[2017-arXiv]WeText: Scene Text Detection under Weak Supervision[paper]

[2017-ICCV]Single Shot Text Detector with Regional Attention[pdf]

[2017-ICCV]WordSup: Exploiting Word Annotations for Character based Text Detection[paper]

[2017-arXiv]R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection[paper]

[2017-CVPR]EAST: An Efficient and Accurate Scene Text Detector [paper][code]

[2017-arXiv]Cascaded Segmentation-Detection Networks for Word-Level Text Spotting[paper]

[2017-arXiv]Deep Direct Regression for Multi-Oriented Scene Text Detection[paper]

[2017-CVPR]Detecting oriented text in natural images by linking segments [paper][code]

[2017-CVPR]Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection[paper]

[2017-arXiv]Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper]

[2017-AAAI]TextBoxes: A Fast Text Detector with a Single Deep Neural Network[paper][code]

[2016-arXiv]Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [paper]

[2016-arXiv]DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [paper] [data]

[2017-PR]TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]

[2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [paper]

[2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [paper]

[2016-CVPR]Synthetic Data for Text Localisation in Natural Images [paper] [data][code]

[2016-ECCV]Detecting Text in Natural Image with Connectionist Text Proposal Network[paper][demo][code]

[2016-TIP]Text-Attentional Convolutional Neural Networks for Scene Text Detection [paper]

[2016-IJDAR]TextCatcher: a method to detect curved and challenging text in natural scenes[paper]

[2016-CVPR]Multi-oriented text detection with fully convolutional networks [paper]

[2015-TPAMI]Real-time Lexicon-free Scene Text Localization and Recognition[paper]

[2015-CVPR]Symmetry-Based Text Line Detection in Natural Scenes[paper][code]

[2015-ICCV]FASText: Efficient unconstrained scene text detector[paper][code]

[2015-D.Phil Thesis] Deep Learning for Text Spotting [paper]

[2015-ICDAR]Object Proposals for Text Extraction in the Wild [paper] [code]

[2014-ECCV] Deep Features for Text Spotting [paper] [code] [model] [GitXiv]

[2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [paper] [homepage] [code]

[2014-TPAMI]Robust Text Detection in Natural Scene Images[paper]

[2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [paper]

[2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions[paper]

[2012-CVPR]Real-time scene text localization and recognition[paper][code]

[2010-CVPR]Detecting Text in Natural Scenes with Stroke Width Transform [paper] [code]


[2017-arXiv]AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition [paper]

[2017-arXiv]STN-OCR: A single Neural Network for Text Detection and Text Recognition[paper][code]

[2017-arXiv]Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis[paper]

[2017-AAAI-Web images]Detection and Recognition of Text Embedded in Online Images via Neural Context Models[paper][project]

[2017-arXiv, document recognition] Full-Page Text Recognition: Learning Where to Start and When to Stop[paper]

[2016-AAAI]Reading Scene Text in Deep Convolutional Sequences [paper]

[2016-IJCV]Reading Text in the Wild with Convolutional Neural Networks [paper] [demo] [homepage]

[2016-CVPR]Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [paper]

[2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [paper]

[2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data[paper]

[2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [paper] [code]

[2015-ICDAR]Automatic Script Identification in the Wild[paper]

[2015-ICLR] Deep structured output learning for unconstrained text recognition [paper]

[2014-NIPS]Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [paper] [homepage] [model]

[2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition [paper]

[2012-ICPR]End-to-End Text Recognition with Convolutional Neural Networks [paper] [code] [SVHN Dataset]



1555 images, 11459 text instances, includes curved text

COCO-Text (Computer Vision Group, Cornell) 2016

63,686 images, 173,589 text instances, 3 fine-grained text attributes.

Task: text location and recognition


Synthetic Data for Text Localisation in Natural Image (VGG) 2016

800 thousand images

8 million synthetic word instances


Synthetic Word Dataset (Oxford, VGG) 2014

9 million images covering 90k English words

Task: text recognition, segmentation


IIIT 5K-Words 2012

5000 images from Scene Texts and born-digital (2k training and 3k testing images)

Each image is a cropped word image of scene text with case-insensitive labels

Task: text recognition
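
Because the labels are case-insensitive, recognition results on cropped-word data like this are usually compared after normalizing case. The following is a minimal sketch of such a word-accuracy measure; the function name `word_accuracy` and the sample strings are illustrative assumptions, not part of any official evaluation script.

```python
def word_accuracy(predictions, ground_truths):
    """Fraction of words recognized correctly, ignoring letter case.

    Hypothetical helper for illustration only; official benchmark
    protocols may apply further normalization (e.g. stripping symbols).
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0
    # Compare each predicted word with its label after lower-casing both.
    correct = sum(p.lower() == g.lower() for p, g in zip(predictions, ground_truths))
    return correct / len(ground_truths)


if __name__ == "__main__":
    # Made-up predictions and labels: two of the three match ignoring case.
    print(word_accuracy(["Hello", "WORLD", "text"], ["hello", "world", "test"]))
```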


StanfordSynth (Stanford, AI Group) 2012

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition
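
The 62-character label set (0-9, a-z, A-Z) can be written down as a small lookup table. This is only a sketch: the ordering of the classes and the names `CHARSET`, `encode` and `decode` are assumptions made here, and a given dataset or model may index the classes differently.

```python
import string

# Assumed ordering: digits, then lowercase, then uppercase letters.
CHARSET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARSET)}
INDEX_TO_CHAR = dict(enumerate(CHARSET))


def encode(text):
    """Map a string to a list of class indices (raises KeyError on unknown characters)."""
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode(indices):
    """Map a list of class indices back to a string."""
    return "".join(INDEX_TO_CHAR[i] for i in indices)


if __name__ == "__main__":
    print(len(CHARSET))          # number of classes
    print(encode("0aA"))         # indices under the assumed ordering
    print(decode(encode("Ocr2012")))
```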


MSRA Text Detection 500 Database (MSRA-TD500) 2012

500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)

Chinese, English or mixture of both

Task: text detection

Street View Text (SVT) 2010

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)

Only word level bounding boxes are provided with case-insensitive labels

Task: text location
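
Word-level bounding boxes such as these are commonly compared with predictions by intersection-over-union (IoU). The sketch below assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the function name `box_iou` is illustrative, and this is not the official evaluation protocol of any dataset listed here.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). Illustrative helper only.
    """
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)      # made-up predicted word box
    annotated = (1, 1, 3, 3)      # made-up ground-truth word box
    print(box_iou(predicted, annotated))
```

A detection is then typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold is a choice of the evaluation protocol, not something fixed by this sketch.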

KAIST Scene_Text Database 2010

3000 images of indoor and outdoor scenes containing text

Korean, English (Number), and Mixed (Korean + English + Number)

Task: text location, segmentation and recognition


Over 74K images from natural images, as well as a set of synthetically generated characters

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

ICDAR Benchmark Datasets



Competition Paper

ICDAR 2015

1000 training images and 500 testing images


ICDAR 2013

229 training images and 233 testing images


ICDAR 2011

229 training images and 255 testing images


ICDAR 2005

1001 training images and 489 testing images


ICDAR 2003

181 training images and 251 testing images(word level and character level)



Tesseract: C++ based tools for document analysis and OCR, supports 60+ languages [code]

Ocropy: Python-based tools for document analysis and OCR [code]

CLSTM: A small C++ implementation of LSTM networks, focused on OCR [code]

Convolutional Recurrent Neural Network, Torch7 based [code]

Attention-OCR: Visual Attention based OCR [code]

Umaru: An OCR system based on Torch that uses LSTM/GRU-RNN and CTC, drawing on the works of rnnlib and clstm [code]
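
Several of the tools above pair a recurrent network with CTC. As a rough illustration of the decoding side only, here is a minimal sketch of CTC greedy (best-path) decoding in plain Python. It is not taken from any of the listed projects; the function name `ctc_greedy_decode`, the label list, and the convention that index 0 is the blank are assumptions made for this example.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Greedy CTC decoding: take the best label per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per label.
    labels: list of label strings; labels[blank] is a placeholder for the blank.
    """
    # Best-scoring label index for every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]

    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame and is not blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


if __name__ == "__main__":
    labels = ["-", "a", "b"]  # "-" stands for the CTC blank (assumed index 0)
    # Made-up one-hot scores whose best path is: a a - a b b -
    frames = [
        [0, 1, 0], [0, 1, 0], [1, 0, 0], [0, 1, 0],
        [0, 0, 1], [0, 0, 1], [1, 0, 0],
    ]
    print(ctc_greedy_decode(frames, labels))
```

The blank between the second and third `a` frames is what allows a repeated character to survive the merge step; without it, consecutive identical labels collapse into one.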


DeepFont: Identify Your Font from An Image[paper]

Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks[paper]

End-to-End Interpretation of the French Street Name Signs Dataset [paper] [code]

Extracting text from an image using Ocropus [blog]


[2016-arXiv]Drawing and Recognizing Chinese Characters with Recurrent Neural Network [paper]

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition [paper]

Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition [paper]

High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps [paper] [github]

DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) [code]

How to recognize the handwritten digit set with a convolutional neural network (CNN)? [blog][blog1][blog2] [blog4] [blog5] [code6]

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention [paper]

MLPaint: the Real-Time Handwritten Digit Recognizer [blog][code][demo]

caffe-ocr: OCR with caffe deep learning framework [code] (single-character classifier)


Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs [paper]

Numberplate recognition with Tensorflow [blog] [code]


Applying OCR Technology for Receipt Recognition[blog][mirror]


[2017-arXiv]Using Synthetic Data to Train Neural Networks is Model-Based Reasoning[paper]

Using deep learning to break a Captcha system [blog] [code]

Breaking reddit captcha with 96% accuracy [blog] [code]

I'm not a human: Breaking the Google reCAPTCHA [paper]

NeuralNet CAPTCHA Cracker [slides] [code] [demo]

Recurrent neural networks for decoding CAPTCHAS [blog] [code] [demo]

Reading irctc captchas with 95% accuracy using deep learning [code]


I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs [paper]






