This walkthrough mainly follows the official build instructions and lays out the whole process step by step.

Linux

The build instructions for Linux also apply to other UNIX-like operating systems.

Dependencies

  • A compiler for C and C++: GCC or Clang
  • GNU Autotools: autoconf, automake, libtool
  • autoconf-archive
  • pkg-config
  • Leptonica
  • libpng, libjpeg, libtiff

Ubuntu

If they are not already installed, you need the following libraries (Ubuntu 16.04/14.04):

1. Install the dependencies:

sudo apt-get install g++ autoconf automake libtool autoconf-archive pkg-config libpng12-dev libjpeg8-dev libtiff5-dev zlib1g-dev  libleptonica-dev -y

Or copy the commands one at a time:
sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install autoconf-archive
sudo apt-get install pkg-config
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev
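
Before running the installs, it can help to see which build tools are already on the PATH. The following is a minimal sketch: the helper name missing_tools is hypothetical, and it checks only command-line tools, so it says nothing about the -dev library packages. It looks for libtoolize rather than libtool, on the assumption that the libtool package provides that command.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH.
# missing_tools is an illustrative helper, not part of Tesseract.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command-line tools from the dependency list; prints nothing if all are present.
missing_tools g++ autoconf automake libtoolize pkg-config
```

Any name that is printed still needs its package installed.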

If you plan to install the training tools, you also need the following libraries:

sudo apt-get install libicu-dev libpango1.0-dev libcairo2-dev
Or, one at a time:
sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev

Leptonica

You also need to install Leptonica. Ensure that the development headers for Leptonica are installed before compiling Tesseract.

Tesseract versions and the minimum Leptonica version each one requires are listed in the table further below.

2. Install Leptonica

Tesseract depends on this library; if it is missing, configure reports an error.

The latest Tesseract 4.0 and 3.05 require Leptonica built from source:

git clone https://github.com/DanBloomberg/leptonica.git

cd leptonica

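./configure

make -j8 && sudo make install

If ./configure is missing from the git checkout, generate it first as described in the Leptonica README. Installing into the default /usr/local prefix needs root, hence sudo.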

Tesseract   Leptonica   Ubuntu
4.00        1.74.2      Must build from source
3.05        1.74.0      Must build from source
3.04        1.71        Ubuntu 16.04
3.03        1.70        Ubuntu 14.04
3.02        1.69        Ubuntu 12.04
3.01        1.67
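
The table can be turned into a quick check of whether an installed Leptonica is new enough. This is only a sketch: min_leptonica and version_ge are hypothetical helper names, the comparison relies on GNU sort -V, and the installed version is hard-coded here (in practice it might come from something like pkg-config --modversion lept, which is an assumption about your setup).

```shell
#!/bin/sh
# Minimum Leptonica version for a given Tesseract version (from the table).
min_leptonica() {
    case "$1" in
        4.00) echo 1.74.2 ;;
        3.05) echo 1.74.0 ;;
        3.04) echo 1.71 ;;
        3.03) echo 1.70 ;;
        3.02) echo 1.69 ;;
        3.01) echo 1.67 ;;
        *) return 1 ;;   # version not in the table
    esac
}

# Succeed when version $1 is greater than or equal to version $2.
# Uses version sort (sort -V) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed=1.71   # assumed value for illustration
if version_ge "$installed" "$(min_leptonica 4.00)"; then
    echo "Leptonica $installed is new enough"
else
    echo "Leptonica $installed is too old; build from source"
fi
```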

One option is to install the distro's Leptonica package:

sudo apt-get install libleptonica-dev

However, if you are using an older Linux release, its Leptonica package may be too old, in which case you will need to build from source.

The sources are at https://github.com/DanBloomberg/leptonica . The instructions for building are given in Leptonica README.

Note that if you build Leptonica from source, you may need to ensure that /usr/local/lib is in your library search path. This is a common stumbling block on Linux, and the information on Stack Overflow is very helpful.

Installing Tesseract from Git

Please follow instructions in https://github.com/tesseract-ocr/tesseract/wiki/Compiling--GitInstallation

Also read Install Instructions

3. Build Tesseract

Clone the source code:
git clone https://github.com/tesseract-ocr/tesseract.git tesseract-ocr
cd tesseract-ocr
./autogen.sh
autoreconf -i
./configure

configure then reports:

Configuration is done.
You can now build and install tesseract by running:

$ make
$ sudo make install

Training tools can be built and installed with:

$ make training
$ sudo make training-install

Continue the build: first build and install Tesseract, then build and install the training tools.
make
sudo make install
make training
sudo make training-install

sudo ldconfig

That completes the whole build. Typing tesseract on the command line now prints its usage help.
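
The build steps can also be collected into one script that stops at the first failure. The sketch below is an illustration only: run and build_tesseract are hypothetical names, the script assumes it is started inside the tesseract-ocr checkout, and the DRY_RUN switch is an addition so the sequence can be previewed without executing anything.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Build and install Tesseract and the training tools from the current
# directory. The && chain stops at the first step that fails.
build_tesseract() {
    run ./autogen.sh &&
    run ./configure &&
    run make &&
    run sudo make install &&
    run make training &&
    run sudo make training-install &&
    run sudo ldconfig
}

# Preview the steps without executing them.
DRY_RUN=1
build_tesseract
```

Setting DRY_RUN=0 before calling build_tesseract runs the real build.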

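4. Set up the language data

tesseract/tessdata is a configuration directory; use it as the base and put every language pack you need into it.
cd <parent directory of tesseract>
cp -r tesseract/tessdata/ tessdata/

Download the language packs you need from https://github.com/tesseract-ocr/tessdata_best , which contains trained data for many languages. For Simplified Chinese, download chi_sim.traineddata and chi_sim_vert.traineddata.

Put the downloaded language packs in the tessdata directory.

Set the environment variable TESSDATA_PREFIX to the parent directory of tessdata, for example:
export TESSDATA_PREFIX=/media/sf_E_DRIVE/src-test/tesseract_all/tesseract_linux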

 

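5. Use Tesseract

For detailed usage, see the Tesseract usage documentation.

tesseract /home/app/1.png output -l chi_sim

This recognizes the image /home/app/1.png with the chi_sim language data and writes the result to output.txt. Do not append .traineddata to the language name; it is added automatically. Run cat output.txt to view the recognized text.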
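
The same command extends naturally to a whole directory of images. The following is a rough sketch rather than anything from the Tesseract documentation: ocr_dir is a hypothetical helper, it handles only .png files, and the TESSERACT variable is an added convenience so the calls can be previewed (for example with TESSERACT=echo).

```shell
#!/bin/sh
# OCR every .png file in directory $1 using language $2 (default chi_sim).
# For each NAME.png the text is written to NAME.txt in the same directory,
# because tesseract appends .txt to the output base name it is given.
ocr_dir() {
    dir=$1
    lang=${2:-chi_sim}
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue   # the glob matched no files
        "${TESSERACT:-tesseract}" "$img" "${img%.png}" -l "$lang"
    done
}
```

For example, ocr_dir /home/app chi_sim would process /home/app/1.png and write /home/app/1.txt.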

Install elsewhere / without root

Tesseract can be configured to install anywhere, which makes it possible to install it without root access.

To install it in $HOME/local:

./autogen.sh
./configure --prefix=$HOME/local/
make install

To install it in $HOME/local using Leptonica libraries also installed in $HOME/local:

./autogen.sh
LIBLEPT_HEADERSDIR=$HOME/local/include ./configure \
  --prefix=$HOME/local/ --with-extra-libraries=$HOME/local/lib
make install
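
After installing under $HOME/local, the shell and the dynamic linker usually still need to be told about that prefix. A minimal sketch, assuming the binaries land in $HOME/local/bin and the libraries in $HOME/local/lib; prepend_dir is a hypothetical helper that avoids adding the same directory twice.

```shell
#!/bin/sh
# Print the colon-separated list $1 with directory $2 prepended,
# or unchanged if $2 is already in the list.
prepend_dir() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$2${1:+:$1}" ;;
    esac
}

prefix=$HOME/local
PATH=$(prepend_dir "$PATH" "$prefix/bin")
LD_LIBRARY_PATH=$(prepend_dir "${LD_LIBRARY_PATH:-}" "$prefix/lib")
export PATH LD_LIBRARY_PATH
```

Putting these lines in a shell startup file makes the setting persistent.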

Video representation of the Compiling process for Tesseract 4.0 and Leptonica 1.7.4 on Ubuntu 16.xx

  • Video Build from Source Leptonica 1.7.4
  • Video Build from Source Tesseract-OCR 4.0

Language Data

  • Download the data file(s) for the language(s) you are interested in.
  • Move them to the tessdata directory (e.g. 'mv tessdata $TESSDATA_PREFIX' if TESSDATA_PREFIX is defined)

You can also use:

export TESSDATA_PREFIX=/some/path/to/tessdata

to point to your tessdata directory (example: if your tessdata path is '/usr/local/share/tessdata', you have to use 'export TESSDATA_PREFIX=/usr/local/share/').
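
A quick way to confirm the setting is to test for the language file itself. This sketch is hedged on one point: whether TESSDATA_PREFIX should name the tessdata directory or its parent may depend on the Tesseract version (an assumption to verify for your build), so the hypothetical helper has_lang_data accepts either layout. The fallback /usr/local/share is taken from the example above.

```shell
#!/bin/sh
# Succeed when $2.traineddata exists under prefix $1, either in
# $1/tessdata (prefix is the parent directory) or directly in $1
# (prefix is the tessdata directory itself).
has_lang_data() {
    [ -f "$1/tessdata/$2.traineddata" ] || [ -f "$1/$2.traineddata" ]
}

if has_lang_data "${TESSDATA_PREFIX:-/usr/local/share}" chi_sim; then
    echo "chi_sim language data found"
else
    echo "chi_sim language data not found; check TESSDATA_PREFIX"
fi
```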

Reposted from: https://www.cnblogs.com/zhishuai/p/7851977.html
