OpenCV 2.4.13 Text Segmentation (Horizontal and Vertical Histogram Projection)

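There are several ways to segment text, and different situations call for different approaches.
Question 1: How do we segment the text?
A: Consider regular text, where every line has the same height and the characters within a line have roughly the same width.
A referenced post gives one method for this case:
1) Binarize the image.
2) Project the binarized image horizontally (to find the individual lines).
3) Use the result of 2) to cut the binarized image into lines, then project each line vertically (to find the individual characters within the line).
4) Use the results of 2) and 3) to draw the bounding boxes.
This article is a C++ implementation of that method; the referenced post implemented it in C#.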

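This article tries to explain the code it uses in as much detail as possible.

First, read the image:

    Mat img = imread(IMG_PATH);
    if (img.empty())
    {
        cerr << "can not read image" << endl;
        return -1;  // stop here: imshow() fails on an empty image
    }
    imshow("original image", img);

Displayed result: [figure: the original image]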

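Step 1: binarize the image

    // Step 1) Binarize the image. Otsu's method requires a single-channel
    // image, so convert the image to grayscale first.
    Mat gray_img;
    cvtColor(img, gray_img, CV_BGR2GRAY, 1);
    Mat binary_img;
    // With THRESH_OTSU the threshold is computed automatically;
    // the value 90 passed here is ignored.
    threshold(gray_img, binary_img, 90, 255, THRESH_OTSU);
    // Invert the image: for dark text on a light background,
    // the text pixels become 255 and the background becomes 0.
    binary_img = 255 - binary_img;
    imshow("binary image by otsu", binary_img);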
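To make the binarization step less of a black box, here is a minimal sketch of the idea behind Otsu's method, written in plain C++ without OpenCV. The names `otsuThreshold` and `binarizeInverted` are hypothetical helpers introduced only for this illustration, and the result is not guaranteed to match the value OpenCV's `threshold()` picks in every case.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns a threshold t that maximizes the between-class
// variance of the two groups {pixels <= t} and {pixels > t}.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    std::array<long long, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * static_cast<double>(hist[v]);

    double weightLow = 0.0, sumLow = 0.0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;          // no pixels in the low class yet
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;            // all pixels are in the low class
        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double var = weightLow * weightHigh * (meanLow - meanHigh) * (meanLow - meanHigh);
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Hypothetical helper: threshold and invert in one pass, so that dark pixels
// (<= t) become 255 and all others become 0.
std::vector<std::uint8_t> binarizeInverted(const std::vector<std::uint8_t>& pixels, int t) {
    std::vector<std::uint8_t> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i)
        out[i] = static_cast<std::uint8_t>(pixels[i] <= t ? 255 : 0);
    return out;
}
```

For a clearly bimodal image (dark text, light paper) the threshold lands between the two groups, which is why no hand-tuned value is needed in step 1.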

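Step 2: project the binarized image horizontally (to find the individual lines)

    Mat hist_ver;
    // dim = 1: sum each row into a single column (one value per image row)
    reduce(binary_img / 255, hist_ver, 1, CV_REDUCE_SUM, CV_32S);
    int width = 5;  // minimum run length to keep
    int totaln = max(hist_ver.rows, hist_ver.cols);
    Mat locations = Mat::zeros(3, totaln, CV_32S);
    int count = 0;
    Find_begin_end_len(hist_ver, locations, count, width);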

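Question: What does reduce() do?
A: reduce() collapses an image along one direction using a chosen operation, which reduces its dimension.
Here, with dim = 1 and CV_REDUCE_SUM, it sums each row of the image along the horizontal axis, so the result is a one-dimensional vector with one entry per image row.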
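As an illustration of what the two projections compute, here is a small library-free sketch. `Image`, `rowSums`, and `colSums` are hypothetical names used only here; the sketch assumes a rectangular image of 0/1 pixels and is not how OpenCV implements `reduce()`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical image type for this sketch: 0/1 pixels, row-major, rectangular.
using Image = std::vector<std::vector<int>>;

// Analogous to reduce(..., dim = 1, CV_REDUCE_SUM): one sum per image row.
std::vector<int> rowSums(const Image& img) {
    std::vector<int> sums;
    sums.reserve(img.size());
    for (const auto& row : img) {
        int s = 0;
        for (int v : row) s += v;
        sums.push_back(s);
    }
    return sums;
}

// Analogous to reduce(..., dim = 0, CV_REDUCE_SUM): one sum per image column.
std::vector<int> colSums(const Image& img) {
    std::vector<int> sums(img.empty() ? 0 : img[0].size(), 0);
    for (const auto& row : img)
        for (std::size_t c = 0; c < row.size() && c < sums.size(); ++c)
            sums[c] += row[c];
    return sums;
}
```

Rows whose sum is zero contain no text pixels, so they mark the gaps between lines; columns whose sum is zero mark the gaps between characters.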
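Question: What on earth is Find_begin_end_len()?
A: It is a helper function I defined myself. It finds where each connected (nonzero) region of the histogram starts and ends.
Question: What is the idea behind this function?
A: Input: a vector h_vec that represents the histogram.
Output: the matrix locations, with one column per region and the following rows:

Row 1	Row 2	Row 3
Start position	End position	Length of this nonzero stretch of the histogram
begin	end	len

Regions shorter than width are discarded, and count returns the number of regions kept.

The code is as follows: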

    void Find_begin_end_len(Mat h_vec, Mat& locations, int& count, int width)
    {
        // locations is an all-zero Mat of size 3*N. After this function
        // returns, its first `count` columns hold the results.
        count = 0;
        if (locations.type() != CV_32SC1 || h_vec.type() != CV_32SC1)
        {
            cout << "locations and h_vec must be type of CV_32S" << endl;
            return;
        }
        // Turn h_vec into a single-row matrix, which is easier to process.
        if (h_vec.rows != 1)
            transpose(h_vec, h_vec);
        int N = h_vec.cols;
        int begin, end, len;
        for (int i = 0; i < N; i++)
        {
            if (h_vec.at<int>(0, i) == 0)
                continue;
            begin = i;
            // Advance j to the first zero after the run, or to N if the run
            // reaches the end of the vector (the original loop dropped such runs).
            int j = i;
            while (j < N && h_vec.at<int>(0, j) != 0)
                j++;
            end = j - 1;
            len = end - begin + 1;  // number of nonzero entries in the run
            if (len >= width)
            {
                locations.at<int>(0, count) = begin;
                locations.at<int>(1, count) = end;
                locations.at<int>(2, count) = len;
                count = count + 1;
                // test if the code is right
                //cout << " begin is: " << begin << endl;
                //cout << " end is: " << end << endl;
                //cout << " len is: " << len << endl;
                //cout << " count is: " << count << endl;
            }
            i = j;
        }
    }
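To check the run-finding logic without OpenCV, the same idea can be sketched on a `std::vector<int>`. `Run` and `findRuns` are hypothetical names for this illustration. One assumption to note: here `len` counts the entries of the run inclusively (`end - begin + 1`), and a run that reaches the end of the vector is kept.

```cpp
#include <vector>

// Hypothetical helper type, not part of OpenCV.
struct Run {
    int begin;  // index of the first nonzero entry
    int end;    // index of the last nonzero entry (inclusive)
    int len;    // number of entries in the run: end - begin + 1
};

// Returns every maximal stretch of nonzero entries in hist that contains
// at least minLen entries.
std::vector<Run> findRuns(const std::vector<int>& hist, int minLen) {
    std::vector<Run> runs;
    const int n = static_cast<int>(hist.size());
    int i = 0;
    while (i < n) {
        if (hist[i] == 0) { ++i; continue; }
        int j = i;
        while (j < n && hist[j] != 0) ++j;  // j stops at the first zero, or at n
        const int len = j - i;
        if (len >= minLen) runs.push_back({i, j - 1, len});
        i = j;
    }
    return runs;
}
```

Returning a vector of small structs avoids having to preallocate a 3*N matrix and pass a separate counter, at the cost of no longer matching the Mat layout used in this article.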

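Question: Why must locations and h_vec be of type CV_32S?
A: The histogram vector is obtained by summing over the image, so its entries can exceed the range of the image's pixel type (0 to 255 for 8-bit pixels).
locations stores positions and lengths, which are plain integers as well, so it uses the same type as the histogram vector: CV_32S.
Question: Why does it have to be CV_32S specifically?
A: CV_32S is the Mat depth that corresponds to the C++ type int.
The function accesses both matrices with at<int>(), which performs no type conversion, so their elements must actually be stored as 32-bit signed integers.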
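The range problem is easy to reproduce in plain C++. The sketch below is only an illustration with hypothetical function names; it shows wrap-around in an 8-bit accumulator and is not a description of how OpenCV handles 8-bit arithmetic internally.

```cpp
#include <cstdint>
#include <vector>

// Sum a row of 0/1 pixels into an 8-bit accumulator.
// The result wraps around once the true sum exceeds 255.
std::uint8_t sumInto8Bit(const std::vector<std::uint8_t>& row) {
    std::uint8_t acc = 0;
    for (std::uint8_t v : row) acc = static_cast<std::uint8_t>(acc + v);
    return acc;
}

// Sum the same row into an int, which is what a CV_32S histogram stores.
int sumIntoInt(const std::vector<std::uint8_t>& row) {
    int acc = 0;
    for (std::uint8_t v : row) acc += v;
    return acc;
}
```

For a row of 300 foreground pixels, the 8-bit sum wraps to 44 while the int sum is the correct 300, so any image wider than 255 pixels can already break an 8-bit projection.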

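Step 3: use the result of 2) to cut the binarized image into lines, then project each line vertically (to find the individual characters within the line)

    Mat line;
    int x, y, height;
    x = 0;
    Mat hist_hor;
    // One column per possible character run; a line can hold at most
    // binary_img.cols runs, so size the matrix by the image width.
    Mat locations2 = Mat::zeros(3, binary_img.cols, CV_32S);
    list<Rect> blocks;            // a linked list of character boxes
    list<Rect>::iterator p_list;  // an iterator over that list
    int count2 = 0;
    Rect r1;
    int bx, by, bwid, bhei;
    width = 2;
    for (int i = 0; i < count; i++)
    {
        y = locations.at<int>(0, i);
        height = locations.at<int>(2, i);
        line = binary_img(Rect(x, y, binary_img.cols, height));
        // dim = 0: sum each column into a single row (one value per image column)
        reduce(line / 255, hist_hor, 0, CV_REDUCE_SUM, CV_32S);
        Find_begin_end_len(hist_hor, locations2, count2, width);
        // Store the Rect regions in the list.
        for (int j = 0; j < count2; j++)
        {
            bx = locations2.at<int>(0, j);
            by = locations.at<int>(0, i);
            bwid = locations2.at<int>(2, j);
            bhei = locations.at<int>(2, i);
            r1 = Rect(bx, by, bwid, bhei);
            blocks.push_back(r1);
        }
    }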

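Question: What does Rect r1 mean?
A: 1) Rect is a type (an OpenCV class); it is used to declare a variable in the same way as int.
2) Rect(x, y, width, height) is its constructor. It specifies a rectangular region whose top-left corner is at (x, y) and whose size is width × height; it is commonly used to specify an ROI.
Here it records the position of each character.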
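How the line runs and the per-line character runs combine into boxes can be sketched without OpenCV as follows. `Image`, `Box`, `Run`, `findRuns`, and `segment` are hypothetical names for this illustration, and the image is assumed to be rectangular with 0/1 pixels.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<int>>;  // hypothetical: 0/1 pixels, row-major

struct Box { int x, y, width, height; };  // same field order as Rect(x, y, width, height)
struct Run { int begin, len; };           // start index and number of entries

// Nonzero runs of at least minLen entries in a projection profile.
std::vector<Run> findRuns(const std::vector<int>& hist, int minLen) {
    std::vector<Run> runs;
    const int n = static_cast<int>(hist.size());
    int i = 0;
    while (i < n) {
        if (hist[i] == 0) { ++i; continue; }
        int j = i;
        while (j < n && hist[j] != 0) ++j;
        if (j - i >= minLen) runs.push_back({i, j - i});
        i = j;
    }
    return runs;
}

// Rows give y and height (text lines); columns inside each line give x and width.
std::vector<Box> segment(const Image& img, int minLineHeight, int minCharWidth) {
    std::vector<Box> boxes;
    if (img.empty()) return boxes;
    const std::size_t cols = img[0].size();

    // Horizontal projection: one sum per image row.
    std::vector<int> rowHist(img.size(), 0);
    for (std::size_t r = 0; r < img.size(); ++r)
        for (std::size_t c = 0; c < cols; ++c) rowHist[r] += img[r][c];

    for (const Run& line : findRuns(rowHist, minLineHeight)) {
        // Vertical projection restricted to the rows of this line.
        std::vector<int> colHist(cols, 0);
        for (int r = line.begin; r < line.begin + line.len; ++r)
            for (std::size_t c = 0; c < cols; ++c) colHist[c] += img[r][c];

        for (const Run& ch : findRuns(colHist, minCharWidth))
            boxes.push_back({ch.begin, line.begin, ch.len, line.len});
    }
    return boxes;
}
```

Every box in a line shares the line's y and height, which is why this approach suits text whose lines have a uniform height.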

Step 4: use the results of 2) and 3) to draw the bounding boxes

    Scalar color = Scalar(0, 0, 255);  // red, in B, G, R order
    for (p_list = blocks.begin(); p_list != blocks.end(); p_list++)
        rectangle(img, *p_list, color);
    imshow("image with box", img);

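Question: What does Scalar color mean?
A: Like Rect, Scalar is both a type and a constructor.
Scalar has more uses than that, but only the simplest one appears here: Scalar(0, 0, 255) gives a color in B, G, R order, which is red.
Final result: [figure: the image with a box drawn around each character]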

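Question: Some characters end up joined together. How can this be handled?
A: Method 1: use morphological filtering as preprocessing, either before or after the thresholding in step 1.
This introduces more parameters, however.
Method 2: post-process the strings of digits that ended up grouped together.
With this approach, however, characters that were wrongly grouped together in the first place can no longer be separated.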
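As one possible reading of Method 1, the sketch below erodes the binarized image with a 3x1 vertical structuring element, which removes thin horizontal bridges between neighboring characters. It is written in plain C++ with the hypothetical names `Image` and `erodeVertical`; OpenCV itself provides `erode()`, `dilate()`, and `morphologyEx()` for this purpose. The choice of structuring element is an assumption and would need tuning for real images.

```cpp
#include <vector>

using Image = std::vector<std::vector<int>>;  // hypothetical: 0/1 pixels, rectangular

// Erode with a 3x1 vertical structuring element: a pixel stays foreground
// only if it and its neighbors directly above and below are all foreground.
// Neighbors outside the image are treated as foreground, so the image
// border itself does not erode anything.
Image erodeVertical(const Image& img) {
    const int rows = static_cast<int>(img.size());
    Image out = img;
    for (int r = 0; r < rows; ++r) {
        const int cols = static_cast<int>(img[r].size());
        for (int c = 0; c < cols; ++c) {
            const int up = (r > 0) ? img[r - 1][c] : 1;
            const int down = (r + 1 < rows) ? img[r + 1][c] : 1;
            out[r][c] = (img[r][c] && up && down) ? 1 : 0;
        }
    }
    return out;
}
```

After the erosion, the column that held only the bridge sums to zero in the vertical projection, so the two characters fall into separate runs. The price is that every stroke also loses a pixel at its top and bottom edges, which is the extra parameter tuning the answer above warns about.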

Now for the big finish. The complete code is as follows:

    // csdn_code.cpp : Defines the entry point for the console application.
    //#include "stdafx.h"
    #include <cstdlib>
    #include <iostream>
    #include <list>
    #include <opencv2/opencv.hpp>

    using namespace cv;
    using namespace std;

    #define IMG_PATH  "..//figures//111.jpg"

    void Find_begin_end_len(Mat h_vec, Mat& locations, int& count, int width)
    {
        // locations is an all-zero Mat of size 3*N. After this function
        // returns, its first `count` columns hold the results.
        count = 0;
        if (locations.type() != CV_32SC1 || h_vec.type() != CV_32SC1)
        {
            cout << "locations and h_vec must be type of CV_32S" << endl;
            return;
        }
        // Turn h_vec into a single-row matrix, which is easier to process.
        if (h_vec.rows != 1)
            transpose(h_vec, h_vec);
        int N = h_vec.cols;
        int begin, end, len;
        for (int i = 0; i < N; i++)
        {
            if (h_vec.at<int>(0, i) == 0)
                continue;
            begin = i;
            // Advance j to the first zero after the run, or to N if the run
            // reaches the end of the vector (the original loop dropped such runs).
            int j = i;
            while (j < N && h_vec.at<int>(0, j) != 0)
                j++;
            end = j - 1;
            len = end - begin + 1;  // number of nonzero entries in the run
            if (len >= width)
            {
                locations.at<int>(0, count) = begin;
                locations.at<int>(1, count) = end;
                locations.at<int>(2, count) = len;
                count = count + 1;
                // test if the code is right
                //cout << " begin is: " << begin << endl;
                //cout << " end is: " << end << endl;
                //cout << " len is: " << len << endl;
                //cout << " count is: " << count << endl;
            }
            i = j;
        }
    }

    int main()
    {
        Mat img = imread(IMG_PATH);
        if (img.empty())
        {
            cerr << "can not read image" << endl;
            return -1;  // stop here: imshow() fails on an empty image
        }
        imshow("original image", img);

        // Step 1) Binarize the image. Otsu's method requires a single-channel
        // image, so convert the image to grayscale first.
        Mat gray_img;
        cvtColor(img, gray_img, CV_BGR2GRAY, 1);
        Mat binary_img;
        // With THRESH_OTSU the threshold is computed automatically;
        // the value 90 passed here is ignored.
        threshold(gray_img, binary_img, 90, 255, THRESH_OTSU);
        binary_img = 255 - binary_img;  // make the text white on black
        imshow("binary image by otsu", binary_img);

        // Step 2) Project the binarized image horizontally (to find the lines).
        Mat hist_ver;
        reduce(binary_img / 255, hist_ver, 1, CV_REDUCE_SUM, CV_32S);
        int width = 5;  // minimum run length to keep
        int totaln = max(hist_ver.rows, hist_ver.cols);
        Mat locations = Mat::zeros(3, totaln, CV_32S);
        int count = 0;
        Find_begin_end_len(hist_ver, locations, count, width);

        // Step 3) Use the result of 2) to cut the binarized image into lines,
        // then project each line vertically (to find the characters in it).
        Mat line;
        int x, y, height;
        x = 0;
        Mat hist_hor;
        // A line can hold at most binary_img.cols runs, so size by the image width.
        Mat locations2 = Mat::zeros(3, binary_img.cols, CV_32S);
        list<Rect> blocks;            // a linked list of character boxes
        list<Rect>::iterator p_list;  // an iterator over that list
        int count2 = 0;
        Rect r1;
        int bx, by, bwid, bhei;
        width = 2;
        for (int i = 0; i < count; i++)
        {
            y = locations.at<int>(0, i);
            height = locations.at<int>(2, i);
            line = binary_img(Rect(x, y, binary_img.cols, height));
            reduce(line / 255, hist_hor, 0, CV_REDUCE_SUM, CV_32S);
            Find_begin_end_len(hist_hor, locations2, count2, width);
            // Store the Rect regions in the list.
            for (int j = 0; j < count2; j++)
            {
                bx = locations2.at<int>(0, j);
                by = locations.at<int>(0, i);
                bwid = locations2.at<int>(2, j);
                bhei = locations.at<int>(2, i);
                r1 = Rect(bx, by, bwid, bhei);
                blocks.push_back(r1);
            }
        }

        // Step 4) Use the results of 2) and 3) to draw the bounding boxes.
        Scalar color = Scalar(0, 0, 255);  // red, in B, G, R order
        for (p_list = blocks.begin(); p_list != blocks.end(); p_list++)
            rectangle(img, *p_list, color);
        imshow("image with box", img);
        waitKey();
        system("pause");  // Windows only; keeps the console window open
        return 0;
    }

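Question: What if the text contains characters of different sizes?
For example: [figure: text with characters of different sizes]

The method in the complete code above gives the following result: [figure: segmentation result for that image]

A: Use the method described here.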
