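OpenCV-Python in Practice (2): Automatic Junqi Referee (5): Automatic Recognition of Piece Text

Introduction

The article "OpenCV-Python in Practice (2): Automatic Junqi Referee (1): A DIY Piece Image Capture Device" described how the piece image capture device was built.

The article "OpenCV-Python in Practice (2): Automatic Junqi Referee (3): Improving the Piece Image Capture Device" described improvements to that device.

The hardware side of the project is now ready. The next step is to run OCR on the captured piece images. Once the text on each piece (Junqi is the Chinese army chess game) can be recognized reliably, deciding which of two pieces is stronger becomes fairly easy.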
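Once both piece names have been recognized, the comparison itself reduces to a table lookup. The sketch below assumes the commonly played Junqi rules (the officer ranking, a bomb removing both pieces, only the engineer clearing a landmine, the flag losing to anything that reaches it). The names RANKS, STRENGTH and judge are hypothetical and are not taken from the project's code.

```python
# Officer ranks from strongest to weakest (common Junqi ordering, assumed here).
RANKS = ["司令", "军长", "师长", "旅长", "团长", "营长", "连长", "排长", "工兵"]
STRENGTH = {name: len(RANKS) - i for i, name in enumerate(RANKS)}
IMMOBILE = ("地雷", "军旗")   # landmine and flag never move
SPECIAL = IMMOBILE + ("炸弹",)  # plus the bomb


def judge(a, b):
    """Return 'a' or 'b' for the surviving piece, or 'both' if both are removed."""
    for name in (a, b):
        if name not in STRENGTH and name not in SPECIAL:
            raise ValueError("unrecognized piece: %r" % name)
    if a in IMMOBILE and b in IMMOBILE:
        raise ValueError("two immobile pieces can never meet")
    # The flag cannot fight back.
    if a == "军旗":
        return "b"
    if b == "军旗":
        return "a"
    # A bomb removes itself and whatever it meets.
    if "炸弹" in (a, b):
        return "both"
    # Only the engineer can clear a landmine.
    if a == "地雷":
        return "b" if b == "工兵" else "a"
    if b == "地雷":
        return "a" if a == "工兵" else "b"
    # Ordinary officers: the higher rank wins, equal ranks are both removed.
    if STRENGTH[a] == STRENGTH[b]:
        return "both"
    return "a" if STRENGTH[a] > STRENGTH[b] else "b"


print(judge("团长", "营长"))
```

Raising an error for unknown names is a deliberate choice in this sketch: a misread piece should be reported rather than silently judged.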

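Recognition results in practice

The piece image capture device at work is shown in the photo below:

The device with no pieces placed on it:

The device with pieces placed on it:

The test software interface

The view before any piece is placed on the capture device:

The view after the pieces are placed:

Clicking the test button runs text recognition, with the following result:

As the figure above shows, the text on both pieces was recognized correctly. The recognition result appears in the upper-left corner of the window, and the edge-detection image is shown on the right side of the window.

Swapping the slots of the two pieces and running recognition again gives the following result:

Comparing the two figures above, the recognized texts have swapped positions accordingly.

Deliberately placing the "团长" (regiment commander) piece rotated by 180 degrees and recognizing again gives the following result:

The upside-down "团长" was recognized as "洪国", which shows that an inverted image severely degrades recognition.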
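One possible mitigation, which the project itself does not implement, is to run recognition on both orientations and keep whichever result is a valid piece name. The sketch below uses plain nested lists as a stand-in for an image so that it runs without any imaging library. The names rotate_180 and read_piece are hypothetical, and ocr is any callable that maps an image to text, for example a small wrapper around pytesseract.image_to_string. With real NumPy images, np.rot90(image, 2) would replace rotate_180.

```python
def rotate_180(image):
    """Rotate a row-major image (a list of rows) by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def read_piece(image, ocr, known_names):
    """Try the image as-is, then upside down; return the first known name or None."""
    for candidate in (image, rotate_180(image)):
        # OCR output usually carries whitespace and newlines, so strip them all.
        text = "".join(ocr(candidate).split())
        if text in known_names:
            return text
    return None


# Tiny demonstration with a fake OCR function that only "reads" one orientation.
upright = [[1, 2], [3, 4]]
fake_ocr = lambda img: "团长\n" if img == upright else "洪国\n"
print(read_piece(rotate_180(upright), fake_ocr, {"团长", "司令"}))
```

Checking against the set of legal piece names is what makes the retry safe: a garbled reading such as "洪国" is rejected instead of being passed on to the referee logic.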

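The OCR technical route

At present there are two possible routes: local recognition and online (remote) recognition. Both were tested in this project.

Local recognition uses the widely used tesseract

Installing tesseract involves two main steps.

1. Download the tesseract installer, run it to install tesseract on the local machine, and add the tesseract installation path to the PATH environment variable.
Tesseract download address: https://digi.bib.uni-mannheim.de/tesseract/

I chose the build that best matched my machine, tesseract-ocr-w64-setup-v5.0.0.20190623.exe, installed it to C:\Program Files\Tesseract-OCR, and added that directory to the PATH environment variable. In addition, I checked the Simplified Chinese language pack during installation; without it, Chinese text cannot be recognized properly.

In the figure above, Simplified Chinese, Traditional Chinese, and vertically written Chinese are all checked. If only Simplified Chinese needs to be handled, checking that option alone is enough.

2. Install pytesseract, the Python wrapper library for tesseract

Running pip install pytesseract is all that is needed. Since I had already installed Pillow earlier, there was no need to install it again. If it is needed, run
pip install Pillow.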
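Before running any OCR, it can be useful to confirm that the tesseract executable is actually reachable, since a missing PATH entry is a common cause of failure. The following is a minimal sketch using only the standard library. The function name find_tesseract is hypothetical, and the fallback directory is simply the installation path mentioned above.

```python
import os
import shutil


def find_tesseract(extra_dirs=()):
    """Return the path of the tesseract executable, or None if it cannot be found."""
    found = shutil.which("tesseract")  # searches the directories listed in PATH
    if found:
        return found
    # Fall back to explicitly supplied directories.
    for directory in extra_dirs:
        for name in ("tesseract.exe", "tesseract"):
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None


path = find_tesseract([r"C:\Program Files\Tesseract-OCR"])
print(path or "tesseract not found; check the PATH setting")
```

If the executable is found only through the fallback directory, pytesseract can be pointed at it by assigning the path to pytesseract.pytesseract.tesseract_cmd.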

Suppose the current working directory contains an image named piece1.png.

A short Python script is enough to check whether tesseract works properly:
test.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import pytesseract
from PIL import Image

# open image
image = Image.open('piece1.png')
# recognize the text using the Simplified Chinese language pack
code = pytesseract.image_to_string(image, lang='chi_sim')
print(code)

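The test result is as follows:

Online text recognition

There are many online text recognition sites, some free and some paid. Services such as Baidu may require registering an account, after which a limited free quota is available.

Because of network latency, online recognition responds relatively slowly. This project therefore relies mainly on local recognition, and online recognition was only tested briefly.

This time I found a free site, OCRMAKER. After trying it, I found its recognition rate quite good, and it is completely free with no registration required. A brief analysis of the page's network traffic showed that it performs recognition by calling a Web API.

I wrote a simple Python wrapper around this recognition interface: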

# Online recognition
import json
import requests

class onLineOCR:
    def __init__(self):
        pass

    url = 'https://api.ocr.space/parse/image'

    @classmethod
    def image_to_string(cls, imgFile):
        # Open the image inside a with-block so the file handle is always closed
        with open(imgFile, 'rb') as f:
            files = {'img': (imgFile, f, 'image/png')}
            sdata = {'url': '', 'language': 'chs', 'apikey': '5cbc1fd77788957', 'isOverlayRequired': 'true'}
            res = requests.post(cls.url, data=sdata, files=files)
        tmpObj = json.loads(res.text)
        return tmpObj['ParsedResults'][0]['ParsedText']
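The wrapper above indexes straight into ParsedResults, so it raises an exception whenever the service replies with something other than a normal result. A more defensive parser might look like the sketch below. Only the ParsedResults and ParsedText keys are taken from the code above; the function name extract_text is hypothetical, and treating every malformed reply as "no text" is an assumption of this sketch.

```python
import json


def extract_text(response_text):
    """Return the recognized text from the JSON reply, or '' if none is present."""
    try:
        payload = json.loads(response_text)
    except ValueError:  # not valid JSON (json.JSONDecodeError is a ValueError)
        return ""
    if not isinstance(payload, dict):
        return ""
    results = payload.get("ParsedResults")
    if not isinstance(results, list) or not results:
        return ""
    first = results[0]
    if not isinstance(first, dict):
        return ""
    # Strip the trailing line break that OCR output typically ends with.
    return (first.get("ParsedText") or "").strip()


print(extract_text('{"ParsedResults": [{"ParsedText": "司令"}]}'))
```

Returning an empty string keeps the calling loop simple, at the cost of hiding the reason for the failure; logging the raw reply would be a natural addition.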

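Introducing pygame

opencv-python is not well suited to user interaction, so pygame is used for the graphical interface.

Displaying images in pygame is fairly convenient, but one point needs attention: opencv and pygame represent images differently. Some people use an image file as an intermediary to pass image data between opencv and pygame. It is also possible to skip the file and work directly with in-memory objects.

As shown in the following code: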

# Display an image on the pygame screen
def showImgOnScreen(img, pos, isBGR=True):
    imgFrame = np.rot90(img)
    imgFrame = cv2.flip(imgFrame, 1, dst=None)  # horizontal mirror
    if isBGR:
        # cv2 uses the BGR color space while pygame uses RGB, so a conversion is needed
        imgFrame = cv2.cvtColor(imgFrame, cv2.COLOR_BGR2RGB)
    # pygame cannot display a raw numpy array directly; it must be converted to a surface first
    imgSurf = pygame.surfarray.make_surface(imgFrame)
    Config.screen.blit(imgSurf, pos)

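One key line is

 # pygame cannot display a raw numpy array directly; it must be converted to a surface first
 imgSurf=pygame.surfarray.make_surface(imgFrame)

The code above also shows that images captured by opencv differ from what pygame expects in both orientation and color space, so the corresponding conversions are required.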
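To see what the two array operations do, it helps to trace pixel indices. pygame.surfarray treats the first array axis as x and the second as y, whereas an OpenCV frame is indexed as [row][column]. The sketch below re-implements np.rot90 (counterclockwise) and a horizontal flip on plain nested lists, purely for illustration. All helper names are hypothetical.

```python
def rot90_ccw(grid):
    """Rotate a list of rows 90 degrees counterclockwise, like np.rot90(grid)."""
    width = len(grid[0])
    return [[row[width - 1 - i] for row in grid] for i in range(width)]


def flip_columns(grid):
    """Reverse every row, like cv2.flip(grid, 1)."""
    return [row[::-1] for row in grid]


def transpose(grid):
    """Swap the two axes: result[x][y] == grid[y][x]."""
    return [list(col) for col in zip(*grid)]


frame = [[1, 2, 3],
         [4, 5, 6]]            # 2 rows (height) x 3 columns (width)

as_in_article = flip_columns(rot90_ccw(frame))  # rot90 followed by the mirror
plain_swap = transpose(frame)                   # axis swap only

print(as_in_article)
print(plain_swap)
```

Read as [x][y], plain_swap puts the frame pixel at row y, column x onto screen position (x, y), so the picture keeps its orientation. The rot90-plus-flip combination additionally reverses both axes, which corresponds to showing the picture rotated by 180 degrees. Which of the two is wanted depends on how the camera is mounted, something this sketch cannot know.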

In addition, to get buttons in pygame, I found a class file online that implements button functionality:

bf_button.py

# -*- coding=utf-8 -*-
import threading
import pygame
from pygame.locals import MOUSEBUTTONDOWN

# Hands out unique control ids (singleton-style helper)
class BFControlId(object):
    _instance_lock = threading.Lock()

    def __init__(self):
        self.id = 1

    @classmethod
    def instance(cls, *args, **kwargs):
        if not hasattr(BFControlId, "_instance"):
            BFControlId._instance = BFControlId(*args, **kwargs)
        return BFControlId._instance

    def get_new_id(self):
        self.id += 1
        return self.id

CLICK_EFFECT_TIME = 100

class BFButton(object):
    def __init__(self, parent, rect, text='Button', click=None):
        self.x, self.y, self.width, self.height = rect
        self.bg_color = (225, 225, 225)
        self.parent = parent
        self.surface = parent.subsurface(rect)
        self.is_hover = False
        self.in_click = False
        self.click_loss_time = 0
        self.click_event_id = -1
        self.ctl_id = BFControlId().instance().get_new_id()
        self._text = text
        self._click = click
        self._visible = True
        self.init_font()

    def init_font(self):
        #font = pygame.font.Font(None, 28)
        # Raw string so that the backslashes in the Windows path are not treated as escapes
        font = pygame.font.Font(r"C:\Windows\Fonts\STSONG.TTF", 20)
        white = 100, 100, 100
        self.textImage = font.render(self._text, True, white)
        w, h = self.textImage.get_size()
        # Integer division keeps the text position in whole pixels
        self._tx = (self.width - w) // 2
        self._ty = (self.height - h) // 2

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        self.init_font()

    @property
    def click(self):
        return self._click

    @click.setter
    def click(self, value):
        self._click = value

    @property
    def visible(self):
        return self._visible

    @visible.setter
    def visible(self, value):
        self._visible = value

    def update(self, event):
        # The timer event set below fires the click callback after the click effect
        if self.in_click and event.type == self.click_event_id:
            if self._click: self._click(self)
            self.click_event_id = -1
            return

        x, y = pygame.mouse.get_pos()
        if x > self.x and x < self.x + self.width and y > self.y and y < self.y + self.height:
            self.is_hover = True
            if event.type == MOUSEBUTTONDOWN:
                pressed_array = pygame.mouse.get_pressed()
                if pressed_array[0]:
                    self.in_click = True
                    self.click_loss_time = pygame.time.get_ticks() + CLICK_EFFECT_TIME
                    self.click_event_id = pygame.USEREVENT + self.ctl_id
                    pygame.time.set_timer(self.click_event_id, CLICK_EFFECT_TIME - 10)
        else:
            self.is_hover = False

    def draw(self):
        if self.in_click:
            if self.click_loss_time < pygame.time.get_ticks():
                self.in_click = False
        if not self._visible:
            return
        if self.in_click:
            r, g, b = self.bg_color
            k = 0.95
            # Cast to int so the color components are whole numbers
            self.surface.fill((int(r*k), int(g*k), int(b*k)))
        else:
            self.surface.fill(self.bg_color)
        if self.is_hover:
            pygame.draw.rect(self.surface, (0,0,0), (0,0,self.width,self.height), 1)
            pygame.draw.rect(self.surface, (100,100,100), (0,0,self.width-1,self.height-1), 1)
            layers = 5
            r_step = (210-170)/layers
            g_step = (225-205)/layers
            for i in range(layers):
                pygame.draw.rect(self.surface, (int(170+r_step*i), int(205+g_step*i), 255), (i, i, self.width - 2 - i*2, self.height - 2 - i*2), 1)
        else:
            self.surface.fill(self.bg_color)
            pygame.draw.rect(self.surface, (0,0,0), (0,0,self.width,self.height), 1)
            pygame.draw.rect(self.surface, (100,100,100), (0,0,self.width-1,self.height-1), 1)
            pygame.draw.rect(self.surface, self.bg_color, (0,0,self.width-2,self.height-2), 1)

        self.surface.blit(self.textImage, (self._tx, self._ty))

The complete Python code

Apart from bf_button.py above, all of the image capture and recognition code lives in the following file:
guiJudge.py

#coding:utf-8
# Extract the two pieces from the piece image capture device, call tesseract to
# recognize the text, then compare the strength of the two pieces.
# The system currently has tesseract 5.0 installed

# Modules needed for image preprocessing
import cv2
import numpy as np
import math
import pytesseract
import pygame
from bf_button import BFButton
# Modules needed for online OCR
import requests
import json

# Configuration data
class Config:
    def __init__(self):
        pass

    src = "camera/piece1.png"
    resizeRate = 0.5
    min_area = 30000
    min_contours = 8
    threshold_thresh = 180
    epsilon_start = 10
    epsilon_step = 5
    result = []
    screen = None
    frame = None

# Online recognition
class onLineOCR:
    def __init__(self):
        pass

    url = 'https://api.ocr.space/parse/image'

    @classmethod
    def image_to_string(cls, imgFile):
        # Open the image inside a with-block so the file handle is always closed
        with open(imgFile, 'rb') as f:
            files = {'img': (imgFile, f, 'image/png')}
            sdata = {'url': '', 'language': 'chs', 'apikey': '5cbc1fd77788957', 'isOverlayRequired': 'true'}
            res = requests.post(cls.url, data=sdata, files=files)
        tmpObj = json.loads(res.text)
        return tmpObj['ParsedResults'][0]['ParsedText']

'''
Sort the corner points
@return     [top-left, top-right, bottom-right, bottom-left]
'''
def order_points(pts):
    # initialize a list of coordinates that will be ordered
    # such that the first entry in the list is the top-left,
    # the second entry is the top-right, the third is the
    # bottom-right, and the fourth is the bottom-left
    rect = np.zeros((4, 2), dtype="float32")

    # the top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # now, compute the difference between the points, the
    # top-right point will have the smallest difference,
    # whereas the bottom-left will have the largest difference
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    # return the ordered coordinates
    return rect

# Distance between two points
def point_distance(a, b):
    return int(np.sqrt(np.sum(np.square(a - b))))

# Find the bounding quadrilateral; c is the array of contour coordinates
def boundingBox(idx, c, image):
    if len(c) < Config.min_contours:
        print("the contours length is  less than %d ,need not to find boundingBox,idx = %d "%(Config.min_contours,idx))
        return None

    epsilon = Config.epsilon_start
    while True:
        approxBox = cv2.approxPolyDP(c, epsilon, True)
        # show the fitted polygon
        #cv2.polylines(image, [approxBox], True, (0, 255, 0), 2)
        #cv2.imshow("image", image)
        if (len(approxBox) < 4):
            print("the approxBox edge count %d is  less than 4 ,need not to find boundingBox,idx = %d "%(len(approxBox),idx))
            return None
        # area of the fitted polygon
        theArea = math.fabs(cv2.contourArea(approxBox))
        # print the fitting information
        print("contour idx: %d ,contour_len: %d ,epsilon: %d ,approx_len: %d ,approx_area: %s"%(idx,len(c),epsilon,len(approxBox),theArea))
        if theArea > Config.min_area:
            if (len(approxBox) > 4):
                # increase epsilon by one step and try again
                epsilon += Config.epsilon_step
                continue
            else:
                # a length of 4 means the contour has been fitted to a quadrilateral
                # convert to a 4*2 array
                approxBox = approxBox.reshape((4, 2))
                return approxBox
        else:
            # Try the bounding rectangle instead. When a stroke on the piece touches
            # the outer edge, the outer contour is no longer a rectangle and its area
            # shrinks, so a bounding rectangle is used to enclose such a contour.
            print("try boundingRect")
            x, y, w, h = cv2.boundingRect(c)
            if w*h > Config.min_area:
                approxBox = [[x,y],[x+w,y],[x+w,y+h],[x,y+h]]
                # np.int0 was removed in NumPy 2.0; int32 is a point type OpenCV accepts
                approxBox = np.array(approxBox, dtype=np.int32)
                return approxBox
            else:
                print("It is too small ,need not to find boundingBox,idx = %d area=%f"%(idx, theArea))
                return None

# Extract the target regions and run text recognition on the extracted images
def pickOut(srcImg=None):
    Config.result = []
    if srcImg is None:
        # start image processing by reading the image file
        image = cv2.imread(Config.src)
    else:
        image = srcImg
    #print(image.shape)

    # size of the original image
    srcHeight, srcWidth, channels = image.shape

    # scale the original image
    #image= cv2.resize(image,(int(srcWidth*Config.resizeRate),int(srcHeight*Config.resizeRate)))
    #cv2.imshow("image", image)

    # convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    #cv2.imshow("gray", gray)

    # Median blur to smooth the image and remove noise.
    # If the image is shrunk, the blur aperture must shrink too,
    # otherwise valid contours will be erased.
    binary = cv2.medianBlur(gray, 7)
    #binary = cv2.medianBlur(gray,3)

    # convert to a binary image
    ret, binary = cv2.threshold(binary, Config.threshold_thresh, 255, cv2.THRESH_BINARY)
    # show the binary image
    #cv2.imshow("binary", binary)

    # Apply erosion twice.
    # Erosion eats away white pixels, which can join broken line segments.
    erode = cv2.erode(binary, None, iterations = 2)
    # show the eroded image
    #cv2.imshow("erode", erode)

    # canny edge detection
    canny = cv2.Canny(erode, 0, 60, apertureSize = 3)
    # show the edge detection result
    #cv2.imshow("Canny", binary)
    showImgOnScreen(canny, (640,0), False)

    # extract contours (OpenCV 4 returns two values here; OpenCV 3 returns three)
    contours, _ = cv2.findContours(canny, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # print the number of contours
    print("the count of contours is  %d \n"%(len(contours)))

    # show the contours
    #cv2.drawContours(image,contours,-1,(0,0,255),1)
    #cv2.imshow("image", image)

    # For each contour, fit a bounding quadrilateral. On success, cut the region
    # out, apply a perspective transform, and save it as an image file.
    for idx, c in enumerate(contours):
        approxBox = boundingBox(idx, c, image)
        if approxBox is None:
            print("\n")
            continue

        # show the fitting result
        #cv2.polylines(image, [approxBox], True, (0, 0, 255), 2)
        #cv2.imshow("image", image)

        # Original position of the region to cut out.
        # Reorder the approxPolygon points as [top-left, top-right, bottom-right, bottom-left]
        src_rect = order_points(approxBox)
        print("src_rect：\n", src_rect)

        # minimum-area enclosing rectangle
        rect = cv2.minAreaRect(approxBox)
        box = cv2.boxPoints(rect)
        box = box.astype(np.int32)  # replaces np.int0, which was removed in NumPy 2.0
        box = box.reshape(4, 2)
        box = order_points(box)
        print("boundingBox：\n", box)
        w, h = point_distance(box[0], box[1]), point_distance(box[1], box[2])
        print("w = %d ,h= %d "%(w,h))

        # target region of the perspective transform
        dst_rect = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype="float32")

        # perspective transform matrix
        M = cv2.getPerspectiveTransform(src_rect, dst_rect)

        # image after the perspective transform
        #warped = cv2.warpPerspective(image, M, (w, h))
        warped = cv2.warpPerspective(binary, M, (w, h))

        # run text recognition on the extracted result
        #codeImg = np.vstack((warped, warped))
        #codeImg = np.vstack((codeImg, codeImg))
        #codeImg=np.rot90(codeImg,-1)
        codeImg = np.rot90(warped, -1)

        # call the local recognition interface
        code = pytesseract.image_to_string(codeImg, lang='chi_sim')
        Config.result.append(code)
        print(code)

        # write the transformed image to a png file
        #Config.src = "output/piece%d.png"%idx
        #cv2.imwrite(Config.src , codeImg, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])

        # call the online recognition interface (slow, since it depends on the network)
        #code = onLineOCR.image_to_string(Config.src)
        #Config.result.append(code)
        #print(code)
        print("\n")

# Display an image on the pygame screen
def showImgOnScreen(img, pos, isBGR=True):
    imgFrame = np.rot90(img)
    imgFrame = cv2.flip(imgFrame, 1, dst=None)  # horizontal mirror
    if isBGR:
        # cv2 uses the BGR color space while pygame uses RGB, so a conversion is needed
        imgFrame = cv2.cvtColor(imgFrame, cv2.COLOR_BGR2RGB)
    # pygame cannot display a raw numpy array directly; it must be converted to a surface first
    imgSurf = pygame.surfarray.make_surface(imgFrame)
    Config.screen.blit(imgSurf, pos)

# Button event handler
def do_click1(btn):
    if Config.frame is None:
        print('Config.frame is None')
    else:
        pickOut(Config.frame)

#-----------------------------------------------------------------------------------------------
pygame.init()
screen = pygame.display.set_mode([1280,600])  # set the window size to 1280*600
Config.screen = screen
# window title: "Automatic Junqi referee, press T to run the test"
pygame.display.set_caption("军棋自动裁判,T键执行测试")

BLACK = (0,0,0)       # black as an RGB value
WHITE = (255,255,255) # white as an RGB value
BROWN = (166,134,95)  # brown as an RGB value

# create the button object; its label means "Test"
button1 = BFButton(screen, (120,500,160,40))
button1.text = '测试'
button1.click = do_click1

# prepare to capture from the camera
camera = cv2.VideoCapture(0)
# font used to display Chinese text (raw string keeps the backslashes literal)
font = pygame.font.Font(r"C:\Windows\Fonts\STSONG.TTF", 24)
# window background
screen.fill(BLACK)

# create a clock object
timer = pygame.time.Clock()

keepGoing = True
while keepGoing:    # event handling loop
    # show the camera content
    success, frame = camera.read()
    Config.frame = frame
    # cv2 can show the captured frame directly, but displaying it
    # correctly in pygame needs some extra processing
    #cv2.imshow('MyCamera',frame)
    if success:  # skip drawing when no frame could be read
        showImgOnScreen(frame, (0,0), True)

    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            keepGoing = False
        if event.type == pygame.KEYDOWN:          # a key was pressed
            if event.key == pygame.K_t:           # the 't' key
                pickOut(frame)
            elif event.key == pygame.K_RIGHT:     # the right arrow key
                pickOut(frame)
        button1.update(event)

    # show the result text; the prefix means "Test result: "
    tip = "测试结果: "
    for code in Config.result:
        tip += code + ","
    text = font.render(tip, True, WHITE)
    text_rect = text.get_rect()
    #text_rect.centerx = screen.get_rect().centerx
    text_rect.x = 10
    text_rect.y = 10
    screen.blit(text, text_rect)

    pygame.display.update()  # refresh the window
    button1.draw()
    timer.tick(30)           # cap the frame rate at 30

camera.release()    # release the camera
pygame.quit()       # exit
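The corner-ordering rule used by order_points can be tried out without OpenCV or NumPy. The sketch below re-implements the same sum and difference heuristic on plain (x, y) tuples. The name order_corners and the sample coordinates are hypothetical and serve only as an illustration.

```python
def order_corners(points):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left."""
    if len(points) != 4:
        raise ValueError("exactly four points are required")
    top_left = min(points, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(points, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = min(points, key=lambda p: p[1] - p[0])     # smallest y - x
    bottom_left = max(points, key=lambda p: p[1] - p[0])   # largest y - x
    return [top_left, top_right, bottom_right, bottom_left]


# A slightly skewed quadrilateral given in arbitrary order.
corners = [(210, 40), (30, 50), (20, 160), (220, 170)]
print(order_corners(corners))
```

The heuristic works well for pieces that sit roughly upright in the frame. For a quadrilateral rotated by about 45 degrees the sums or differences can tie, in which case the same corner may be picked twice, so such input would need a different ordering strategy.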