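I recently took part in a handwritten Chinese résumé recognition competition. The task is to first extract the boxes of a table that contain the specified information, and then use TensorFlow to recognize the Chinese characters, digits, and English text inside those boxes. The code is open source: https://github.com/BingLiHanShuang/chinese_ocr. The trained model files must be downloaded separately: https://pan.baidu.com/s/1Q0dPSKILNxPMDn7i2VIhow (or, without using Baidu Cloud, paste this into the address bar: https://d.pcs.baidu.com/file/8c755e26ce17286ed911405a7644f73f?fid=4031771572-250528-760858459813394&dstime=1544602011&rt=sh&sign=FDtAERVY-DCb740ccc5511e5e8fedcff06b081203-GHzypluGZoBNdm99UChQOUdYJJ0%3D&expires=8h&chkv=1&chkbd=0&chkpc=et&dp-logid=8017096450883646029&dp-callid=0&shareid=3190055425&r=806607689 )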

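The overall workload is fairly large, so this post first covers how to extract the individual boxes from the table. That code is the first half of table_choose.py; only parts of it are discussed here, and the full version is in the GitHub repository linked above. I had previously only used the C++ version of OpenCV. Because the TensorFlow recognition code is written in Python, I used the Python version of OpenCV this time to save work. This requires enabling the Python options when building OpenCV and copying the cv2.so and cv.py files. (Important: otherwise import cv2 reports that the module cannot be found; see https://www.cnblogs.com/freeweb/p/5794447.html)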
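As for the extraction step itself, the first half of table_choose.py begins by aligning every scan: it averages the angles of the near-horizontal Hough segments to get a rotation angle, then intersects one horizontal and one vertical line at the table's lower-left corner and shifts that corner to (78, 1581). The following is a minimal, OpenCV-free sketch of that geometry only. The function names are hypothetical and do not appear in the repository, and the segments are assumed to be plain (x1, y1, x2, y2) tuples.

```python
import math


def mean_skew_angle(segments, max_abs_angle=math.pi / 4):
    """Average angle (radians) of the segments nearer horizontal than vertical.

    segments: iterable of (x1, y1, x2, y2) tuples.
    Returns None when no segment qualifies, so the caller never divides by zero.
    """
    angles = []
    for x1, y1, x2, y2 in segments:
        dx, dy = x2 - x1, y2 - y1
        if dx == 0:
            continue  # exactly vertical: cannot be a horizontal ruling
        angle = math.atan(dy / float(dx))
        if abs(angle) < max_abs_angle:
            angles.append(angle)
    if not angles:
        return None
    return sum(angles) / len(angles)


def line_intersection(point_a, slope_a, point_b, slope_b):
    """Intersection of two non-parallel lines, each given by a point and a slope."""
    (xa, ya), (xb, yb) = point_a, point_b
    if slope_a == slope_b:
        raise ValueError("parallel lines do not intersect")
    x = (yb - slope_b * xb - ya + slope_a * xa) / float(slope_a - slope_b)
    y = slope_a * (x - xa) + ya
    return x, y


def shift_to_target(corner, target=(78, 1581)):
    """Integer (dx, dy) that moves `corner` onto `target`."""
    return int(target[0] - corner[0]), int(target[1] - corner[1])
```

In the full pipeline the angle (converted to degrees) would feed cv.getRotationMatrix2D and the shift would fill the last column of the 2x3 translation matrix passed to cv.warpAffine. Returning None for an empty segment list is a choice made for this sketch; the original listing divides by the segment count directly.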

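The official OpenCV documentation now includes the Python usage of each API, and moving from C++ to Python turned out to be surprisingly smooth. The official online documentation is here; remember to select the matching version: https://docs.opencv.org/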
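Before choosing a documentation version, it helps to confirm that the Python bindings can actually be imported and to see which OpenCV version they report. A small sketch of such a check follows; the helper name module_available is made up for this example.

```python
import importlib


def module_available(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if module_available("cv2"):
        import cv2
        # cv2.__version__ is the version string to match against the docs
        print("OpenCV version: " + cv2.__version__)
    else:
        print("cv2 not found: check that cv2.so is on the Python path")
```

If the second message appears, the copy step for cv2.so described earlier is the first thing to re-check.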

My build environment for this part is as follows:

(1) Ubuntu 16.04, 64-bit, with UTF-8 encoding support
(2) Python 2.7
(3) OpenCV 3.4.3 (built against Python 2.7)
(4) Python 2.7 modules: numpy, PIL, logging, pickle, os, random, time, matplotlib, math, csv
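The box extraction in the listing that follows relies on morphological opening: the binarized table is opened once with a long horizontal kernel and once with a long vertical kernel, and the pixels present in both results are the grid intersections. For a binary image, opening with a one-pixel-thick line kernel amounts to keeping only the runs of foreground pixels that are at least as long as the kernel. The sketch below illustrates that idea on nested lists, without OpenCV. All function names are hypothetical, and border handling is simplified compared with cv.erode and cv.dilate.

```python
def open_row(row, min_run):
    """Keep only the runs of 1s in `row` that are at least `min_run` long."""
    out = [0] * len(row)
    start = None
    for i, value in enumerate(list(row) + [0]):  # sentinel closes a trailing run
        if value and start is None:
            start = i
        elif not value and start is not None:
            if i - start >= min_run:
                out[start:i] = [1] * (i - start)
            start = None
    return out


def open_horizontal(image, min_run):
    """Opening with a 1 x min_run kernel: only long horizontal strokes survive."""
    return [open_row(row, min_run) for row in image]


def open_vertical(image, min_run):
    """Opening with a min_run x 1 kernel, done by processing the columns."""
    columns = [open_row(column, min_run) for column in zip(*image)]
    return [list(row) for row in zip(*columns)]


def joints(horizontal, vertical):
    """Pixels that lie on both a horizontal and a vertical ruling."""
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


if __name__ == "__main__":
    grid = [
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 1, 0],  # the lone 1 in column 5 stands in for handwriting
        [1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0],
    ]
    crossings = joints(open_horizontal(grid, 4), open_vertical(grid, 4))
    for row in crossings:
        print(row)
```

Short strokes such as handwriting are shorter than either kernel, so they disappear from both line images, which is why the listing can find cell corners even on filled-in forms.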

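Here is the complete code of table_choose.py, which uses Python and OpenCV to correct the table and extract its boxes (it also contains the recognition functionality; the table-extraction part has to be pulled out to be used on its own):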

#!/usr/bin/env python
# encoding: utf-8
import os
import cv2 as cv
import numpy as np
import math
import num_out
import mnist_recognize
import chinese_out
import chinese_out_jiguan
import time_out
import csv

##### Initialize the CSV file #####
csvfile = open("try1010.csv", 'w')
csvwrite = csv.writer(csvfile)
fileHeader = ["登记表编号","性别","民族","体重","血型","籍贯","高中学校名称","高中专业","高中学位","高中起止时间","高中是否毕业","大专学校名称","大专专业","大专学位","大专起止时间","大专是否毕业","本科学校名称","本科专业","本科学位","本科起止时间","本科是否毕业","研究生学校名称","研究生专业","研究生学位","研究生起止时间","研究生是否毕业","学校名称(其它)","专业(其它)","学位(其它)","起止时间(其它)","是否毕业(其它)"]
csvwrite.writerow(fileHeader)
path = 'test_data'
for (path,dirs,files) in os.walk(path):
    for filename in files:
        #docu_num = '20110158' #test a single registration form
        ##### Read the registration form by file name #####
        (docu_num,extension) = os.path.splitext(filename)
        print "filename:", filename
        ##### Distinguish form types by the year encoded in the file name #####
        print docu_num[3]
        if docu_num[3] == '1':
            if len(docu_num) == 9 or (docu_num[5] == '3' and docu_num[6] == '1'):
                table_style = 2
            else:
                table_style = 1
        if docu_num[3] == '3':
            table_style = 3
        if docu_num[3] == '4' or docu_num[3] == '5':
            table_style = 4
        jiaozheng = 0
        image = cv.imread('test_data/' + docu_num + '.jpg')
        rows, cols, channels = image.shape
        print rows, cols
        image_copy = image.copy()
        ##### Rotation correction (Rotation) #####
        #Estimate the rotation angle to correct from the slopes of the long horizontal lines in the image
        gray = cv.cvtColor(image, cv.COLOR_RGB2GRAY)
        if table_style == 1 or table_style == 3 or table_style == 2:
            edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            cv.imwrite('edges_whole.jpg', edges)
            lines = cv.HoughLinesP(edges, 1, np.pi / 180, 500, 0, minLineLength=50, maxLineGap=50)#650,50,20
        if table_style == 4:
            edges_gray = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            edges = edges_gray[400:1000, 0:1000]
            cv.imwrite('edges_whole.jpg', edges)
            lines = cv.HoughLinesP(edges, 1, np.pi / 180, 200, 0, minLineLength=50, maxLineGap=35)#650,50,20
        pi = 3.1415
        theta_total = 0
        theta_count = 0
        for line in lines:
            x1, y1, x2, y2 = line[0]
            if table_style == 4:
                y1 = y1 + 400
                y2 = y2 + 400
            rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
            theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
            print(rho, theta, x1, y1, x2, y2)
            if theta < pi/4 and theta > -pi/4:
                theta_total = theta_total + theta
                theta_count+=1
                cv.line(image_copy, (x1, y1), (x2, y2), (0, 0, 255), 2)
                #cv.line(edges, (x1, y1), (x2, y2), (0, 0, 0), 2)
        theta_average = theta_total/theta_count
        print theta_average, theta_average*180/pi
        cv.imwrite('line_detect4rotation.jpg', image_copy)
        #cv.imwrite('line_detect4rotation.jpg', ~edges)
        #affineShrinkTranslationRotation = cv.getRotationMatrix2D((cols/2, rows/2), theta_average*180/pi, 1)
        affineShrinkTranslationRotation = cv.getRotationMatrix2D((0, rows), theta_average*180/pi, 1)
        ShrinkTranslationRotation = cv.warpAffine(image, affineShrinkTranslationRotation, (cols, rows))
        image_copy = cv.warpAffine(image_copy, affineShrinkTranslationRotation, (cols, rows))
        cv.imwrite('image_Rotation.jpg',ShrinkTranslationRotation)
        ##### Translation correction (Move) #####
        #Detect the right angle at the lower-left corner of the table and translate its vertex to (78,1581)
        #print "rows: ",rows
        roi = image_copy[1450:rows, 0:150]#180
        gray = cv.cvtColor(roi, cv.COLOR_RGB2GRAY)
        edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
        roi_mean_set = cv.mean(~edges[0:int((rows-1450)/2), 85:150])#use the mean gray value of the region to rule out text interfering with line detection
        roi_mean = roi_mean_set[0]
        #cv.imwrite('edges_sample.jpg', edges)
        cv.imwrite('edges_sample.jpg', ~edges[0:int((rows-1450)/2), 75:150])
        lines = cv.HoughLinesP(edges, 1.0, np.pi / 180, 35, 0, minLineLength=10,maxLineGap=20)#50,10,20
        lines_message_set = []
        for line in lines:
            x1, y1, x2, y2 = line[0]
            y1 = y1 + 1450
            y2 = y2 + 1450
            rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
            xielv = (y2 - y1)/(x2 - x1 + 0.001)
            theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
            print(rho, theta, x1, y1, x2, y2, xielv)
            lines_message = (rho, theta, x1, y1, x2, y2, xielv)
            #cv.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
            lines_message_set.append(lines_message)
        #print len(lines_message_set)
        #Distance from a point to a line
        def point2line_distance(x1,y1,x2,y2,pointPx,pointPy):
            A = y1 - y2
            B = x2 - x1
            C = x1*y2 - y1*x2
            distance = abs(A*pointPx + B*pointPy + C)/((A*A + B*B)**0.5)
            return distance
        lines_cluster_set = []
        repeat_num_set = []
        for j in range(0,len(lines_message_set)):
            for i in range(0,len(lines_message_set)):
                if not(j in repeat_num_set):
                    lines_cluster_set.append(lines_message_set[j])
                    repeat_num_set.append(j)
                print point2line_distance(lines_message_set[i][2], lines_message_set[i][3], lines_message_set[i][4], lines_message_set[i][5], (lines_message_set[j][2]+lines_message_set[j][4])/2, (lines_message_set[j][3]+lines_message_set[j][5])/2)
                if i!=j and abs(lines_message_set[j][6] - lines_message_set[i][6]) < 0.1 and point2line_distance(lines_message_set[i][2], lines_message_set[i][3], lines_message_set[i][4], lines_message_set[i][5], (lines_message_set[j][2]+lines_message_set[j][4])/2, (lines_message_set[j][3]+lines_message_set[j][5])/2) <= 10:
                    repeat_num_set.append(i)
        print lines_cluster_set
        #Analyze the horizontal and vertical lines of the right angle; when one is missing, fall back to defaults by form type
        Point_heng = []
        Point_shu = []
        MiddlePoint_heng = (112,1450+(rows-1450)/2)
        MiddlePoint_shu = (75,1450+(rows-1450)/4)
        distance2point = rows-1450
        distance2point2 = rows-1450
        for j in range(0,len(lines_cluster_set)):
            if abs(lines_cluster_set[j][6]) < 1:
                if ((MiddlePoint_heng[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_heng[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5 < distance2point:
                    distance2point = ((MiddlePoint_heng[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_heng[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5
                    Point_heng = lines_cluster_set[j]
            else:
                if ((MiddlePoint_shu[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_shu[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5 < distance2point2:
                    distance2point2 = ((MiddlePoint_shu[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_shu[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5
                    Point_shu = lines_cluster_set[j]
        need_stronger = 0
        something_missing = 0
        #Fallback correction
        if Point_shu != []:
            if Point_heng == []:
                cross_x = 78
                cross_y = 1616
                something_missing = 1
                if len(docu_num) == 9:
                    something_missing = 0
                    cross_x = 93
                    cross_y = 1666
                if docu_num[3] == '4':
                    something_missing = 0
                    cross_x = 78
                    cross_y = 1655
                if docu_num[3] == '4' and docu_num[5] == '6':
                    cross_x = 78
                    cross_y = 1665
            else:
                cross_x = (Point_shu[3]-Point_shu[6]*Point_shu[2]-Point_heng[3]+Point_heng[6]*Point_heng[2])/(Point_heng[6]-Point_shu[6])
                cross_y = Point_heng[6]*cross_x + Point_heng[3] - Point_heng[6]*Point_heng[2]
                cv.line(image_copy, (Point_heng[2], Point_heng[3]), (Point_heng[4], Point_heng[5]), (0, 0, 255), 2)
                cv.line(image_copy, (Point_shu[2], Point_shu[3]), (Point_shu[4], Point_shu[5]), (0, 0, 255), 2)
        else:
            cross_x = Point_heng[2]
            cross_y = Point_heng[3]
            need_stronger = 1
        if Point_heng != [] and cross_x > Point_heng[2] and len(docu_num) == 9 and docu_num[6] == '0' and docu_num[7] == '9':
            cross_x = 50
            cross_y = 1586
        if Point_heng != [] and cross_x > Point_heng[2] and len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '8':
            cross_x = 50
            cross_y = 1616
        print 'roi_mean:',roi_mean
        if len(docu_num) == 9 and docu_num[7] == '8' and roi_mean < 230:
            cross_x = 78
            cross_y = 1631
        if cross_y - 1648 < 3:
            if len(docu_num) == 8 and docu_num[5] == '2' and docu_num[6] == '4':
                cross_x = 78
                cross_y = 1601
            if len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '4':
                cross_x = 78
                cross_y = 1621
        if table_style == 3 and cross_y < 1485:
            cross_x = 78
            cross_y = 1641
        print cross_x,cross_y#current position of the corner vertex; the standard position is 78,1581
        cv.circle(image_copy, (int(cross_x), int(cross_y)), 3,(255,0,0),3)
        cv.rectangle(image_copy,(0,1450),(180,rows),(255,0,0),3)
        cv.imwrite('line_detect_possible_demo.jpg', image_copy)
        rows, cols, channels = ShrinkTranslationRotation.shape
        print rows, cols
        affineShrinkTranslation = np.array([[1, 0, int(78 - cross_x)], [0, 1, int(1581 - cross_y)]], np.float32)
        #affineShrinkTranslation = np.array([[1, 0, int(78 - 78)], [0, 1, int(1581 - 1581)]], np.float32)
        shrinkTwoTimesTranslation = cv.warpAffine(ShrinkTranslationRotation, affineShrinkTranslation, (cols, rows))
        image_copy = cv.warpAffine(image_copy, affineShrinkTranslation, (cols, rows))
        ##### Detect forms in 201100XXX that have an extra vertical line inside the lower-left part of the table (Detect_not_shu_line_in_201100XXX) #####
        if table_style == 2:
            shrinkTwoTimesTranslation_copy_copy = shrinkTwoTimesTranslation.copy()
            roi = shrinkTwoTimesTranslation[1350:1548, 125:300]#180
            cv.rectangle(image_copy,(125,1350),(300,1548),(255,0,0),3)
            gray = cv.cvtColor(roi, cv.COLOR_RGB2GRAY)
            edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            lines = []
            lines = cv.HoughLinesP(edges, 1.0, np.pi / 180, 100, 0, minLineLength=60,maxLineGap=20)#50,10,20
            print lines
            go_through = 0
            try:
                if lines == None:
                    go_through = 0
            except:
                go_through = 1
            if go_through == 1:
                pi = 3.1415
                theta_total = 0
                theta_count = 0
                for line in lines:
                    x1, y1, x2, y2 = line[0]
                    x1 = x1 + 125
                    x2 = x2 + 125
                    y1 = y1 + 1350
                    y2 = y2 + 1350
                    rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
                    theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
                    print(rho, theta, x1, y1, x2, y2)
                    if theta > pi/3 or theta < -pi/3:
                        table_style = 1
                        if len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '1':
                            jiaozheng = 1
                        cv.line(shrinkTwoTimesTranslation_copy_copy, (x1, y1), (x2, y2), (255, 0, 0), 2)
            cv.imwrite('shrinkTwoTimesTranslation_copy_copy.jpg', shrinkTwoTimesTranslation_copy_copy)
        ##### Table extraction (Table_Out) #####
        #Open the binarized table with a long horizontal kernel and with a long vertical kernel to get an all-horizontal-lines image
        #and an all-vertical-lines image; overlaying them and extracting the intersections gives the four vertices of every rectangle
        shrinkTwoTimesTranslation_gray = cv.cvtColor(shrinkTwoTimesTranslation, cv.COLOR_RGB2GRAY)
        th2 = cv.adaptiveThreshold(~shrinkTwoTimesTranslation_gray,255,cv.ADAPTIVE_THRESH_MEAN_C,cv.THRESH_BINARY,15,-2)
        cv.imwrite('th2.jpg', th2)
        #Long horizontal kernel
        shrinkTwoTimesTranslation_copy = shrinkTwoTimesTranslation.copy()
        th2_copy = th2.copy()
        scale = 44;#20,50,45,40,44
        rows,cols = shrinkTwoTimesTranslation_gray.shape
        horizontalsize = cols / scale
        horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontalsize, 1))
        erosion = cv.erode(th2,horizontalStructure,iterations = 1)
        dilation = cv.dilate(erosion,horizontalStructure,iterations = 1)
        #Long vertical kernel
        scale = 39;#20,50,45,40,39
        horizontalsize2 = rows / scale
        horizontalStructure2 = cv.getStructuringElement(cv.MORPH_RECT, (1,horizontalsize2))
        erosion2 = cv.erode(th2_copy,horizontalStructure2,iterations = 1)
        dilation2 = cv.dilate(erosion2,horizontalStructure2,iterations = 1)
        #Overlay the horizontal-lines image and the vertical-lines image, and extract the intersections
        mask = dilation + dilation2
        cv.imwrite('mask.jpg', mask)
        joints = cv.bitwise_and(dilation, dilation2)
        cv.imwrite('joints.jpg', joints)
        #Filter the rectangles by size and draw them on the corrected table
        #cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE
        mask, contours, hierarchy = cv.findContours(mask,cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE)
        length = len(contours)
        print length
        small_rects = []
        big_rects = []
        for i in range(length):
            cnt = contours[i]
            area = cv.contourArea(cnt)
            if area < 10:
                continue
            approx = cv.approxPolyDP(cnt, 3, True)#3
            x, y, w, h = cv.boundingRect(approx)
            rect = (x, y, w, h)
            #cv.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), 3)
            roi = joints[y:y+h, x:x+w]
            roi, joints_contours, joints_hierarchy = cv.findContours(roi,cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE)
            #print len(joints_contours)
            #if h < 80 and h > 20 and w > 10 and len(joints_contours)<=4:
            if h < 80 and h > 20 and w > 10 and len(joints_contours)<=6:#important
                cv.rectangle(image_copy, (x, y), (x+w, y+h), (255-h*3, h*3, 0), 3)
                small_rects.append(rect)
        cv.imwrite('table_out.jpg', image_copy)
        ##### For each form type, get the coordinates of the relevant boxes from the approximate position of each field #####
        #After correction each field lies within a known range; a point at its approximate position selects the box that contains it
        request_info_set = []#stores the coordinates of the selected boxes
        #pure_for_message = shrinkTwoTimesTranslation.copy()
        print "table_style:", table_style
        if table_style == 1:
            request_info = (770, 150, 1150-770, 300-150)
            request_info_set.append(request_info)
            cv.rectangle(shrinkTwoTimesTranslation,(770,150),(1150,300),(255,0,0),1)#form number 0
            for j in range(0,len(small_rects)):
                (x, y, w, h) = small_rects[j]
                if x < 1060 and y < 1130 and x+w > 1060 and y+h > 1130 and something_missing == 1:#box extraction for special cases
                    x_rem = x
                    y_rem = y
                    w_rem = w
                    h_rem = h
            for j in range(0,len(small_rects)):
                (x, y, w, h) = small_rects[j]
                if x < 700 and y < 370 and x+w > 700 and y+h > 370 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#gender 1
                if x < 880 and y < 370 and x+w > 880 and y+h > 370 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#ethnicity 2
                if x < 685 and y < 412 and x+w > 685 and y+h > 412 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#weight 3
                if x < 880 and y < 412 and x+w > 880 and y+h > 412 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(0,0,255),1)#blood type 4
                if x < 890 and y < 452 and x+w > 890 and y+h > 452 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#native place 5
                if x < 486 and y < 1090 and x+w > 486 and y+h > 1090 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school name 6
                if x < 696 and y < 1090 and x+w > 696 and y+h > 1090 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school major 7
                if x < 808 and y < 1090 and x+w > 808 and y+h > 1090 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school degree 8
                if x < 931 and y < 1090 and x+w > 931 and y+h > 1090 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school start/end dates 9
                if x < 1060 and y < 1087 and x+w > 1060 and y+h > 1087 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school graduated or not 10
                    #Extract the square checkboxes beside the yes/no options
                    if jiaozheng == 1 or (len(docu_num) == 9 and (docu_num[6] == '2' or docu_num[6] == '3')) or (len(docu_num) == 8 and ((docu_num[5] == '3' and (docu_num[6] == '3' or (docu_num[6] == '3' and (docu_num[7] == '5' or docu_num[7] == '6' or docu_num[7] == '7' or docu_num[7] == '8' or docu_num[7] == '9')) or docu_num[6] == '4' or docu_num[6] == '5' or docu_num[6] == '7' or docu_num[6] == '8' or docu_num[6] == '9')) or docu_num[5] == '4')):
                        gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.18*w+x):int(0.28*w+x)]
                        gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.59*w+x):int(0.69*w+x)]
                        cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)
                        cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.18*w+x),int(y+h/3.5)),(int(0.28*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)
                        cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                if x < 486 and y < 1130 and x+w > 486 and y+h > 1130 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college name 11
                if x < 696 and y < 1130 and x+w > 696 and y+h > 1130 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college major 12
                if x < 808 and y < 1130 and x+w > 808 and y+h > 1130 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college degree 13
                if x < 931 and y < 1130 and x+w > 931 and y+h > 1130 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college start/end dates 14
                if x < 1060 and y < 1127 and x+w > 1060 and y+h > 1127 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college graduated or not 15
                    #Extract the square checkboxes beside the yes/no options
                    if jiaozheng == 1 or (len(docu_num) == 9 and (docu_num[6] == '2' or docu_num[6] == '3')) or (len(docu_num) == 8 and ((docu_num[5] == '3' and (docu_num[6] == '3' or (docu_num[6] == '3' and (docu_num[7] == '5' or docu_num[7] == '6' or docu_num[7] == '7' or docu_num[7] == '8' or docu_num[7] == '9')) or docu_num[6] == '4' or docu_num[6] == '5' or docu_num[6] == '7' or docu_num[6] == '8' or docu_num[6] == '9')) or docu_num[5] == '4')):
                        dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.18*w+x):int(0.28*w+x)]
                        dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.59*w+x):int(0.69*w+x)]
                        cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)
                        cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.18*w+x),int(y+h/3.5)),(int(0.28*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)
                        cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                if x < 486 and y < 1170 and x+w > 486 and y+h > 1170 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate school name 16
                if x < 696 and y < 1170 and x+w > 696 and y+h > 1170 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate major 17
                if x < 808 and y < 1170 and x+w > 808 and y+h > 1170 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate degree 18
                if x < 931 and y < 1170 and x+w > 931 and y+h > 1170 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate start/end dates 19
                if x < 1060 and y < 1167 and x+w > 1060 and y+h > 1167 or something_missing == 1:
                    if something_missing == 1:#box extraction for special cases
                        x = x_rem
                        y = y_rem + 40
                        w = w_rem
                        h = h_rem
                    else:
                        request_info = (x, y, w, h)
                        request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate graduated or not 20
                    #Extract the square checkboxes beside the yes/no options
                    if jiaozheng == 1 or (len(docu_num) == 9 and (docu_num[6] == '2' or docu_num[6] == '3')) or (len(docu_num) == 8 and ((docu_num[5] == '3' and (docu_num[6] == '3' or (docu_num[6] == '3' and (docu_num[7] == '5' or docu_num[7] == '6' or docu_num[7] == '7' or docu_num[7] == '8' or docu_num[7] == '9')) or docu_num[6] == '4' or docu_num[6] == '5' or docu_num[6] == '7' or docu_num[6] == '8' or docu_num[6] == '9')) or docu_num[5] == '4')):
                        benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.18*w+x):int(0.28*w+x)]
                        benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.59*w+x):int(0.69*w+x)]
                        cv.imwrite('benke_roi_left.jpg', benke_roi_left)
                        cv.imwrite('benke_roi_right.jpg', benke_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.18*w+x),int(y+h/3.5)),(int(0.28*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('benke_roi_left.jpg', benke_roi_left)
                        cv.imwrite('benke_roi_right.jpg', benke_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                if x < 486 and y < 1210 and x+w > 486 and y+h > 1210 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#graduate school name 21
                if x < 696 and y < 1210 and x+w > 696 and y+h > 1210 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#graduate major 22
                if x < 808 and y < 1210 and x+w > 808 and y+h > 1210 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#graduate degree 23
                if x < 931 and y < 1210 and x+w > 931 and y+h > 1210 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#graduate start/end dates 24
                if x < 1060 and y < 1207 and x+w > 1060 and y+h > 1207 or something_missing == 1:
                    if something_missing == 1:#box extraction for special cases
                        x = x_rem
                        y = y_rem + 80
                        w = w_rem
                        h = h_rem
                    else:
                        request_info = (x, y, w, h)
                        request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#graduate graduated or not 25
                    #Extract the square checkboxes beside the yes/no options
                    if jiaozheng == 1 or (len(docu_num) == 9 and (docu_num[6] == '2' or docu_num[6] == '3')) or (len(docu_num) == 8 and ((docu_num[5] == '3' and (docu_num[6] == '3' or (docu_num[6] == '3' and (docu_num[7] == '5' or docu_num[7] == '6' or docu_num[7] == '7' or docu_num[7] == '8' or docu_num[7] == '9')) or docu_num[6] == '4' or docu_num[6] == '5' or docu_num[6] == '7' or docu_num[6] == '8' or docu_num[6] == '9')) or docu_num[5] == '4')):
                        yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.18*w+x):int(0.28*w+x)]
                        yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.59*w+x):int(0.69*w+x)]
                        cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)
                        cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.18*w+x),int(y+h/3.5)),(int(0.28*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)
                        cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
            #Box extraction for special cases
            if need_stronger == 1:
                cv.rectangle(shrinkTwoTimesTranslation,(673,385),(740,420),(0,0,255),1)#gender
                cv.rectangle(shrinkTwoTimesTranslation,(845,385),(938,420),(0,0,255),1)#ethnicity
                cv.rectangle(shrinkTwoTimesTranslation,(673,425),(720,460),(0,0,255),1)#weight
                cv.rectangle(shrinkTwoTimesTranslation,(845,425),(938,465),(0,0,255),1)#blood type
                cv.rectangle(shrinkTwoTimesTranslation,(845,470),(938,510),(0,0,255),1)#native place
                cv.rectangle(shrinkTwoTimesTranslation,(973,1095),(1153,1135),(0,0,255),1)#high school graduated or not
                x = 973
                y = 1095
                w = 1153 - 973
                h = 1135 - 1095
                gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)
                cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(973,1135),(1153,1175),(0,0,255),1)
                x = 973
                y = 1135
                w = 1153 - 973
                h = 1175 - 1135
                dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)
                cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(973,1175),(1153,1215),(0,0,255),1)
                x = 973
                y = 1175
                w = 1153 - 973
                h = 1215 - 1175
                benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                cv.imwrite('benke_roi_left.jpg', benke_roi_left)
                cv.imwrite('benke_roi_right.jpg', benke_roi_right)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(973,1220),(1153,1260),(0,0,255),1)
                x = 973
                y = 1220
                w = 1153 - 973
                h = 1260 - 1220
                yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)
                cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
            #Test the approximate position of each field
            #cv.circle(shrinkTwoTimesTranslation, (700, 335), 3,(255,0,0),3)
            #cv.circle(shrinkTwoTimesTranslation, (75, 1581), 3,(255,0,0),3)
            #cv.rectangle(shrinkTwoTimesTranslation,(663,360),(740,400),(255,0,0),1)#gender
            #cv.rectangle(shrinkTwoTimesTranslation,(835,360),(928,400),(255,0,0),1)#ethnicity
            #cv.rectangle(shrinkTwoTimesTranslation,(663,402),(710,442),(255,0,0),1)#weight
            #cv.rectangle(shrinkTwoTimesTranslation,(835,402),(928,442),(255,0,0),1)#blood type
            #cv.rectangle(shrinkTwoTimesTranslation,(835,443),(948,482),(255,0,0),1)#native place
            #cv.rectangle(shrinkTwoTimesTranslation,(345,1075),(627,1115),(255,0,0),1)#high school name
            #cv.rectangle(shrinkTwoTimesTranslation,(628,1075),(765,1115),(255,0,0),1)#high school major
            #cv.rectangle(shrinkTwoTimesTranslation,(766,1075),(850,1115),(255,0,0),1)#high school degree
            #cv.rectangle(shrinkTwoTimesTranslation,(851,1075),(1011,1115),(255,0,0),1)#high school start/end dates
            #cv.rectangle(shrinkTwoTimesTranslation,(971,1070),(1150,1110),(255,0,0),1)#high school graduated or not
        if table_style == 2:
            request_info = (770, 150, 1150-770, 300-150)
            request_info_set.append(request_info)
            cv.rectangle(shrinkTwoTimesTranslation,(770,150),(1150,300),(255,0,0),1)#form number 0
            for j in range(0,len(small_rects)):
                (x, y, w, h) = small_rects[j]
                if x < 700 and y < 335 and x+w > 700 and y+h > 335 :#350-20
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#gender 1
                if x < 885 and y < 335 and x+w > 885 and y+h > 335 :#x+5
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#ethnicity 2
                if x < 685 and y < 377 and x+w > 685 and y+h > 377 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#weight 3
                if x < 885 and y < 377 and x+w > 885 and y+h > 377 :#x+5
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(0,0,255),1)#blood type 4
                if x < 890 and y < 417 and x+w > 890 and y+h > 417 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#native place 5
                if x < 486 and y < 1045 and x+w > 486 and y+h > 1045 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school name 6
                if x < 696 and y < 1045 and x+w > 696 and y+h > 1045 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school major 7
                if x < 808 and y < 1045 and x+w > 808 and y+h > 1045 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school degree 8
                if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                    y_xiuzheng = 1090
                else:
                    y_xiuzheng = 1055
                if x < 931 and y < y_xiuzheng and x+w > 931 and y+h > y_xiuzheng :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school start/end dates 9
                if x < 1060 and y < y_xiuzheng and x+w > 1060 and y+h > y_xiuzheng :#y+10
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#high school graduated or not 10
                    if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')) or (docu_num[6] == '3' and docu_num[8] == '9'):
                        gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.22*w+x):int(0.32*w+x)]
                        gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.55*w+x):int(0.65*w+x)]
                        cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)
                        cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.22*w+x),int(y+h/3.5)),(int(0.32*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.55*w+x),int(y+h/3.5)),(int(0.65*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)
                        cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                if x < 486 and y < 1085 and x+w > 486 and y+h > 1085 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college name 11
                if x < 696 and y < 1085 and x+w > 696 and y+h > 1085 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college major 12
                if x < 808 and y < 1085 and x+w > 808 and y+h > 1085 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college degree 13
                if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                    y_xiuzheng = 1140
                else:
                    y_xiuzheng = 1095
                if x < 931 and y < y_xiuzheng and x+w > 931 and y+h > y_xiuzheng :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college start/end dates 14
                if x < 1060 and y < y_xiuzheng and x+w > 1060 and y+h > y_xiuzheng :#y+10
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#junior college graduated or not 15
                    if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')) or (docu_num[6] == '3' and docu_num[8] == '9'):
                        dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.22*w+x):int(0.32*w+x)]
                        dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.55*w+x):int(0.65*w+x)]
                        cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)
                        cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.22*w+x),int(y+h/3.5)),(int(0.32*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.55*w+x),int(y+h/3.5)),(int(0.65*w+x),int(y+2.2*h/3)),(255,0,0),1)
                    else:
                        dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]
                        dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]
                        cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)
                        cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)
                        cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)
                if x < 486 and y < 1125 and x+w > 486 and y+h > 1125 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate school name 16
                if x < 696 and y < 1125 and x+w > 696 and y+h > 1125 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate major 17
                if x < 808 and y < 1125 and x+w > 808 and y+h > 1125 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate degree 18
                if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                    y_xiuzheng = 1190
                else:
                    y_xiuzheng = 1135
                if x < 931 and y < y_xiuzheng and x+w > 931 and y+h > y_xiuzheng :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate start/end dates 19
                if x < 1060 and y < y_xiuzheng and x+w > 1060 and y+h > y_xiuzheng :#y+10
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#undergraduate graduated or not 20
                    if len(docu_num) 
== 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')) or (docu_num[6] == '3' and docu_num[8] == '9'):benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.22*w+x):int(0.32*w+x)]benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.55*w+x):int(0.65*w+x)]cv.imwrite('benke_roi_left.jpg', benke_roi_left)cv.imwrite('benke_roi_right.jpg', benke_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.22*w+x),int(y+h/3.5)),(int(0.32*w+x),int(y+2.2*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.55*w+x),int(y+h/3.5)),(int(0.65*w+x),int(y+2.2*h/3)),(255,0,0),1)else:benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]cv.imwrite('benke_roi_left.jpg', benke_roi_left)cv.imwrite('benke_roi_right.jpg', benke_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)if x < 486 and y < 1165 and x+w > 486 and y+h > 1165 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生学校名称21if x < 696 and y < 1165 and x+w > 696 and y+h > 1165 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生专业22if x < 808 and y < 1165 and x+w > 808 and y+h > 1165 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生学位23if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):y_xiuzheng = 1230else:y_xiuzheng = 1175if x < 931 and y < y_xiuzheng and x+w > 931 and y+h > y_xiuzheng 
:request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生起止时间24if x < 1060 and y < y_xiuzheng and x+w > 1060 and y+h > y_xiuzheng :#y+10request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生是否毕业25if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')) or (docu_num[6] == '3' and docu_num[8] == '9'):yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.22*w+x):int(0.32*w+x)]yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.55*w+x):int(0.65*w+x)]cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.22*w+x),int(y+h/3.5)),(int(0.32*w+x),int(y+2.2*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.55*w+x),int(y+h/3.5)),(int(0.65*w+x),int(y+2.2*h/3)),(255,0,0),1)else:yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.28*w+x):int(0.38*w+x)]yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.2*h/3), int(0.51*w+x):int(0.61*w+x)]cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.28*w+x),int(y+h/3.5)),(int(0.38*w+x),int(y+2.2*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.51*w+x),int(y+h/3.5)),(int(0.61*w+x),int(y+2.2*h/3)),(255,0,0),1)if table_style == 3:request_info = (970, 50, 1200-970, 200-50)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(970,50),(1200,200),(255,0,0),1)#登记表编号0for j in range(0,len(small_rects)):(x, y, w, h) = small_rects[j]if x < 724 and y < 316 and x+w > 724 and y+h > 316 :request_info = (x, y, w, 
h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#性别1if x < 960 and y < 316 and x+w > 960 and y+h > 316 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#民族2if x < 724 and y < 363 and x+w > 724 and y+h > 363 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#体重3if x < 960 and y < 363 and x+w > 960 and y+h > 363 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#血型4if x < 960 and y < 410 and x+w > 960 and y+h > 410 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#籍贯5if x < 981 and y < 1089 and x+w > 981 and y+h > 1089 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#高中起止时间9if x < 1105 and y < 1089 and x+w > 1105 and y+h > 1089 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#高中是否毕业10if docu_num[5] == '3' and docu_num[6] == '8':gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.19*w+x):int(0.29*w+x)]gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.59*w+x):int(0.69*w+x)]cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.19*w+x),int(y+h/3.5)),(int(0.29*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.1*h/3)),(255,0,0),1)else:gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]gaozhong_roi_right = 
shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 1137 and x+w > 981 and y+h > 1137 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#大专起止时间14if x < 1105 and y < 1137 and x+w > 1105 and y+h > 1137 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#大专是否毕业15if docu_num[5] == '3' and docu_num[6] == '8':dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.19*w+x):int(0.29*w+x)]dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.59*w+x):int(0.69*w+x)]cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.19*w+x),int(y+h/3.5)),(int(0.29*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.1*h/3)),(255,0,0),1)else:dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 1185 and x+w > 981 and y+h > 1185 :request_info = (x, y, w, 
h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#本科起止时间19if x < 1105 and y < 1185 and x+w > 1105 and y+h > 1185 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#本科是否毕业20if docu_num[5] == '3' and docu_num[6] == '8':benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.19*w+x):int(0.29*w+x)]benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.59*w+x):int(0.69*w+x)]cv.imwrite('benke_roi_left.jpg', benke_roi_left)cv.imwrite('benke_roi_right.jpg', benke_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.19*w+x),int(y+h/3.5)),(int(0.29*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.1*h/3)),(255,0,0),1)else:benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('benke_roi_left.jpg', benke_roi_left)cv.imwrite('benke_roi_right.jpg', benke_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 1233 and x+w > 981 and y+h > 1233 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生起止时间24if x < 1105 and y < 1233 and x+w > 1105 and y+h > 1233 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生是否毕业25if docu_num[5] == '3' and docu_num[6] == '8':yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.19*w+x):int(0.29*w+x)]yanjiu_roi_right = 
shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.59*w+x):int(0.69*w+x)]cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.19*w+x),int(y+h/3.5)),(int(0.29*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.59*w+x),int(y+h/3.5)),(int(0.69*w+x),int(y+2.1*h/3)),(255,0,0),1)else:yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)cv.imwrite('yanjiu_roi_right.jpg', yanjiu_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if table_style == 4:request_info = (970, 50, 1200-970, 200-50)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(970,50),(1200,200),(255,0,0),1)#登记表编号0for j in range(0,len(small_rects)):(x, y, w, h) = small_rects[j]if x < 724 and y < 292 and x+w > 724 and y+h > 292 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#性别1if x < 960 and y < 292 and x+w > 960 and y+h > 292 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#民族2if x < 724 and y < 339 and x+w > 724 and y+h > 339 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#体重3if x < 960 and y < 339 and x+w > 960 and y+h > 339 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#血型4if x < 960 and y < 386 and x+w > 960 and y+h > 386 :request_info = (x, 
y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#籍贯5if x < 981 and y < 1094 and x+w > 981 and y+h > 1094 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#高中起止时间9if x < 1105 and y < 1094 and x+w > 1105 and y+h > 1094 :#y-15request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#高中是否毕业10gaozhong_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]gaozhong_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('gaozhong_roi_left.jpg', gaozhong_roi_left)cv.imwrite('gaozhong_roi_right.jpg', gaozhong_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 1142 and x+w > 981 and y+h > 1142 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#大专起止时间14if x < 1105 and y < 1142 and x+w > 1105 and y+h > 1142 :#y-15request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#大专是否毕业15dazhuan_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]dazhuan_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('dazhuan_roi_left.jpg', dazhuan_roi_left)cv.imwrite('dazhuan_roi_right.jpg', dazhuan_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 
1190 and x+w > 981 and y+h > 1190 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#本科起止时间19if x < 1105 and y < 1190 and x+w > 1105 and y+h > 1190 :#y-15request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#本科是否毕业20benke_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]benke_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('benke_roi_left.jpg', benke_roi_left)cv.imwrite('benke_roi_right.jpg', benke_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)if x < 981 and y < 1238 and x+w > 981 and y+h > 1238 :request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生起止时间24if x < 1105 and y < 1238 and x+w > 1105 and y+h > 1238 :#y-15request_info = (x, y, w, h)request_info_set.append(request_info)cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)#研究生是否毕业25yanjiu_roi_left = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.23*w+x):int(0.33*w+x)]yanjiu_roi_right = shrinkTwoTimesTranslation[int(y+h/3.5):int(y+2.1*h/3), int(0.56*w+x):int(0.66*w+x)]cv.imwrite('yanjiu_roi_left.jpg', yanjiu_roi_left)cv.imwrite('yanjiu_roi_right.jpg', 
yanjiu_roi_right)cv.rectangle(shrinkTwoTimesTranslation,(int(0.23*w+x),int(y+h/3.5)),(int(0.33*w+x),int(y+2.1*h/3)),(255,0,0),1)cv.rectangle(shrinkTwoTimesTranslation,(int(0.56*w+x),int(y+h/3.5)),(int(0.66*w+x),int(y+2.1*h/3)),(255,0,0),1)#cv.rectangle(shrinkTwoTimesTranslation,(663,269),(786,315),(255,0,0),1)#性别#cv.rectangle(shrinkTwoTimesTranslation,(900,269),(1020,315),(255,0,0),1)#民族#cv.rectangle(shrinkTwoTimesTranslation,(663,316),(786,362),(255,0,0),1)#体重#cv.rectangle(shrinkTwoTimesTranslation,(900,316),(1020,362),(255,0,0),1)#血型#cv.rectangle(shrinkTwoTimesTranslation,(900,363),(1020,409),(255,0,0),1)#籍贯#cv.rectangle(shrinkTwoTimesTranslation,(1016,1065),(1195,1113),(255,0,0),1)#高中是否毕业#登记表编号,性别,民族,体重,血型,籍贯,高中学校名称,高中专业,高中学位,高中起止时间,高中是否毕业,大专学校名称,大专专业,大专学位,大专起止时间,大专是否毕业,本科学校名称,本科专业,本科学位,本科起止时间,本科是否毕业,研究生学校名称,研究生专业,研究生学位,研究生起止时间,研究生是否毕业,学校名称(其它),专业(其它),学位(其它),起止时间(其它),是否毕业(其它)cv.imwrite('image_correct.jpg', shrinkTwoTimesTranslation)#关键信息框重点标注的矫正图像########### 去除噪点Filter ###########th2 = cv.medianBlur(th2,5)#5horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (2, 2))#2,2th2 = cv.erode(th2,horizontalStructure,iterations = 1)#horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, 3))#2,2#th2 = cv.dilate(th2,horizontalStructure,iterations = 1)########### 登记表编号0 ############(x, y, w, h) = request_info_set[0]#num_num = 8#num_result = num_out.num_o(th2, x, y, w, h, num_num)#print "登记表编号: ",num_result########### 识别 性别1 ###########if table_style == 1:x_com = 700#信息所在的大致位置y_com = 370if table_style == 2:x_com = 700y_com = 335if table_style == 3:x_com = 724y_com = 316if table_style == 4:x_com = 724y_com = 292if need_stronger == 1:#特殊情况处理x = 673y = 385w = 740 - 673h = 420 - 385get_all = 1cha_num = 3is_ABO = 0xingbie_result = ''return_change, xingbie_result_set = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)print xingbie_result_setif xingbie_result_set[0] == 2206 or xingbie_result_set[1] == 2206 or 
xingbie_result_set[2] == 2206:xingbie_result = '男'if xingbie_result_set[0] == 775 or xingbie_result_set[1] == 775 or xingbie_result_set[2] == 775:xingbie_result = '女'if xingbie_result == '':xingbie_result = '男'print "性别: ",xingbie_resultelse:for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < x_com and y < y_com and x+w > x_com and y+h > y_com :#提取信息所在的矩形框get_all = 1#对手写汉字模型的三个识别结果都进行分析cha_num = 3#字段预计最长长度is_ABO = 0#不是血型数据xingbie_result = ''#初始化#调用chinese_out.py中的chinese_o()函数进行手写汉字识别return_change, xingbie_result_set = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)print xingbie_result_set#如果三个识别结果中有一个'男',则确定为男性if xingbie_result_set[0] == 2206 or xingbie_result_set[1] == 2206 or xingbie_result_set[2] == 2206:xingbie_result = '男'#如果三个识别结果中有一个'女',则确定为女性if xingbie_result_set[0] == 775 or xingbie_result_set[1] == 775 or xingbie_result_set[2] == 775:xingbie_result = '女'#特殊情况处理if xingbie_result == '':xingbie_result = '男'print "性别: ",xingbie_result########### 识别 民族2 ###########if table_style == 1:x_com = 880y_com = 370if table_style == 2:x_com = 885y_com = 335if table_style == 3:x_com = 960y_com = 316if table_style == 4:x_com = 960y_com = 292minzu_result = '汉'if need_stronger == 1:x = 845y = 385w = 938 - 845h = 420 - 385get_all = 2cha_num = 1is_ABO = 0return_change, minzu_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)if return_change != 0:if minzu_result[0] == 1809 or minzu_result[1] == 1809 or minzu_result[2] == 1809:minzu_result = '汉'else:minzu_result = return_changeprint "民族: ",minzu_resultelse:for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < x_com and y < y_com and x+w > x_com and y+h > y_com :get_all = 2#对手写汉字模型的三个识别结果都进行分析,且标志当前为民族数据提取cha_num = 1#字段预计最长长度is_ABO = 0#不是血型数据#调用chinese_out.py中的chinese_o()函数进行手写汉字识别return_change, minzu_result = chinese_out.chinese_o(th2_copy, 
shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)if return_change != 0:#特殊情况处理#如果三个识别结果中有一个'汉',则确定为汉,否则输出单字if minzu_result[0] == 1809 or minzu_result[1] == 1809 or minzu_result[2] == 1809:minzu_result = '汉'else:minzu_result = return_changeprint "民族: ",minzu_resultprint "民族: ",minzu_result########### 识别 体重3 ###########if table_style == 1:x_com = 685y_com = 412if table_style == 2:x_com = 685y_com = 377if table_style == 3:x_com = 724y_com = 363if table_style == 4:x_com = 724y_com = 339if need_stronger == 1:x = 673y = 425w = 720 - 673h = 460 - 425num_num = 3weight_result = num_out.num_o(th2_copy, x, y, int(w*0.6), h, num_num)if weight_result == '':weight_result = '无'print "体重: ",weight_resultelse:for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < x_com and y < y_com and x+w > x_com and y+h > y_com :num_num = 3#字段预计最长长度#调用num_out.py中的num_o()函数进行手写数字识别,体重框只取前60%部分(避免kg干扰)weight_result = num_out.num_o(th2_copy, x, y, int(w*0.6), h, num_num)#0.6#特殊情况处理if weight_result == '':weight_result = '无'if len(weight_result) > 3:weight_result = weight_result[0:2]#if weight_result[0] != '1':#weight_result = weight_result[0:1]print "体重: ",weight_result########### 识别 血型4 ###########if table_style == 1:x_com = 880y_com = 412if table_style == 2:x_com = 885y_com = 377if table_style == 3:x_com = 960y_com = 363if table_style == 4:x_com = 960y_com = 339xuexing_result = 'O'if need_stronger == 1:x = 845y = 425w = 938 - 845h = 465 - 425get_all = 0cha_num = 2is_ABO = 1return_change, xuexing_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)print xuexing_result, len(xuexing_result)if xuexing_result == '':xuexing_result = '无'if len(xuexing_result) >= 3 and (xuexing_result[2] == 'A' or xuexing_result[2] == 'B' or xuexing_result[2] == 'O'):if return_change != 2:xuexing_result = xuexing_result[0]if len(xuexing_result) == 2 and return_change != 2:xuexing_result = 'AB'print "血型: 
",xuexing_resultelse:for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < x_com and y < y_com and x+w > x_com and y+h > y_com :get_all = 0#只分析模型的最佳识别结果cha_num = 2#字段预计最长长度is_ABO = 1#是血型数据#调用chinese_out.py中的chinese_o()函数进行手写血型识别return_change, xuexing_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)print xuexing_result, len(xuexing_result)#特殊情况处理if xuexing_result == '':xuexing_result = '无'if len(xuexing_result) >= 3 and (xuexing_result[2] == 'A' or xuexing_result[2] == 'B' or xuexing_result[2] == 'O'):if return_change != 2:xuexing_result = xuexing_result[0]if len(xuexing_result) == 2 and return_change != 2:xuexing_result = 'AB'print "血型: ",xuexing_resultprint "血型: ",xuexing_result########### 识别 籍贯5 ###########jiguan_result = '无'if table_style == 1:x_com = 890y_com = 452if table_style == 2:x_com = 890y_com = 417if table_style == 3:x_com = 960y_com = 410if table_style == 4:x_com = 960y_com = 386jiguan_result = ''try:if need_stronger == 1:x = 845y = 465w = 938 - 845h = 465 - 425pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])print "pic_mean[0]:", pic_mean[0]if pic_mean[0] > 30:get_all = 0cha_num = 3is_ABO = 0return_change, jiguan_result = chinese_out_jiguan.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)else:jiguan_result = '无'print "籍贯: ",jiguan_resultelse:for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < x_com and y < y_com and x+w > x_com and y+h > y_com :pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])print "pic_mean[0]:", pic_mean[0]if pic_mean[0] > 30:#排除空框get_all = 0cha_num = 6#字段预计最长长度is_ABO = 0#调用chinese_out_jiguan.py中的chinese_o()函数进行手写汉字识别return_change, jiguan_result = chinese_out_jiguan.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)else:jiguan_result = '无'print "籍贯: ",jiguan_resultexcept:jiguan_result = '无'print "籍贯: ",jiguan_result########### 识别 高中学校名称6 
###########gaozhong_name_result = '无'#功能已实现,因效果一般,暂时注释'''for j in range(0,len(request_info_set)):(x, y, w, h) = request_info_set[j]if x < 486 and y < 1090 and x+w > 486 and y+h > 1090 :pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])print "pic_mean[0]:", pic_mean[0]if pic_mean[0] > 30:get_all = 0cha_num = 12is_ABO = 0gaozhong_name_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)else:gaozhong_name_result = '无'print "高中学校名称: ",gaozhong_name_result
        '''
        ########### Recognize: high school major (field 7) ###########
        gaozhong_master_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 696 and y < 1090 and x+w > 696 and y+h > 1090:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 6
                    is_ABO = 0
                    gaozhong_master_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    gaozhong_master_result = '无'
                print "高中专业: ",gaozhong_master_result
        '''
        ########### Recognize: high school degree (field 8) ###########
        gaozhong_degree_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 808 and y < 1090 and x+w > 808 and y+h > 1090:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 3
                    is_ABO = 0
                    gaozhong_degree_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    gaozhong_degree_result = '无'
                print "高中学位: ",gaozhong_degree_result
        '''
        ########### Recognize: high school start/end dates (field 9) ###########
        gaozhong_time_result = '无'
        if table_style == 1:
            x_com = 931
            y_com = 1090
        if table_style == 2:
            # special-case handling
            if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                y_com = 1090
            else:
                y_com = 1055
            x_com = 931
        if table_style == 3:
            x_com = 981
            y_com = 1089
        if table_style == 4:
            x_com = 981
            y_com = 1094
        try:
            if need_stronger == 1:
                gaozhong_time_result = '无'
            else:
                for j in range(0,len(request_info_set)):
                    (x, y, w, h) = request_info_set[j]
                    if x < x_com and y < y_com and x+w > x_com and y+h > y_com:
                        pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                        print "pic_mean[0]:", pic_mean[0]
                        if pic_mean[0] > 30:  # skip empty boxes
                            num_num = 10  # expected maximum field length
                            # call num_o() in time_out.py to recognise handwritten digits
                            gaozhong_time_result = time_out.num_o(th2_copy, x, y, w, h, num_num)
                        else:
                            gaozhong_time_result = '无'
                        print "高中起止时间: ",gaozhong_time_result
        except:
            gaozhong_time_result = '无'
            print "高中起止时间: ",gaozhong_time_result
        ########### Recognize: graduated from high school? (field 10) ###########
        left_mean = cv.mean(gaozhong_roi_left)  # mean of the extracted square region
        right_mean = cv.mean(gaozhong_roi_right)
        print left_mean[0],right_mean[0]
        mean_diff = 14  # a gap larger than this decides yes/no; otherwise both boxes are treated as empty
        if left_mean[0] < right_mean[0] - mean_diff:
            gaozhong_biye_result = '是'
        else:
            if right_mean[0] < left_mean[0] - mean_diff:
                gaozhong_biye_result = '否'
            else:
                gaozhong_biye_result = '无'
        print "高中是否毕业: ",gaozhong_biye_result
        ########### Recognize: junior college name (field 11) ###########
        dazhuan_name_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 486 and y < 1130 and x+w > 486 and y+h > 1130:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 12
                    is_ABO = 0
                    dazhuan_name_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    dazhuan_name_result = '无'
                print "大专学校名称: ",dazhuan_name_result
        '''
        ########### Recognize: junior college major (field 12) ###########
        dazhuan_master_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 696 and y < 1130 and x+w > 696 and y+h > 1130:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 6
                    is_ABO = 0
                    dazhuan_master_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    dazhuan_master_result = '无'
                print "大专专业: ",dazhuan_master_result
        '''
        ########### Recognize: junior college degree (field 13) ###########
        dazhuan_degree_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 808 and y < 1130 and x+w > 808 and y+h > 1130:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 3
                    is_ABO = 0
                    dazhuan_degree_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    dazhuan_degree_result = '无'
                print "大专学位: ",dazhuan_degree_result
        '''
        ########### Recognize: junior college start/end dates (field 14) ###########
        dazhuan_time_result = '无'
        if table_style == 1:
            x_com = 931
            y_com = 1130
        if table_style == 2:
            if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                y_com = 1140
            else:
                y_com = 1095
            x_com = 931
        if table_style == 3:
            x_com = 981
            y_com = 1137
        if table_style == 4:
            x_com = 981
            y_com = 1142
        try:
            if need_stronger == 1:
                dazhuan_time_result = '无'
            else:
                for j in range(0,len(request_info_set)):
                    (x, y, w, h) = request_info_set[j]
                    if x < x_com and y < y_com and x+w > x_com and y+h > y_com:
                        pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                        print "pic_mean[0]:", pic_mean[0]
                        if pic_mean[0] > 30:
                            num_num = 10
                            # call num_o() in time_out.py to recognise handwritten digits
                            dazhuan_time_result = time_out.num_o(th2_copy, x, y, w, h, num_num)
                        else:
                            dazhuan_time_result = '无'
                        print "大专起止时间: ",dazhuan_time_result
        except:
            dazhuan_time_result = '无'
            print "大专起止时间: ",dazhuan_time_result
        ########### Recognize: graduated from junior college? (field 15) ###########
        left_mean = cv.mean(dazhuan_roi_left)
        right_mean = cv.mean(dazhuan_roi_right)
        print left_mean[0],right_mean[0]
        mean_diff = 14  # a gap larger than this decides yes/no; otherwise both boxes are treated as empty
        if left_mean[0] < right_mean[0] - mean_diff:
            dazhuan_biye_result = '是'
        else:
            if right_mean[0] < left_mean[0] - mean_diff:
                dazhuan_biye_result = '否'
            else:
                dazhuan_biye_result = '无'
        print "大专是否毕业: ",dazhuan_biye_result
        ########### Recognize: undergraduate school name (field 16) ###########
        benke_name_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 486 and y < 1170 and x+w > 486 and y+h > 1170:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 12
                    is_ABO = 0
                    benke_name_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    benke_name_result = '无'
                print "本科学校名称: ",benke_name_result
        '''
        ########### Recognize: undergraduate major (field 17) ###########
        benke_master_result = '无'
        # Implemented, but the results were mediocre, so it is commented out for now
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 696 and y < 1170 and x+w > 696 and y+h > 1170:
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 6
                    is_ABO = 0
                    benke_master_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    benke_master_result = '无'
                print "本科专业: ",benke_master_result
        '''
        ########### Recognize: undergraduate degree (field 18) ###########
        benke_degree_result = '无'
        # Implemented, but commented out for now because the accuracy is mediocre
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 808 and y < 1170 and x+w > 808 and y+h > 1170 :
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 3
                    is_ABO = 0
                    benke_degree_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    benke_degree_result = '无'
                print "本科学位: ",benke_degree_result
        '''
        ########### Recognize: undergraduate start/end dates (field 19) ###########
        benke_time_result = '无'
        if table_style == 1:
            x_com = 931
            y_com = 1170
        if table_style == 2:
            if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                y_com = 1190
            else:
                y_com = 1135
            x_com = 931
        if table_style == 3:
            x_com = 981
            y_com = 1185
        if table_style == 4:
            x_com = 981
            y_com = 1190
        try:
            if need_stronger == 1:
                benke_time_result = '无'
            else:
                for j in range(0,len(request_info_set)):
                    (x, y, w, h) = request_info_set[j]
                    if x < x_com and y < y_com and x+w > x_com and y+h > y_com :
                        pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                        print "pic_mean[0]:", pic_mean[0]
                        if pic_mean[0] > 30:
                            num_num = 10
                            # Call num_o() in time_out.py to recognize the handwritten digits
                            benke_time_result = time_out.num_o(th2_copy, x, y, w, h, num_num)
                        else:
                            benke_time_result = '无'
                        print "本科起止时间: ",benke_time_result
        except:
            benke_time_result = '无'
            print "本科起止时间: ",benke_time_result

        ########### Recognize: undergraduate graduated or not (field 20) ###########
        left_mean = cv.mean(benke_roi_left)
        right_mean = cv.mean(benke_roi_right)
        print left_mean[0],right_mean[0]
        mean_diff = 14  # If the two means differ by more than this, decide yes ('是') or no ('否'); otherwise treat both boxes as empty
        if left_mean[0] < right_mean[0] - mean_diff:
            benke_biye_result = '是'
        else:
            if right_mean[0] < left_mean[0] - mean_diff:
                benke_biye_result = '否'
            else:
                benke_biye_result = '无'
        print "本科是否毕业: ",benke_biye_result

        ########### Recognize: graduate school name (field 21) ###########
        yanjiu_name_result = '无'
        # Implemented, but commented out for now because the accuracy is mediocre
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 486 and y < 1210 and x+w > 486 and y+h > 1210 :
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 12
                    is_ABO = 0
                    yanjiu_name_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    yanjiu_name_result = '无'
                print "研究生学校名称: ",yanjiu_name_result
        '''
        ########### Recognize: graduate major (field 22) ###########
        yanjiu_master_result = '无'
        # Implemented, but commented out for now because the accuracy is mediocre
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 696 and y < 1210 and x+w > 696 and y+h > 1210 :
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 6
                    is_ABO = 0
                    yanjiu_master_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    yanjiu_master_result = '无'
                print "研究生专业: ",yanjiu_master_result
        '''
        ########### Recognize: graduate degree (field 23) ###########
        yanjiu_degree_result = '无'
        # Implemented, but commented out for now because the accuracy is mediocre
        '''
        for j in range(0,len(request_info_set)):
            (x, y, w, h) = request_info_set[j]
            if x < 808 and y < 1210 and x+w > 808 and y+h > 1210 :
                pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                print "pic_mean[0]:", pic_mean[0]
                if pic_mean[0] > 30:
                    get_all = 0
                    cha_num = 3
                    is_ABO = 0
                    yanjiu_degree_result = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                else:
                    yanjiu_degree_result = '无'
                print "研究生学位: ",yanjiu_degree_result
        '''
        ########### Recognize: graduate start/end dates (field 24) ###########
        yanjiu_time_result = '无'
        if table_style == 1:
            x_com = 931
            y_com = 1210
        if table_style == 2:
            if len(docu_num) == 9 and (docu_num[6] == '2' and (docu_num[7] == '6' or docu_num[7] == '8' or docu_num[7] == '9')):
                y_com = 1230
            else:
                y_com = 1175
            x_com = 931
        if table_style == 3:
            x_com = 981
            y_com = 1233
        if table_style == 4:
            x_com = 981
            y_com = 1238
        try:
            if need_stronger == 1:
                yanjiu_time_result = '无'
            else:
                for j in range(0,len(request_info_set)):
                    (x, y, w, h) = request_info_set[j]
                    if x < x_com and y < y_com and x+w > x_com and y+h > y_com :
                        pic_mean = cv.mean(th2_copy[y:y+h, x:x+w])
                        print "pic_mean[0]:", pic_mean[0]
                        if pic_mean[0] > 30:
                            num_num = 10
                            # Call num_o() in time_out.py to recognize the handwritten digits
                            yanjiu_time_result = time_out.num_o(th2_copy, x, y, w, h, num_num)
                        else:
                            yanjiu_time_result = '无'
                        print "研究生起止时间: ",yanjiu_time_result
        except:
            yanjiu_time_result = '无'
            print "研究生起止时间: ",yanjiu_time_result

        ########### Recognize: graduate graduated or not (field 25) ###########
        left_mean = cv.mean(yanjiu_roi_left)
        right_mean = cv.mean(yanjiu_roi_right)
        print left_mean[0],right_mean[0]
        mean_diff = 14  # If the two means differ by more than this, decide yes ('是') or no ('否'); otherwise treat both boxes as empty
        if left_mean[0] < right_mean[0] - mean_diff:
            yanjiu_biye_result = '是'
        else:
            if right_mean[0] < left_mean[0] - mean_diff:
                yanjiu_biye_result = '否'
            else:
                yanjiu_biye_result = '无'
        print "研究生是否毕业: ",yanjiu_biye_result

        ########### Recognize: other school name (field 26) ###########
        qita_name_result = '无'
        ########### Recognize: other major (field 27) ###########
        qita_master_result = '无'
        ########### Recognize: other degree (field 28) ###########
        qita_degree_result = '无'
        ########### Recognize: other start/end dates (field 29) ###########
        qita_time_result = '无'
        ########### Recognize: other graduated or not (field 30) ###########
        qita_biye_result = '无'

        ##### Write one row to the CSV file #####
        information = [docu_num, xingbie_result, minzu_result, weight_result, xuexing_result, jiguan_result,
                       gaozhong_name_result, gaozhong_master_result, gaozhong_degree_result, gaozhong_time_result, gaozhong_biye_result,
                       dazhuan_name_result, dazhuan_master_result, dazhuan_degree_result, dazhuan_time_result, dazhuan_biye_result,
                       benke_name_result, benke_master_result, benke_degree_result, benke_time_result, benke_biye_result,
                       yanjiu_name_result, yanjiu_master_result, yanjiu_degree_result, yanjiu_time_result, yanjiu_biye_result,
                       qita_name_result, qita_master_result, qita_degree_result, qita_time_result, qita_biye_result]
        csvwrite.writerow(information)

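Quite long, isn't it? Let's break down the parts of the code above that rectify the table and extract its boxes:

1. Imports and the Chinese-character encoding declaration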

#!/usr/bin/env python
# encoding: utf-8
import os
import cv2 as cv
import numpy as np
import math

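If a Python 2 source file contains Chinese characters (even only in comments), it must carry the line # encoding: utf-8 on its first or second line; otherwise the interpreter aborts with a Non-ASCII character '\xe5' error. It looks like an ordinary comment, but it really does take effect. import cv2 as cv loads OpenCV's Python package, after which OpenCV functions are called as cv.xxx(), much like cv::xxx() in C++.

2. Reading the table image files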

path = 'test_data'
for (path,dirs,files) in os.walk(path):
    for filename in files:
        #docu_num = '20110158' # test a single registration form
        ##### Read each registration form by file name #####
        (docu_num,extension) = os.path.splitext(filename)
        print "filename:", filename

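This block walks through every file in the test_data folder. Alternatively, comment out the last two lines and uncomment the docu_num line to read and process one image over and over, which is handy when debugging a single form.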
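The loop above assumes test_data holds nothing but .jpg scans, because the image path is later rebuilt as 'test_data/' + docu_num + '.jpg'. As a minimal sketch (written so it also runs under Python 3), the traversal could filter by extension first. The helper name list_form_ids and the demo file names are hypothetical and are not part of table_choose.py.

```python
import os
import tempfile


def list_form_ids(root, extension=".jpg"):
    """Return the sorted file-name stems of every file under root with the given extension."""
    ids = []
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            stem, ext = os.path.splitext(filename)
            if ext.lower() == extension:
                ids.append(stem)
    return sorted(ids)


# Demo on a throwaway directory; the file names are made up for illustration.
demo_dir = tempfile.mkdtemp()
for name in ("20110158.jpg", "201100026.jpg", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

form_ids = list_form_ids(demo_dir)
print(form_ids)
```

Skipping non-image files up front avoids cv.imread() returning None for a stray file, which would otherwise fail at image.shape.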

3. Estimating the rotation-correction angle from the inclination of the long horizontal lines in the image

        image = cv.imread('test_data/' + docu_num + '.jpg')
        rows, cols, channels = image.shape
        print rows, cols
        image_copy = image.copy()

        ##### Rotation correction #####
        # Use the inclination of the long horizontal lines to estimate the overall rotation angle
        # Note: cv.imread() returns BGR, so cv.COLOR_BGR2GRAY would be the matching flag; kept as in the original
        gray = cv.cvtColor(image, cv.COLOR_RGB2GRAY)
        if table_style == 1 or table_style == 3 or table_style == 2:
            edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            cv.imwrite('edges_whole.jpg', edges)
            lines = cv.HoughLinesP(edges, 1, np.pi / 180, 500, 0, minLineLength=50, maxLineGap=50)#650,50,20
        if table_style == 4:
            edges_gray = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            edges = edges_gray[400:1000, 0:1000]
            cv.imwrite('edges_whole.jpg', edges)
            lines = cv.HoughLinesP(edges, 1, np.pi / 180, 200, 0, minLineLength=50, maxLineGap=35)#650,50,20
        pi = 3.1415
        theta_total = 0
        theta_count = 0
        for line in lines:
            x1, y1, x2, y2 = line[0]
            if table_style == 4:
                # Undo the crop offset applied to edges above
                y1 = y1 + 400
                y2 = y2 + 400
            rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
            theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
            print(rho, theta, x1, y1, x2, y2)
            if theta < pi/4 and theta > -pi/4:
                theta_total = theta_total + theta
                theta_count+=1
                cv.line(image_copy, (x1, y1), (x2, y2), (0, 0, 255), 2)
                #cv.line(edges, (x1, y1), (x2, y2), (0, 0, 0), 2)
        theta_average = theta_total/theta_count
        print theta_average, theta_average*180/pi
        cv.imwrite('line_detect4rotation.jpg', image_copy)
        #cv.imwrite('line_detect4rotation.jpg', ~edges)
        #affineShrinkTranslationRotation = cv.getRotationMatrix2D((cols/2, rows/2), theta_average*180/pi, 1)
        affineShrinkTranslationRotation = cv.getRotationMatrix2D((0, rows), theta_average*180/pi, 1)
        ShrinkTranslationRotation = cv.warpAffine(image, affineShrinkTranslationRotation, (cols, rows))
        image_copy = cv.warpAffine(image_copy, affineShrinkTranslationRotation, (cols, rows))
        cv.imwrite('image_Rotation.jpg',ShrinkTranslationRotation)

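This block performs the rotation correction. The table image is first converted to grayscale and then passed through the Canny operator to extract edges (grayscale plus Canny is the standard recipe for edge extraction), which produces the following image, edges_whole.jpg:

As you can see, the table is skewed and has to be rotated. A rotation needs a reference, so I use the probabilistic Hough transform HoughLinesP() to detect the long straight lines in the table. From these I keep only the lines whose inclination angle lies between -pi/4 and pi/4 (the code compares the angle atan(slope), not the raw slope) and average their angles. Unless the form is tilted by more than 45 degrees, or contains long hand-drawn horizontal strokes, this average can serve as the rotation-correction angle. The long lines that survive the filter are shown below in line_detect4rotation.jpg: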
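The angle-averaging step can be isolated from OpenCV entirely. The sketch below takes segments in the (x1, y1, x2, y2) form that HoughLinesP() yields and averages the near-horizontal ones. It swaps the original atan(dy / (dx + 0.001)) for math.atan2, and returns 0.0 when no segment qualifies instead of dividing by zero; both are my assumptions rather than the author's code, and average_skew_angle is a hypothetical name.

```python
import math


def average_skew_angle(segments, max_abs_angle=math.pi / 4):
    """Average the inclination (radians) of segments within +/- max_abs_angle of horizontal."""
    angles = []
    for x1, y1, x2, y2 in segments:
        # atan2 copes with vertical segments; fold the result into (-pi/2, pi/2]
        # so that the direction in which a segment was traced does not matter.
        theta = math.atan2(y2 - y1, x2 - x1)
        if theta > math.pi / 2:
            theta -= math.pi
        elif theta <= -math.pi / 2:
            theta += math.pi
        if -max_abs_angle < theta < max_abs_angle:
            angles.append(theta)
    if not angles:
        return 0.0  # no usable reference line: leave the image unrotated
    return sum(angles) / len(angles)


segments = [
    (0, 0, 100, 2),    # nearly horizontal
    (0, 50, 100, 54),  # nearly horizontal
    (10, 0, 11, 100),  # nearly vertical, ignored
]
skew = average_skew_angle(segments)
print(math.degrees(skew))
```

A plain mean is sensitive to a single stray stroke; a median would be a more robust choice if hand-drawn lines turn out to be common in the data.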

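Next, getRotationMatrix2D() and warpAffine() perform the rotation. The debug lines are drawn only on image_copy, while the untouched image is what gets rotated for later processing (otherwise it would be scribbled all over; why make life hard for ourselves). The rotated table image, image_Rotation.jpg, is shown below: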
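To see what the matrix passed to warpAffine() contains, here is a pure-Python sketch of the 2x3 matrix given by the formula in the OpenCV documentation for getRotationMatrix2D(). The helper names and the image height of 1700 are hypothetical; the rotation centre (0, rows) is the bottom-left corner used in the code above.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """2x3 affine matrix following the formula documented for cv2.getRotationMatrix2D."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return [
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ]


def apply_affine(m, point):
    """Map a single (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


rows = 1700  # hypothetical image height
m = rotation_matrix_2d((0, rows), 1.5)
corner = apply_affine(m, (0, rows))  # the rotation centre itself
print(corner)
```

The centre is a fixed point of the transform, so rotating about the bottom-left corner keeps that corner in place, which suits the next step where the table's bottom-left corner is used as the anchor.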

4. Detecting the right angle at the table's bottom-left corner and translating its vertex to (78,1581)

        ##### Translation correction #####
        # Detect the right angle at the table's bottom-left corner and translate its vertex to (78,1581)
        #print "rows: ",rows
        roi = image_copy[1450:rows, 0:150]#180
        gray = cv.cvtColor(roi, cv.COLOR_RGB2GRAY)
        edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
        roi_mean_set = cv.mean(~edges[0:int((rows-1450)/2), 85:150])  # Regional gray level, used to rule out text interfering with line detection
        roi_mean = roi_mean_set[0]
        #cv.imwrite('edges_sample.jpg', edges)
        cv.imwrite('edges_sample.jpg', ~edges[0:int((rows-1450)/2), 75:150])
        lines = cv.HoughLinesP(edges, 1.0, np.pi / 180, 35, 0, minLineLength=10,maxLineGap=20)#50,10,20
        lines_message_set = []
        for line in lines:
            x1, y1, x2, y2 = line[0]
            y1 = y1 + 1450
            y2 = y2 + 1450
            rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
            xielv = (y2 - y1)/(x2 - x1 + 0.001)  # slope
            theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
            print(rho, theta, x1, y1, x2, y2, xielv)
            lines_message = (rho, theta, x1, y1, x2, y2, xielv)
            #cv.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
            lines_message_set.append(lines_message)
        #print len(lines_message_set)

        # Distance from a point to a line
        def point2line_distance(x1,y1,x2,y2,pointPx,pointPy):
            A = y1 - y2
            B = x2 - x1
            C = x1*y2 - y1*x2
            distance = abs(A*pointPx + B*pointPy + C)/((A*A + B*B)**0.5)
            return distance

        # Merge near-duplicate lines: similar slope and midpoint within 10 px of the other line
        lines_cluster_set = []
        repeat_num_set = []
        for j in range(0,len(lines_message_set)):
            for i in range(0,len(lines_message_set)):
                if not(j in repeat_num_set):
                    lines_cluster_set.append(lines_message_set[j])
                    repeat_num_set.append(j)
                print point2line_distance(lines_message_set[i][2], lines_message_set[i][3], lines_message_set[i][4], lines_message_set[i][5], (lines_message_set[j][2]+lines_message_set[j][4])/2, (lines_message_set[j][3]+lines_message_set[j][5])/2)
                if i!=j and abs(lines_message_set[j][6] - lines_message_set[i][6]) < 0.1 and point2line_distance(lines_message_set[i][2], lines_message_set[i][3], lines_message_set[i][4], lines_message_set[i][5], (lines_message_set[j][2]+lines_message_set[j][4])/2, (lines_message_set[j][3]+lines_message_set[j][5])/2) <= 10:
                    repeat_num_set.append(i)
        print lines_cluster_set

        # Pick the horizontal (heng) and vertical (shu) line of the right angle; fall back to per-table-type defaults when one is missing
        Point_heng = []
        Point_shu = []
        MiddlePoint_heng = (112,1450+(rows-1450)/2)
        MiddlePoint_shu = (75,1450+(rows-1450)/4)
        distance2point = rows-1450
        distance2point2 = rows-1450
        for j in range(0,len(lines_cluster_set)):
            if abs(lines_cluster_set[j][6]) < 1:
                if ((MiddlePoint_heng[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_heng[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5 < distance2point:
                    distance2point = ((MiddlePoint_heng[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_heng[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5
                    Point_heng = lines_cluster_set[j]
            else:
                if ((MiddlePoint_shu[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_shu[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5 < distance2point2:
                    distance2point2 = ((MiddlePoint_shu[0]-(lines_cluster_set[j][2]+lines_cluster_set[j][4])/2)**2 + (MiddlePoint_shu[1]-(lines_cluster_set[j][3]+lines_cluster_set[j][5])/2)**2)**0.5
                    Point_shu = lines_cluster_set[j]
        need_stronger = 0
        something_missing = 0
        # Default corrections
        if Point_shu != []:
            if Point_heng == []:
                cross_x = 78
                cross_y = 1616
                something_missing = 1
                if len(docu_num) == 9:
                    something_missing = 0
                    cross_x = 93
                    cross_y = 1666
                if docu_num[3] == '4':
                    something_missing = 0
                    cross_x = 78
                    cross_y = 1655
                if docu_num[3] == '4' and docu_num[5] == '6':
                    cross_x = 78
                    cross_y = 1665
            else:
                # Intersection of the horizontal and vertical lines (each given by a point and a slope)
                cross_x = (Point_shu[3]-Point_shu[6]*Point_shu[2]-Point_heng[3]+Point_heng[6]*Point_heng[2])/(Point_heng[6]-Point_shu[6])
                cross_y = Point_heng[6]*cross_x + Point_heng[3] - Point_heng[6]*Point_heng[2]
                cv.line(image_copy, (Point_heng[2], Point_heng[3]), (Point_heng[4], Point_heng[5]), (0, 0, 255), 2)
                cv.line(image_copy, (Point_shu[2], Point_shu[3]), (Point_shu[4], Point_shu[5]), (0, 0, 255), 2)
        else:
            cross_x = Point_heng[2]
            cross_y = Point_heng[3]
            need_stronger = 1
        if Point_heng != [] and cross_x > Point_heng[2] and len(docu_num) == 9 and docu_num[6] == '0' and docu_num[7] == '9':
            cross_x = 50
            cross_y = 1586
        if Point_heng != [] and cross_x > Point_heng[2] and len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '8':
            cross_x = 50
            cross_y = 1616
        print 'roi_mean:',roi_mean
        if len(docu_num) == 9 and docu_num[7] == '8' and roi_mean < 230:
            cross_x = 78
            cross_y = 1631
        if cross_y - 1648 < 3:
            if len(docu_num) == 8 and docu_num[5] == '2' and docu_num[6] == '4':
                cross_x = 78
                cross_y = 1601
            if len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '4':
                cross_x = 78
                cross_y = 1621
        if table_style == 3 and cross_y < 1485:
            cross_x = 78
            cross_y = 1641
        print cross_x,cross_y  # Current position of the right-angle vertex; the standard position is 78,1581
        cv.circle(image_copy, (int(cross_x), int(cross_y)), 3,(255,0,0),3)
        cv.rectangle(image_copy,(0,1450),(180,rows),(255,0,0),3)
        cv.imwrite('line_detect_possible_demo.jpg', image_copy)
        rows, cols, channels = ShrinkTranslationRotation.shape
        print rows, cols
        affineShrinkTranslation = np.array([[1, 0, int(78 - cross_x)], [0, 1, int(1581 - cross_y)]], np.float32)
        #affineShrinkTranslation = np.array([[1, 0, int(78 - 78)], [0, 1, int(1581 - 1581)]], np.float32)
        shrinkTwoTimesTranslation = cv.warpAffine(ShrinkTranslationRotation, affineShrinkTranslation, (cols, rows))
        image_copy = cv.warpAffine(image_copy, affineShrinkTranslation, (cols, rows))

        ##### Detect forms numbered 201100XXX that have an extra vertical line inside the bottom-left cell (Detect_not_shu_line_in_201100XXX) #####
        if table_style == 2:
            shrinkTwoTimesTranslation_copy_copy = shrinkTwoTimesTranslation.copy()
            roi = shrinkTwoTimesTranslation[1350:1548, 125:300]#180
            cv.rectangle(image_copy,(125,1350),(300,1548),(255,0,0),3)
            gray = cv.cvtColor(roi, cv.COLOR_RGB2GRAY)
            edges = cv.Canny(gray, 50, 150, apertureSize=3)  # 50,150,3
            lines = []
            lines = cv.HoughLinesP(edges, 1.0, np.pi / 180, 100, 0, minLineLength=60,maxLineGap=20)#50,10,20
            print lines
            # Comparing a NumPy array with None inside "if" raises, so reaching the except branch means lines were found
            go_through = 0
            try:
                if lines == None:
                    go_through = 0
            except:
                go_through = 1
            if go_through == 1:
                pi = 3.1415
                theta_total = 0
                theta_count = 0
                for line in lines:
                    x1, y1, x2, y2 = line[0]
                    x1 = x1 + 125
                    x2 = x2 + 125
                    y1 = y1 + 1350
                    y2 = y2 + 1350
                    rho = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
                    theta = math.atan(float(y2 - y1)/float(x2 - x1 + 0.001))
                    print(rho, theta, x1, y1, x2, y2)
                    if theta > pi/3 or theta < -pi/3:
                        table_style = 1
                        if len(docu_num) == 9 and docu_num[6] == '1' and docu_num[7] == '1':
                            jiaozheng = 1
                        cv.line(shrinkTwoTimesTranslation_copy_copy, (x1, y1), (x2, y2), (255, 0, 0), 2)
            cv.imwrite('shrinkTwoTimesTranslation_copy_copy.jpg', shrinkTwoTimesTranslation_copy_copy)

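This block performs the translation correction. Most of it exists to cope with irregularities in the dataset, such as differing table layouts. For the three fairly uniform images provided on GitHub, however, the core operation is simple: use HoughLinesP() to extract the horizontal and vertical lines at the bottom-left corner of the rotation-corrected table, merge near-duplicate lines, and compute the intersection of the remaining two. Translating that intersection to a preset coordinate lets every table image be processed in the same coordinate frame. To choose the two lines, the midpoint of each Hough line is compared against two preset reference points (one at half and one at a quarter of the corner region's height), and the horizontal and vertical lines closest to their reference point are kept. Once the intersection is known, np.array() and cv.warpAffine() carry out the translation.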
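The geometry in this step reduces to two small formulas: the intersection of two lines given in point-slope form, and the integer offsets that move that intersection to (78,1581). The sketch below mirrors the cross_x / cross_y expressions in the code above; the function names and the sample detections are hypothetical. As in the original, a "vertical" line is represented by a very large finite slope, which is what adding 0.001 to the denominator produces.

```python
def intersect(point_h, slope_h, point_v, slope_v):
    """Intersection of two non-parallel lines, each given as a point and a slope."""
    xh, yh = point_h
    xv, yv = point_v
    # Both lines satisfy y = slope * (x - x0) + y0; equate them and solve for x.
    x = (yv - slope_v * xv - yh + slope_h * xh) / (slope_h - slope_v)
    y = slope_h * (x - xh) + yh
    return x, y


def translation_offsets(corner, target=(78, 1581)):
    """Integer (dx, dy) that moves the detected corner onto the target position."""
    return int(target[0] - corner[0]), int(target[1] - corner[1])


# Hypothetical detections: a horizontal line and a steep, almost vertical one.
corner = intersect((120.0, 1600.0), 0.0, (80.0, 1500.0), 1000.0)
dx, dy = translation_offsets(corner)
print(corner, dx, dy)
```

The offsets then fill the last column of the 2x3 matrix [[1, 0, dx], [0, 1, dy]] that is handed to cv.warpAffine(). Note that int() truncates toward zero, so the residual error after translation is below one pixel.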

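With the intersection of the horizontal and vertical lines found, it is drawn on the rotated table image, as shown below in line_detect_possible_demo.jpg:

5. Applying a morphological opening with long horizontal and long vertical kernels to the binarized table reduces it to an all-horizontal-lines image and an all-vertical-lines image; overlaying the two and extracting their intersections yields the four vertices of every rectangle in the table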

        ##### Table extraction (Table_Out) #####
        # Open the binarized table with a long horizontal and a long vertical kernel to get all-horizontal and
        # all-vertical line images; overlay them and extract the intersections to get each rectangle's four vertices
        shrinkTwoTimesTranslation_gray = cv.cvtColor(shrinkTwoTimesTranslation, cv.COLOR_RGB2GRAY)
        th2 = cv.adaptiveThreshold(~shrinkTwoTimesTranslation_gray,255,cv.ADAPTIVE_THRESH_MEAN_C,cv.THRESH_BINARY,15,-2)
        cv.imwrite('th2.jpg', th2)
        # Long horizontal kernel
        shrinkTwoTimesTranslation_copy = shrinkTwoTimesTranslation.copy()
        th2_copy = th2.copy()
        scale = 44;#20,50,45,40,44
        rows,cols = shrinkTwoTimesTranslation_gray.shape
        horizontalsize = cols / scale
        horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontalsize, 1))
        erosion = cv.erode(th2,horizontalStructure,iterations = 1)
        dilation = cv.dilate(erosion,horizontalStructure,iterations = 1)
        # Long vertical kernel
        scale = 39;#20,50,45,40,39
        horizontalsize2 = rows / scale
        horizontalStructure2 = cv.getStructuringElement(cv.MORPH_RECT, (1,horizontalsize2))
        erosion2 = cv.erode(th2_copy,horizontalStructure2,iterations = 1)
        dilation2 = cv.dilate(erosion2,horizontalStructure2,iterations = 1)
        # Overlay the horizontal-line and vertical-line images, then extract the intersections
        mask = dilation + dilation2
        cv.imwrite('mask.jpg', mask)
        joints = cv.bitwise_and(dilation, dilation2)
        cv.imwrite('joints.jpg', joints)
        # Filter the rectangles by size and draw them on the corrected table
        #cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE
        # Note: OpenCV 3.x returns three values from findContours(); OpenCV 4.x returns only (contours, hierarchy)
        mask, contours, hierarchy = cv.findContours(mask,cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE)
        length = len(contours)
        print length
        small_rects = []
        big_rects = []
        for i in range(length):
            cnt = contours[i]
            area = cv.contourArea(cnt)
            if area < 10:
                continue
            approx = cv.approxPolyDP(cnt, 3, True)#3
            x, y, w, h = cv.boundingRect(approx)
            rect = (x, y, w, h)
            #cv.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), 3)
            roi = joints[y:y+h, x:x+w]
            roi, joints_contours, joints_hierarchy = cv.findContours(roi,cv.RETR_LIST,cv.CHAIN_APPROX_SIMPLE)
            #print len(joints_contours)
            #if h < 80 and h > 20 and w > 10 and len(joints_contours)<=4:
            if h < 80 and h > 20 and w > 10 and len(joints_contours)<=6:#important
                cv.rectangle(image_copy, (x, y), (x+w, y+h), (255-h*3, h*3, 0), 3)
                small_rects.append(rect)
        cv.imwrite('table_out.jpg', image_copy)

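This block is the core of the whole approach. The corrected image is "projected" horizontally and vertically to isolate the table lines and their intersections; after contour extraction, the candidate rectangles are filtered by the expected box dimensions, which gives the coordinates of the four vertices of each box in the table. This works extremely well for small boxes. For large boxes, the extracted rectangle may contain several smaller boxes, so an extra constraint is needed (counting whether other vertices fall inside the rectangle). The steps are:

The clean image (the one that was never drawn on) is converted to grayscale and binarized with adaptiveThreshold(), which preserves as many of the table's lines as possible. The result is th2.jpg, shown below: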
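For readers unfamiliar with adaptive thresholding, here is a slow pure-Python sketch of the mean variant with THRESH_BINARY: a pixel becomes max_value when it exceeds the mean of its block_size x block_size neighbourhood minus C. The code above calls it on the inverted grayscale image with block size 15 and C = -2, so a pixel must be more than 2 levels brighter than its surroundings. The edge-replicating border and the absence of integer rounding are my simplifications, so results may differ from OpenCV by a level or so near the threshold; the function name is hypothetical.

```python
def adaptive_mean_threshold(gray, block_size=15, c=-2, max_value=255):
    """Binarize a 2-D list of gray levels against the local mean minus c."""
    half = block_size // 2
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Clamp indices so that border pixels are replicated outward.
                    yy = min(max(y + dy, 0), rows - 1)
                    xx = min(max(x + dx, 0), cols - 1)
                    total += gray[yy][xx]
            mean = total / float(block_size * block_size)
            if gray[y][x] > mean - c:
                out[y][x] = max_value
    return out


# Inverted page: dark background (0) with a one-pixel-thick bright line in row 4.
inverted = [[0] * 9 for _ in range(9)]
for x in range(9):
    inverted[4][x] = 200

binary = adaptive_mean_threshold(inverted, block_size=3)
print(binary[4])
```

Because the threshold follows the local mean, uneven lighting across the scanned form shifts the threshold along with it, which is why thin table lines survive better than with a single global threshold.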

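Next, the image is eroded with erode() and then dilated with dilate(). Erosion followed by dilation is a morphological opening (not a closing), and it is applied twice: once with a long horizontal kernel (N×1) and once with a long vertical kernel (1×N). You can think of it as projecting every pixel of the table horizontally or vertically, with the projection snapping onto lines that already exist, while text-like pixels that do not form a line are wiped out by the erosion. It is like looking at the table through a grille of long horizontal or vertical slats. Overlaying the horizontal and vertical results gives mask.jpg, shown below: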
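On a binary image, opening with a 1-pixel-high kernel of width k keeps exactly those horizontal runs of foreground pixels that are at least k long, which is why table lines survive and handwriting does not. The pure-Python sketch below implements the opening through that run-length view rather than through erode() and dilate(); OpenCV's handling of runs that touch the image border may differ. The function names and the toy image are hypothetical.

```python
def open_rows(img, k):
    """Binary opening with a 1 x k horizontal kernel: keep only runs of 1s at least k long."""
    out = []
    for row in img:
        new_row = [0] * len(row)
        start = None
        for i, v in enumerate(list(row) + [0]):  # the sentinel 0 closes a trailing run
            if v and start is None:
                start = i
            elif not v and start is not None:
                if i - start >= k:
                    new_row[start:i] = [1] * (i - start)
                start = None
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def open_cols(img, k):
    """The same opening with a k x 1 vertical kernel, done by transposing."""
    return transpose(open_rows(transpose(img), k))


# A 6 x 5 "box" with some handwriting-like clutter inside it.
img = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]
horizontal = open_rows(img, 4)
vertical = open_cols(img, 4)
for row in horizontal:
    print(row)
```

The kernel length plays the role of cols / scale and rows / scale in the code above: too short and strokes of handwriting pass as lines, too long and short cell borders disappear.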

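The table grid comes out as if by magic! One caveat: this step must be run on an image that has already been rotation-corrected. On a skewed image the projection makes the lines look upright even though the image itself has not been rotated, which is fatal for the coordinate extraction that follows. With the grid extracted, bitwise_and() then yields the intersections of the grid lines, shown below in joints.jpg: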
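The overlay and the intersection are just a per-pixel OR and AND of the two line images. A tiny sketch with hand-written binary grids (hypothetical data standing in for dilation and dilation2 above):

```python
# Line images of a single 6 x 5 box: top/bottom edges and left/right edges.
horizontal = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
vertical = [[1, 0, 0, 0, 0, 1] for _ in range(5)]

# Overlay (mask): a pixel belongs to the grid if it lies on either kind of line.
mask = [[h | v for h, v in zip(h_row, v_row)]
        for h_row, v_row in zip(horizontal, vertical)]

# Joints: a pixel is a corner candidate only if it lies on both kinds of line.
joints = [[h & v for h, v in zip(h_row, v_row)]
          for h_row, v_row in zip(horizontal, vertical)]

joint_points = [(x, y)
                for y, row in enumerate(joints)
                for x, value in enumerate(row) if value]
print(joint_points)
```

In the original code the overlay is computed with + on uint8 arrays, where 255 + 255 wraps around to 254; that is still non-zero, so the mask remains usable for contour detection.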

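Next, findContours() finds the contours in the mask, and contourArea() computes each contour's area so that contours that are too small can be discarded. approxPolyDP() and boundingRect() then wrap each contour in a rectangle, giving the position of every rectangular box in the table image. findContours() is called once more on the joints inside each rectangle to count its vertices, and together with the size limits this filters the rectangles down to the set of boxes we need. Drawing every box in that set on the corrected image gives table_out.jpg, shown below: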
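The filtering rule itself is independent of OpenCV. The sketch below applies the same thresholds as the code above (20 < h < 80, w > 10, at most 6 joints) to a list of candidate boxes; the function name and the candidate data are hypothetical, and the joint count is assumed to have been computed beforehand.

```python
def select_cells(candidates, min_h=20, max_h=80, min_w=10, max_joints=6):
    """Keep (x, y, w, h) boxes whose size and joint count match a single table cell."""
    cells = []
    for (x, y, w, h), joint_count in candidates:
        if min_h < h < max_h and w > min_w and joint_count <= max_joints:
            cells.append((x, y, w, h))
    return cells


candidates = [
    ((100, 200, 150, 40), 4),    # ordinary cell
    ((100, 240, 600, 300), 12),  # large block spanning many cells
    ((300, 200, 5, 40), 2),      # sliver produced by a doubled line
    ((400, 200, 200, 35), 9),    # right size, but with extra joints inside
]
cells = select_cells(candidates)
print(cells)
```

The joint limit is what rejects a rectangle that actually spans several cells: a lone cell touches only the joints at its own corners, while a merged region picks up additional ones along its edges.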

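6. In the corrected table each field lies within a known approximate region; an approximate coordinate is used to pick out the exact rectangle that holds the field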

        if table_style == 1:
            request_info = (770, 150, 1150-770, 300-150)
            request_info_set.append(request_info)
            cv.rectangle(shrinkTwoTimesTranslation,(770,150),(1150,300),(255,0,0),1)  # registration form number (field 0)
            for j in range(0,len(small_rects)):
                (x, y, w, h) = small_rects[j]
                if x < 1060 and y < 1130 and x+w > 1060 and y+h > 1130 and something_missing == 1:  # rectangle extraction for the special case
                    x_rem = x
                    y_rem = y
                    w_rem = w
                    h_rem = h
            for j in range(0,len(small_rects)):
                (x, y, w, h) = small_rects[j]
                if x < 700 and y < 370 and x+w > 700 and y+h > 370 :
                    request_info = (x, y, w, h)
                    request_info_set.append(request_info)
                    cv.rectangle(shrinkTwoTimesTranslation,(x,y),(x+w,y+h),(255,0,0),1)  # gender (field 1)

        ########### Recognize: gender (field 1) ###########
        if table_style == 1:
            x_com = 700  # approximate position of the field
            y_com = 370
        if table_style == 2:
            x_com = 700
            y_com = 335
        if table_style == 3:
            x_com = 724
            y_com = 316
        if table_style == 4:
            x_com = 724
            y_com = 292
        if need_stronger == 1:  # special-case handling
            x = 673
            y = 385
            w = 740 - 673
            h = 420 - 385
            get_all = 1
            cha_num = 3
            is_ABO = 0
            xingbie_result = ''
            return_change, xingbie_result_set = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
            print xingbie_result_set
            if xingbie_result_set[0] == 2206 or xingbie_result_set[1] == 2206 or xingbie_result_set[2] == 2206:
                xingbie_result = '男'
            if xingbie_result_set[0] == 775 or xingbie_result_set[1] == 775 or xingbie_result_set[2] == 775:
                xingbie_result = '女'
            if xingbie_result == '':
                xingbie_result = '男'
            print "性别: ",xingbie_result
        else:
            for j in range(0,len(request_info_set)):
                (x, y, w, h) = request_info_set[j]
                if x < x_com and y < y_com and x+w > x_com and y+h > y_com :  # the rectangle that contains the field
                    get_all = 1  # examine all three candidates returned by the handwritten-Chinese model
                    cha_num = 3  # expected maximum length of the field
                    is_ABO = 0  # not a blood-type field
                    xingbie_result = ''  # initialize
                    # Call chinese_o() in chinese_out.py to recognize the handwritten Chinese characters
                    return_change, xingbie_result_set = chinese_out.chinese_o(th2_copy, shrinkTwoTimesTranslation_copy, x, y, w, h, cha_num, get_all, is_ABO)
                    print xingbie_result_set
                    # If any of the three candidates is '男' (male), decide male
                    if xingbie_result_set[0] == 2206 or xingbie_result_set[1] == 2206 or xingbie_result_set[2] == 2206:
                        xingbie_result = '男'
                    # If any of the three candidates is '女' (female), decide female
                    if xingbie_result_set[0] == 775 or xingbie_result_set[1] == 775 or xingbie_result_set[2] == 775:
                        xingbie_result = '女'
                    # Special case: fall back to '男' when neither was recognized
                    if xingbie_result == '':
                        xingbie_result = '男'
                    print "性别: ",xingbie_result

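With all the boxes extracted, aren't we done?

Not quite. We still need to pick out the boxes at the positions we care about, and the boxes extracted so far carry no labels. Two steps are needed here:

First, choose a point near the center of the required box in the standard image as that box's "ID". The standard image is the one described earlier, in which the bottom-left corner sits at (78,1581). In the code above, (700,370) is chosen as the approximate position of the box that holds the "gender" field.

Second, iterate over the rectangles to find the one that contains this point. Because the boxes are fairly large, this still picks out the target box reliably even when the correction is imprecise, as long as the box has not drifted too far from its position in the standard image.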
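The lookup described in these two steps is a point-in-rectangle test, the same x < x_com and x+w > x_com comparison that appears in the code above. A minimal sketch follows; find_box and the box coordinates are hypothetical, while the anchor points per table_style are the gender-field values taken from the listing.

```python
def find_box(boxes, anchor):
    """Return the first (x, y, w, h) box that strictly contains the anchor point, else None."""
    ax, ay = anchor
    for x, y, w, h in boxes:
        if x < ax < x + w and y < ay < y + h:
            return (x, y, w, h)
    return None


# Approximate position of the gender field for each table_style (values from the code above).
GENDER_ANCHORS = {1: (700, 370), 2: (700, 335), 3: (724, 316), 4: (724, 292)}

# Hypothetical boxes extracted from a corrected style-1 form.
boxes = [(560, 350, 100, 40), (660, 350, 90, 40), (750, 350, 120, 40)]
gender_box = find_box(boxes, GENDER_ANCHORS[1])

# The same form with a residual correction error of (+15, -10) pixels.
shifted = [(x + 15, y - 10, w, h) for x, y, w, h in boxes]
shifted_gender_box = find_box(shifted, GENDER_ANCHORS[1])
print(gender_box, shifted_gender_box)
```

The tolerance to misalignment is roughly half the box size in each direction, which is why the anchor should sit near the center of the box rather than near an edge.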

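There is relatively little material online about extracting table boxes, especially with Python + OpenCV. This method may not yet be robust enough, so feedback and discussion are very welcome!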
