Training an Object Detection Model with OpenCV Using a Cascade Classifier

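Haar feature-based cascade classifiers are an effective object detection method proposed by Paul Viola and Michael Jones in their 2001 paper "Rapid Object Detection using a Boosted Cascade of Simple Features". It is a machine learning approach in which a cascade function is trained from many positive and negative samples and then applied to other images.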
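To make the "simple features" in the paper title a little more concrete, here is a minimal sketch of a two-rectangle Haar-like feature evaluated with an integral image (summed-area table). It is a plain-Python illustration only: the names `integral_image`, `rect_sum`, and `haar_two_rect_horizontal` are made up for this example, and OpenCV's own implementation differs in detail.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a w x h window (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# A 4x4 toy image: bright left half, dark right half.
toy = [[9, 9, 1, 1] for _ in range(4)]
ii = integral_image(toy)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
```

Once the integral image is built, any rectangle sum costs four lookups regardless of its size, which is what makes it cheap to evaluate many such features per window.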

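OpenCV is a remarkable, flexible, and extensible platform for building machine learning models in the computer vision space. The tutorial below explains how to build a Haar cascade for object detection from scratch and use it in your application. The detailed process and video demos are available at the following addresses: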

demo:

https://www.youtube.com/watch?v=erSePe_KtNU

https://www.youtube.com/watch?v=qQywEw6g9rI

The following two resources explain the cascade training process very well:

https://docs.opencv.org/3.2.0/dc/d88/tutorial_traincascade.html
https://docs.opencv.org/2.4/doc/user_guide/ug_traincascade.html

Preparing positive samples:

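Positive samples have to be created manually with the opencv_annotation tool. First, though, we need a set of images that contain the object we want to train the cascade to detect. In our case, that is the Coca-Cola logo. The logo usually appears in advertisements, so search Google Images for "coca cola advertisements" and download the positive samples one by one. A better way is to get the "Google Images Download" Python package from https://github.com/hardikvasa/google-images-download. It is a great package for creating datasets.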

googleimagesdownload.exe -k "coca cola" -sk advertisements -f png -o Pos -s medium

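Now we have the positive samples. We need to extract the object (the Coke logo in this case) from them. This is the object the cascade will be trained to detect and recognize. The best way (at least my way) is to use the opencv_annotation application to go through each sample and mark the rectangular region of the object, which creates an annotation file. Below is the PowerShell script that runs the application: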

```powershell
$datafile = 'info/info_pos_round.data'
$opencv_annotations = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_annotation.exe'
$folderpath = './Source'

& $opencv_annotations --annotations=$datafile --images=$folderpath
```
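Each line of the annotation file holds an image path, the number of marked rectangles, and then `x y w h` for every rectangle. As a rough sketch of how such a line can be read back, the hypothetical helper below (`parse_annotation_line` is my own name, not part of OpenCV) handles several rectangles per image and paths that contain spaces. It assumes all coordinates are non-negative integers, and the sample path in the usage line is invented for illustration.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, [(x, y, w, h), ...])."""
    tokens = line.split()
    # Walk back from the end: k boxes occupy 4*k tokens, preceded by the count k.
    for k in range((len(tokens) - 1) // 4, -1, -1):
        count_pos = len(tokens) - 4 * k - 1
        if count_pos < 1:
            continue
        tail = tokens[count_pos:]
        if all(t.isdigit() for t in tail) and int(tail[0]) == k:
            path = " ".join(tokens[:count_pos])  # paths may contain spaces
            nums = [int(t) for t in tail[1:]]
            boxes = [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]
            return path, boxes
    raise ValueError("not an annotation line: %r" % line)


# Hypothetical example line with two marked logos.
path, boxes = parse_annotation_line("Source/coke ad 1.png 2 10 20 60 25 100 40 58 22")
```

Here `path` keeps the embedded spaces and `boxes` holds one tuple per marked rectangle.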

Once we have the annotation file, run the following Python script to extract the objects (the logo in our case) and resize them all to the same size. It seems that the smaller the object, the better in terms of training time and accuracy. Also try to get objects from as many image samples as possible. I extracted 58 logo images to train the cascade in my project.

```python
import cv2

def ExtractObject(datafile="info_nike_demo.data",  # annotation file
                  pathtowrite="./Train/"):         # folder for the cropped logos
    # read the annotation file written by opencv_annotation
    with open(datafile) as f:
        content = f.read()
    i = 1
    for l in content.split('\n'):
        words = l.split()
        if len(words) >= 6:  # because sometimes the images have no region
            # this assumes a single marked rectangle per line
            img_path = ' '.join(words[:-5])          # path to the positive sample file
            img_path = img_path.replace('\\', '/')   # replace backslashes if you are a Windows user
            img = cv2.imread(img_path, 0)            # read the sample as grayscale using OpenCV
            if img is None:                          # skip files OpenCV cannot read
                continue
            logo = [int(w) for w in words[-4:]]      # rectangle of the logo
            x, y, w, h = logo
            logograb = img[y:y+h, x:x+w]
            # keep the size small to keep the training time short
            img = cv2.resize(logograb, (60, 20), interpolation=cv2.INTER_AREA)
            cv2.imwrite(pathtowrite + str(i) + '.jpg', img)
            i = i + 1
```

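The positive samples have to be packed into a .vec file, as required by opencv_traincascade.

The opencv_traincascade application that trains the cascade takes the positive images in the form of a .vec file. We can use the opencv_createsamples application to create the .vec file. Before that, however, we need to build another annotation file for the images, because opencv_createsamples takes this annotation file as input to create the .vec file. Since the logo occupies the whole of each image file extracted in the step above, the annotation file looks like this: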

Source/logo_orig/1.jpg 1 0 0 60 20
Source/logo_orig/10.jpg 1 0 0 60 20

Script that creates the annotation file:

import os

def create_infodata(imgfolder='Source/logo_orig'):
    # one line per cropped logo: path, 1 object, rectangle covering the whole 60x20 image
    with open('info_pos_orig.data', 'a') as f:
        for img in os.listdir(imgfolder):
            line = imgfolder + "/" + img + " 1 0 0 60 20\n"
            f.write(line)

Once the annotation file has been created, build the .vec file with the following command:

$opencv_createsamples = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_createsamples.exe'
& $opencv_createsamples -info info_pos_orig.data -num 58 -w 60 -h 20 -vec pos_orig.vec -show
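Before training, it can be worth checking that the .vec file really contains what you expect. The sketch below reads only the first 12 bytes of the file. It assumes the header layout written by opencv_createsamples (a 32-bit sample count, a 32-bit sample size equal to width × height, then two unused 16-bit values) and little-endian byte order; `read_vec_header` is a hypothetical helper name, not an OpenCV function.

```python
import struct


def read_vec_header(path):
    """Return (sample_count, sample_size) read from the start of a .vec file."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError("file too short to be a .vec file")
    # Assumed layout: int32 count, int32 width*height, two unused int16 values.
    count, vecsize, _, _ = struct.unpack("<iihh", header)
    return count, vecsize
```

For the file built above, `read_vec_header("pos_orig.vec")` should report 58 samples of size 60 × 20 = 1200 if every image was packed.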

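Preparing negative samples:

The accuracy of the cascade depends on the number and variety of the negative samples. Good negative samples are images that appear in the background of the images or videos in which we are going to detect the object. In our project, good negatives are advertisements for another soft drink similar to Coca-Cola (namely Pepsi). So I ran the command below to download 100 Pepsi advertisements from Google Image Search. One important thing to do here is to preview the negative images and delete any that contain the Coca-Cola logo. Negative images must not accidentally contain the object we want to detect.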

googleimagesdownload.exe -k "pepsi" -sk advertisements -f png -o Neg -s medium

I then used Scikit-Learn's PatchExtractor to create about 6000 patches of 100 × 100 pixels to serve as negative samples for training the cascade.

import numpy as np
from sklearn.feature_extraction.image import PatchExtractor
from skimage import transform

def extract_patches(img, N, scale=1.0, patch_size=(100, 100)):
    # size of the patch to cut out before any rescaling
    extracted_patch_size = tuple((scale * np.array(patch_size)).astype(int))
    extractor = PatchExtractor(patch_size=extracted_patch_size,
                               max_patches=N, random_state=0)
    patches = extractor.transform(img[np.newaxis])
    if scale != 1:
        # bring rescaled patches back to the requested size
        patches = np.array([transform.resize(patch, patch_size)
                            for patch in patches])
    return patches

The code below extracts 75 patches of 100 × 100 pixels from each downloaded image:

import os
import cv2
import numpy as np

images = []
rootfolder = 'Neg'
for imgfolder in os.listdir(rootfolder):  # iterate through the download folders
    if imgfolder == 'pepsi advertisements':
        for filename in os.listdir(rootfolder + '/' + imgfolder):  # iterate through each image in the folder
            filename = rootfolder + '/' + imgfolder + '/' + filename  # build the path to the image file
            if filename.lower().endswith(('.jpg', '.png')):  # the download command above requests png files
                img = cv2.imread(filename)
                if img is not None:  # skip files OpenCV cannot read
                    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                    images.append(img)

negative_patches = np.vstack([extract_patches(im, 75, scale)
                              for im in images for scale in [1.0]])

The patches then have to be saved as image files (for example with cv2.imwrite) so that bg.txt can list them.

Creating the bg.txt file

bg.txt lists the negative samples, one image path per line. The file looks like this:

NegFromAds/Patches/1.jpg
NegFromAds/Patches/2.jpg

The script that creates bg.txt:

import os

def create_bgtxt(imgfolder='Neg/Patches'):
    # append the path of every negative patch, one per line
    with open('bg.txt', 'a') as f:
        for img in os.listdir(imgfolder):
            line = imgfolder + "/" + img + "\n"
            f.write(line)
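Because bg.txt is opened in append mode, it is easy to end up with stale entries after a few runs. A small sanity check like the one below can help; `check_bg_file` is a hypothetical helper of my own, and it checks each path exactly as written, relative to the current working directory, which is an assumption about where you launch the training from.

```python
import os


def check_bg_file(bg_path="bg.txt"):
    """Return (listed, missing): every path in bg_path and those not found on disk."""
    with open(bg_path) as f:
        listed = [line.strip() for line in f if line.strip()]
    missing = [p for p in listed if not os.path.isfile(p)]
    return listed, missing
```

Calling `listed, missing = check_bg_file()` and comparing `len(listed)` with the number of patches you expect is a quick way to spot duplicates or files that were deleted after bg.txt was written.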

Training the model:

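We have now built the positive and negative samples the model needs. Train the cascade with the command below. With 6500 negative samples of 100×100 pixels and 58 positive samples of 60×20 pixels, training takes about 30 minutes on an ordinary laptop. The -vec and -bg arguments must point to the .vec file and the bg.txt file created above, and -w and -h must match the values passed to opencv_createsamples (60 and 20 here). The -numPos and -numNeg values shown are small; raise them to suit your own sample counts, keeping -numPos somewhat below the number of samples in the .vec file.

$opencv_traincascade = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_traincascade.exe'
& $opencv_traincascade -data cascade -vec pos_orig.vec -bg bg.txt -numPos 11 -numNeg 12 -numStages 10 -w 60 -h 20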
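Choosing -numPos is a common stumbling block, because each stage can consume a few extra positives from the .vec file. A rule of thumb often quoted in OpenCV Q&A threads is that the .vec file should hold at least numPos + (numStages - 1) × (1 - minHitRate) × numPos + S samples, where S counts samples the trainer skips. The sketch below simply inverts that inequality. Treat it as a rough guide rather than a guarantee: `max_num_pos` and its `extra` parameter (standing in for S) are my own names, and 0.995 is the default -minHitRate of opencv_traincascade.

```python
import math


def max_num_pos(vec_count, num_stages, min_hit_rate=0.995, extra=0):
    """Largest numPos satisfying the rule of thumb for a .vec file of vec_count samples."""
    usable = vec_count - extra  # 'extra' is an assumed allowance for skipped samples
    factor = 1 + (num_stages - 1) * (1 - min_hit_rate)
    return max(0, math.floor(usable / factor))


# 58 positives and 10 stages, as in this tutorial.
suggested = max_num_pos(58, 10)
```

With 58 samples and 10 stages this suggests using at most 55 for -numPos, leaving a small reserve for the later stages.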

Testing the model:

Test the cascade using the following Python script. It turns out that the parameters of detectMultiScale are as important as the cascade itself for detection accuracy, and it takes some experimenting to find the right balance between selectivity and sensitivity. Here is a very good explanation of the parameters of the detectMultiScale function on Stack Overflow:

https://stackoverflow.com/questions/20801015/recommended-values-for-opencv-detectmultiscale-parameters
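To get a feel for what scaleFactor trades off, the sketch below counts roughly how many window scales fit into an image for a given factor. It is only an approximation under simple assumptions: it ignores minSize, maxSize, and the rounding that OpenCV applies internally, and `scales_tried` is a hypothetical helper rather than an OpenCV call.

```python
def scales_tried(image_size, window_size, scale_factor=1.1):
    """Approximate list of window scales that still fit inside the image."""
    img_w, img_h = image_size
    win_w, win_h = window_size
    scales = []
    s = 1.0
    while win_w * s <= img_w and win_h * s <= img_h:
        scales.append(s)
        s *= scale_factor
    return scales


# A 640x480 image scanned with the 60x20 training window.
fine = len(scales_tried((640, 480), (60, 20), 1.1))
coarse = len(scales_tried((640, 480), (60, 20), 1.25))
```

A smaller scaleFactor means more scales are examined, which tends to find more logos but takes longer; a larger one is faster but can skip over the size at which the logo actually appears.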

import os
import cv2

cokelogo_cascade = "C:/Users/rithanya/Documents/Python/Industrial_Safety/coke/cascade.xml"
cokecascade = cv2.CascadeClassifier(cokelogo_cascade)

def detect(faceCascade, gray_, scaleFactor_=1.1, minNeighbors=5):
    """Utility function to apply a cascade to an image at a given scaleFactor."""
    faces = faceCascade.detectMultiScale(gray_,
                                         scaleFactor=scaleFactor_,
                                         minNeighbors=minNeighbors,
                                         minSize=(30, 30),  # (60, 20),
                                         flags=cv2.CASCADE_SCALE_IMAGE
                                         )
    return faces

def DetectAndShow(imgfolder='NegFromAds/coca cola advertisements/'):
    cokelogo_cascade = "./cascade4/cokelogoorigfullds.xml"
    cokecascade = cv2.CascadeClassifier(cokelogo_cascade)
    for i in os.listdir(imgfolder):
        filepath = imgfolder + i
        img = cv2.imread(filepath)
        if img is None:  # skip files OpenCV cannot read
            continue

        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        cokelogos = detect(cokecascade, gray, 1.25, 6)
        for (x, y, w, h) in cokelogos:
            cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
        cv2.imshow('positive samples', img)
        k = 0xFF & cv2.waitKey(0)
        if k == 27:  # Esc to exit
            break

Reference: Face recognition with Haar Cascade

Reference: ckarthic/ObjectDetectionOpenCV

Reference: Stack Overflow

Reference: opencv+cascade