
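  • 1. Preface
  • 2. Computing a disparity map with epipolar geometry
    • 2.1 StereoSGBM code
    • 2.2 StereoSGBM code walkthrough
    • 2.3 Open questions
  • 3. GrabCut code
  • 4. Watershed algorithm
    • 4.1 Watershed code
    • 4.2 Selected code explained
      • 4.2.1 Thresholding the grayscale image gray
      • 4.2.2 Creating the markers and printing the number of regions (including the background)
      • 4.2.3 Setting up the barriers
      • 4.2.4 Flooding and finding the barriers
  • 5. Conclusion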
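1. Preface

These are my personal notes on Chapter 4, "Depth Estimation and Segmentation", of 《OpenCV 3 计算机视觉 Python语言实现》 (the Chinese edition of Learning OpenCV 3 Computer Vision with Python).
The chapter does not use a depth camera. It has three parts: computing a disparity map with epipolar geometry (using the StereoSGBM algorithm), the GrabCut algorithm, and the watershed algorithm.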

2. Computing a disparity map with epipolar geometry

2.1 StereoSGBM code

import numpy as np
import cv2


def update(val=0):
    # Trackbar callback. OpenCV passes the new slider position as one
    # argument, so the function must accept it (unused here).
    # Copy the current value of every slider into the stereo matcher.
    stereo.setBlockSize(cv2.getTrackbarPos('window_size', 'disparity'))
    stereo.setUniquenessRatio(cv2.getTrackbarPos('uniquenessRatio', 'disparity'))
    stereo.setSpeckleWindowSize(cv2.getTrackbarPos('speckleWindowSize', 'disparity'))
    stereo.setSpeckleRange(cv2.getTrackbarPos('speckleRange', 'disparity'))
    stereo.setDisp12MaxDiff(cv2.getTrackbarPos('disp12MaxDiff', 'disparity'))

    print('computing disparity...')
    # compute() returns fixed-point disparities scaled by 16
    disp = stereo.compute(imgL, imgR).astype(np.float32) / 16.0

    cv2.imshow('left', imgL)
    cv2.imshow('right', imgR)
    # Map [min_disp, min_disp + num_disp] to [0, 1] for display
    cv2.imshow('disparity', (disp - min_disp) / num_disp)


if __name__ == "__main__":
    window_size = 3
    min_disp = 16
    num_disp = 192 - min_disp
    blockSize = window_size
    uniquenessRatio = 1
    speckleRange = 12
    speckleWindowSize = 3
    disp12MaxDiff = 200
    P1 = 600
    P2 = 2400

    imgL = cv2.imread('depth1.jpg')
    imgR = cv2.imread('depth2.jpg')
    imgL = cv2.resize(imgL, (800, 450))
    imgR = cv2.resize(imgR, (800, 450))
    print(imgL.shape)
    print(imgR.shape)

    cv2.namedWindow('disparity')
    cv2.createTrackbar('speckleRange', 'disparity', speckleRange, 50, update)
    cv2.createTrackbar('window_size', 'disparity', window_size, 10, update)
    cv2.createTrackbar('speckleWindowSize', 'disparity', speckleWindowSize, 200, update)
    cv2.createTrackbar('uniquenessRatio', 'disparity', uniquenessRatio, 50, update)
    cv2.createTrackbar('disp12MaxDiff', 'disparity', disp12MaxDiff, 250, update)

    '''
    cv2.createStereoSGBM(minDisparity, numDisparities, blockSize[, P1[, P2[,
        disp12MaxDiff[, preFilterCap[, uniquenessRatio[, speckleWindowSize[,
        speckleRange[, mode]]]]]]]]) -> retval

    Parameters:
    minDisparity - Minimum possible disparity value. Normally, it is zero but
        sometimes rectification algorithms can shift images, so this parameter
        needs to be adjusted accordingly.
    numDisparities - Maximum disparity minus minimum disparity. The value is
        always greater than zero. In the current implementation, this
        parameter must be divisible by 16.
    blockSize - Matched block size. It must be an odd number >=1. Normally, it
        should be somewhere in the 3..11 range.
    P1 - The first parameter controlling the disparity smoothness. See below.
    P2 - The second parameter controlling the disparity smoothness. The larger
        the values are, the smoother the disparity is. P1 is the penalty on
        the disparity change by plus or minus 1 between neighbor pixels. P2 is
        the penalty on the disparity change by more than 1 between neighbor
        pixels. The algorithm requires P2 > P1. See stereo_match.cpp sample
        where some reasonably good P1 and P2 values are shown (like
        8*number_of_image_channels*SADWindowSize*SADWindowSize and
        32*number_of_image_channels*SADWindowSize*SADWindowSize, respectively).
    disp12MaxDiff - Maximum allowed difference (in integer pixel units) in the
        left-right disparity check. Set it to a non-positive value to disable
        the check.
    preFilterCap - Truncation value for the prefiltered image pixels. The
        algorithm first computes x-derivative at each pixel and clips its
        value by [-preFilterCap, preFilterCap] interval. The result values are
        passed to the Birchfield-Tomasi pixel cost function.
    uniquenessRatio - Margin in percentage by which the best (minimum)
        computed cost function value should "win" the second best value to
        consider the found match correct. Normally, a value within the 5-15
        range is good enough.
    speckleWindowSize - Maximum size of smooth disparity regions to consider
        their noise speckles and invalidate. Set it to 0 to disable speckle
        filtering. Otherwise, set it somewhere in the 50-200 range.
    speckleRange - Maximum disparity variation within each connected
        component. If you do speckle filtering, set the parameter to a
        positive value, it will be implicitly multiplied by 16. Normally, 1 or
        2 is good enough.
    mode - Set it to StereoSGBM::MODE_HH to run the full-scale two-pass
        dynamic programming algorithm. It will consume
        O(W*H*numDisparities) bytes, which is large for 640x480 stereo and
        huge for HD-size pictures. By default, it is set to false.
    '''
    stereo = cv2.StereoSGBM_create(
        minDisparity=min_disp,
        numDisparities=num_disp,
        blockSize=window_size,
        uniquenessRatio=uniquenessRatio,
        speckleRange=speckleRange,
        speckleWindowSize=speckleWindowSize,
        disp12MaxDiff=disp12MaxDiff,
        P1=P1,
        P2=P2)

    update()
    cv2.waitKey(0)
    cv2.destroyAllWindows()

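2.2 StereoSGBM code walkthrough

1. Import the numpy and cv2 modules.
2. Main code:
Create a StereoSGBM instance, stereo.
3. update function:
Pass the values returned by the trackbars to the stereo instance, then recompute and display the disparity map.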
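
The listing in 2.1 hard-codes P1 = 600 and P2 = 2400, while the documentation quoted inside it suggests 8*channels*blockSize^2 and 32*channels*blockSize^2 as reasonable starting values. A minimal sketch of that rule follows. The helper name suggested_penalties is hypothetical and not part of OpenCV.

```python
def suggested_penalties(block_size, channels=3):
    """Return (P1, P2) following the rule of thumb quoted from the docs.

    Hypothetical helper for illustration; OpenCV does not provide it.
    """
    p1 = 8 * channels * block_size ** 2
    p2 = 32 * channels * block_size ** 2
    return p1, p2


p1, p2 = suggested_penalties(block_size=3, channels=3)
print(p1, p2)  # 216 864
```

With this rule P2 is always four times P1, so the requirement P2 > P1 holds automatically.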

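2.3 Open questions

  1. Why is disp divided by 16?
    See reference 1. For precision, every disparity is scaled up by 16 (2^4) on output: compute() returns 16-bit fixed-point values with 4 fractional bits, so dividing by 16 recovers the disparity in pixels.
  2. Why is the disparity normalized at the end?
    A likely reason is display: cv2.imshow maps floating-point images from [0, 1] to [0, 255], and (disp - min_disp) / num_disp brings the disparity range into [0, 1].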
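
Both questions can be checked with plain arithmetic. The sketch below uses a small hand-made array in place of the real output of stereo.compute(). The raw values are assumptions chosen for illustration; min_disp and num_disp match the listing in 2.1.

```python
import numpy as np

min_disp = 16
num_disp = 192 - min_disp  # 176

# Assumed stand-in for stereo.compute(): int16 values equal to disparity * 16
raw = np.array([[256, 1000],
                [1664, 3072]], dtype=np.int16)

# Undo the fixed-point scaling to get disparities in pixels
disp = raw.astype(np.float32) / 16.0

# Shift and scale so min_disp maps to 0.0 and min_disp + num_disp maps to 1.0
normalized = (disp - min_disp) / num_disp

print(disp)
print(normalized)
```

The raw value 1000 becomes 62.5 pixels, which shows the sub-pixel precision that the factor of 16 preserves.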

3. GrabCut code

import numpy as np
import cv2
from matplotlib import pyplot as plt

# Read the image and create a mask with the same height and width
img = cv2.imread('statue_small.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
print(img.shape)

# Create the background and foreground models
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)

# Give the region to segment as (x, y, width, height) and run GrabCut
rect = (100, 50, 421, 378)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 10, cv2.GC_INIT_WITH_RECT)

# Read the mask: background (0) and probable background (2) become 0, the rest 1
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
# Add a third axis to the mask and multiply it with img to get the result
result = img * mask2[:, :, np.newaxis]

# Show the result (OpenCV loads images as BGR, matplotlib expects RGB)
plt.subplot(121), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.title("original"), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
plt.title("grabcut"), plt.xticks([]), plt.yticks([])
plt.show()
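
The mask step is easier to follow on a tiny example. cv2.grabCut writes one of four class values into the mask: 0 (cv2.GC_BGD, sure background), 1 (cv2.GC_FGD, sure foreground), 2 (cv2.GC_PR_BGD, probable background) and 3 (cv2.GC_PR_FGD, probable foreground). The sketch below applies the same np.where expression to a hand-made 3x3 mask and a uniform gray image. Both arrays are assumptions for illustration.

```python
import numpy as np

# Class values that cv2.grabCut writes into the mask
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Assumed GrabCut output for a 3x3 image
mask = np.array([[0, 2, 3],
                 [2, 3, 1],
                 [0, 3, 3]], dtype=np.uint8)

# Sure and probable background become 0, sure and probable foreground become 1
mask2 = np.where((mask == GC_PR_BGD) | (mask == GC_BGD), 0, 1).astype('uint8')

# A uniform gray 3-channel image stands in for the photo
img = np.full((3, 3, 3), 200, dtype=np.uint8)

# The new axis lets the 2-D mask broadcast over the three color channels
cut = img * mask2[:, :, np.newaxis]

print(mask2)
print(cut[:, :, 0])
```

Background pixels end up black because they are multiplied by 0, and foreground pixels keep their original value.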


4. Watershed algorithm

4.1 Watershed code


import numpy as np
import cv2
from matplotlib import pyplot as plt

# Read the image and convert it to grayscale
img = cv2.imread('basil.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Inverse binary threshold with Otsu: above the threshold becomes black, the rest white
_, ret = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Remove noise
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(ret, cv2.MORPH_OPEN, kernel, iterations=2)

# Sure background area
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# Sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, 0)

# Unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# Show the different regions
plt.subplot(2, 2, 1), plt.title('ret')
plt.imshow(ret, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 2), plt.title('sure_bg')
plt.imshow(sure_bg, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 3), plt.title('sure_fg')
plt.imshow(sure_fg, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 4), plt.title('unknown')
plt.imshow(unknown, cmap='gray'), plt.axis('off')
plt.show()

# Marker labelling: label the regions and get their count (including the background)
num, markers = cv2.connectedComponents(sure_fg)
print(num)

# Multiply the labels by 30 so they are easier to see
markers2 = (markers * 30).astype(np.uint8)

# Add one to all labels so that sure background is not 0, but 1
markers = markers + 1

# Now, mark the region of unknown with zero
markers[unknown == 255] = 0
markers3 = (markers * 30).astype(np.uint8)

# Run the marker-based watershed segmentation
markers = cv2.watershed(img, markers)
# Casting to uint8 turns the -1 boundary label into 255
markers4 = markers.astype(np.uint8)

# Compare the markers at the different stages
plt.subplot(2, 2, 1)
plt.imshow(markers2, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 2)
plt.imshow(markers3, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 3)
plt.imshow(markers4, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 4)
plt.imshow(markers, cmap='gray'), plt.axis('off')
plt.show()

# Boundaries between regions are -1; draw them in red (img is BGR)
img[markers == -1] = [0, 0, 255]

# Show the image (convert BGR to RGB for matplotlib)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), plt.axis('off')
plt.show()

4.2 Selected code explained


4.2.1 Thresholding the grayscale image gray
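
The watershed code thresholds gray with cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU: Otsu's method picks the threshold, and the inverse binary rule turns pixels above it black and the rest white. The sketch below re-implements both ideas in NumPy to show what happens. It is not OpenCV's implementation, and the function names otsu_threshold and threshold_binary_inv are hypothetical.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the t that maximizes the between-class variance.

    Class 0 holds pixels <= t and class 1 holds pixels > t. Illustrative
    sketch for uint8 images; ties keep the smallest t.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    total = hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = hist[:t + 1].sum()   # number of pixels in class 0
        w1 = total - w0           # number of pixels in class 1
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (hist[:t + 1] * levels[:t + 1]).sum() / w0
        mu1 = (hist[t + 1:] * levels[t + 1:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def threshold_binary_inv(gray, t, maxval=255):
    """Inverse binary rule: pixels above t become 0, the rest maxval."""
    return np.where(gray > t, 0, maxval).astype(np.uint8)


# Two dark and two bright pixels stand in for the grayscale photo
gray = np.array([[10, 12],
                 [200, 210]], dtype=np.uint8)
t = otsu_threshold(gray)
binary = threshold_binary_inv(gray, t)
print(t)
print(binary)
```

On this toy input the dark pixels come out white (255) and the bright ones black (0), matching the comment in the watershed code.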



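4.2.2 Creating the markers and printing the number of regions (including the background)

Create a marker array and label regions inside it: the regions we are sure about (foreground or background) are labelled with distinct positive integers, and the regions we are not sure about stay 0. cv2.connectedComponents() does this for us. It labels the image background 0 and the other objects with integers starting from 1 (1, 2, 3, ...).

# Marker labelling: label the regions and get their count (including the background)
num, markers = cv2.connectedComponents(sure_fg)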
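
To make the labelling concrete, here is a small breadth-first flood fill that produces the same kind of output: a region count that includes the background, and an int32 label image. It is a sketch for illustration, not OpenCV's algorithm. The name label_components is hypothetical, 8-connectivity is assumed (the default of cv2.connectedComponents), and the label order may differ from OpenCV's.

```python
from collections import deque

import numpy as np


def label_components(binary):
    """Label 8-connected non-zero regions; returns (count incl. background, labels)."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=np.int32)
    next_label = 1
    for y in range(h):
        for x in range(w):
            if binary[y, x] == 0 or labels[y, x] != 0:
                continue  # background pixel, or already labelled
            labels[y, x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # Visit the 8 neighbours (the centre is skipped: already labelled)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] != 0
                                and labels[ny, nx] == 0):
                            labels[ny, nx] = next_label
                            queue.append((ny, nx))
            next_label += 1
    return next_label, labels


# Hand-made stand-in for sure_fg with three separate blobs
sure_fg = np.array([[255, 255, 0, 0, 0],
                    [255, 0, 0, 0, 255],
                    [0, 0, 0, 0, 255],
                    [0, 255, 0, 0, 0]], dtype=np.uint8)
num, markers = label_components(sure_fg)
print(num)
print(markers)
```

Three blobs plus the background give num = 4, which is why the printed count in the watershed code is one more than the number of objects.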


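4.2.3 Setting up the barriers



If the background label stayed 0, the watershed algorithm would treat the background as unknown region. So every label is first incremented by 1, and only then is the truly unknown region set to 0.

markers = markers + 1
markers[unknown == 255] = 0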
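
The effect of these two lines can be traced on a small array. In the sketch below, markers and unknown are hand-made stand-ins (assumptions for illustration) for the output of cv2.connectedComponents and for the unknown region.

```python
import numpy as np

# Assumed labels: 0 = background, 1 and 2 = two objects
markers = np.array([[0, 0, 0, 0],
                    [0, 1, 0, 2],
                    [0, 0, 0, 0]], dtype=np.int32)

# Assumed unknown region: 255 where we cannot tell foreground from background
unknown = np.array([[0, 255, 0, 255],
                    [255, 0, 255, 0],
                    [0, 255, 0, 255]], dtype=np.uint8)

markers = markers + 1          # background becomes 1, objects become 2 and 3
markers[unknown == 255] = 0    # only the unknown pixels are 0 now

print(markers)
```

After this step 0 means "to be decided by the watershed", 1 means sure background, and values from 2 upward are the individual objects.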

4.2.4 Flooding and finding the barriers
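
cv2.watershed floods outward from the labelled markers and writes -1 into the marker array wherever two regions meet. Those -1 pixels are the barriers that the watershed code paints onto the image. The sketch below shows only the painting step on a hand-made marker array (an assumption standing in for real watershed output). Red is written as [0, 0, 255] because OpenCV images use BGR channel order.

```python
import numpy as np

# Assumed watershed output: regions 1 and 2 separated by a -1 barrier
markers = np.array([[1, 1, -1, 2, 2],
                    [1, 1, -1, 2, 2],
                    [1, -1, 2, 2, 2]], dtype=np.int32)

# A black 3-channel image stands in for the photo
img = np.zeros((3, 5, 3), dtype=np.uint8)

# Paint every barrier pixel red (BGR order)
img[markers == -1] = [0, 0, 255]

boundary_pixels = int((markers == -1).sum())
print(boundary_pixels)
print(img[:, :, 2])
```

Only the barrier pixels change, so the rest of the image is untouched.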


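5. Conclusion

This post is a set of notes on Chapter 4, "Depth Estimation and Segmentation", of 《OpenCV 3 计算机视觉 Python语言实现》, written after sorting out my own understanding of the material.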




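  1. Stereo matching and disparity computation
  2. OpenCV3-Python depth estimation: image-based
  3. cv::StereoSGBM Class Reference
    The definition of StereoSGBM in the OpenCV documentation
  4. The convertTo function in OpenCV
  5. New features in OpenCV (EmguCV) 2.1: image segmentation with GrabCut (GrabCut Of OpenCV 2.1)
  6. Structural Analysis and Shape Descriptors
    The definition of connectedComponents in the OpenCV documentation
  7. OpenCV: the watershed algorithm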
