An Analysis of the HOG Feature Extraction Algorithm

1. HOG Feature Principles

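The core idea of HOG is that the appearance and shape of a local object can be described by the distribution of intensity gradients or edge directions.

The image is divided into small connected regions called cells. For each cell, a histogram of gradient directions (or edge orientations) is compiled over the pixels in that cell, and the combination of these histograms forms the descriptor of the target being detected. To improve accuracy, the local histograms can be contrast-normalized: a measure of intensity is computed over a larger region of the image, called a block, and that measure is used to normalize every cell in the block. This normalization gives better invariance to changes in illumination and shadowing.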
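
The block normalization described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain L2 norm with a small constant `eps` to avoid division by zero, and the helper name `normalize_block` is hypothetical. OpenCV's default scheme is L2-Hys, which additionally clips the normalized values and renormalizes.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: L2-normalize the concatenated cell histograms of one block.
// eps is an assumed small constant that keeps the division well defined.
std::vector<float> normalize_block(const std::vector<float>& block, float eps = 1e-6f) {
    assert(!block.empty());
    float sum_sq = 0.0f;
    for (float v : block) sum_sq += v * v;           // squared L2 norm of the block
    const float norm = std::sqrt(sum_sq + eps * eps);
    std::vector<float> out;
    out.reserve(block.size());
    for (float v : block) out.push_back(v / norm);   // every cell value shares one norm
    return out;
}
```

Multiplying every input value by the same factor (a uniform change in contrast) leaves the output essentially unchanged, which is where the illumination invariance comes from.
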
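Properties of HOG features:
(1). Because HOG operates on local cells, it is largely invariant to small geometric and photometric transformations, which makes it particularly suitable for human detection.
(2). HOG itself has no rotation or scale invariance, which keeps the computation cheap.
How, then, is HOG applied to pedestrian detection?
To handle scale invariance: the image is rescaled to several different sizes, which is equivalent to rescaling the template.
To handle rotation invariance: templates are built for different orientations (typically 15*7) and matched against the image.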
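
The multi-scale search can be sketched as an image pyramid: the image is shrunk step by step while the detection window stays fixed. The helper name `pyramid_sizes` and its `step` parameter are hypothetical, and the stopping rule (stop once the image is smaller than the window) is an assumption made for this sketch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: list the (width, height) of each pyramid level.
// The image shrinks by `step` per level; the window size never changes.
std::vector<std::pair<int, int>> pyramid_sizes(int img_w, int img_h,
                                               int win_w, int win_h, double step) {
    assert(step > 1.0 && win_w > 0 && win_h > 0);
    std::vector<std::pair<int, int>> levels;
    double factor = 1.0;
    for (;;) {
        const int w = static_cast<int>(img_w / factor);
        const int h = static_cast<int>(img_h / factor);
        if (w < win_w || h < win_h) break;  // the window no longer fits
        levels.emplace_back(w, h);
        factor *= step;
    }
    return levels;
}
```

Running a fixed 64*128 window over every level is equivalent to running windows of growing size over the original image.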

2. How HOG Is Implemented

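For an image, a block of size blockSize scans across a detection window of size winSize. Each block is divided into four cells, and each block position outputs the gradient information of its four cells, which is appended to the feature vector.
The relationship between these three sizes is illustrated in [2]:
1. Window size: winSize
2. Block size: blockSize
3. Cell size: cellSize
In plain terms, HOG feature extraction takes an image and:
1. Converts it to grayscale (treating the image as a three-dimensional surface of x, y, and z, where z is the gray level)
2. Divides it into small cells (2*2 cells form one block)
3. Computes the gradient (magnitude and orientation) of each pixel in each cell
4. Builds a histogram of gradient orientations for each cell (how many pixels fall into each orientation bin, in practice weighted by gradient magnitude), which forms that cell's descriptor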
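
Steps 3 and 4 can be sketched for a single cell as follows. The sketch makes several assumptions that are not taken from the text: a centered [-1, 0, 1] difference, clamping at the image border, unsigned orientations in 0° to 180°, and hard assignment of each pixel's gradient magnitude to one bin. Production implementations typically interpolate votes between neighboring bins. The function name `cell_histogram` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: orientation histogram of one cell of a row-major grayscale image.
// (x0, y0) is the top-left pixel of the cell.
std::vector<float> cell_histogram(const std::vector<float>& gray, int width, int height,
                                  int x0, int y0, int cell = 8, int nbins = 9) {
    assert(static_cast<int>(gray.size()) == width * height);
    assert(x0 >= 0 && y0 >= 0 && x0 + cell <= width && y0 + cell <= height);
    const float pi = 3.14159265358979f;
    std::vector<float> hist(nbins, 0.0f);
    // Pixel access with coordinates clamped to the image border (an assumption).
    auto at = [&](int x, int y) {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return gray[y * width + x];
    };
    for (int y = y0; y < y0 + cell; ++y) {
        for (int x = x0; x < x0 + cell; ++x) {
            const float gx = at(x + 1, y) - at(x - 1, y);
            const float gy = at(x, y + 1) - at(x, y - 1);
            const float mag = std::sqrt(gx * gx + gy * gy);
            float angle = std::atan2(gy, gx);       // in [-pi, pi]
            if (angle < 0.0f) angle += pi;          // fold to unsigned orientation
            int bin = static_cast<int>(angle / pi * nbins);
            if (bin >= nbins) bin = 0;              // 180 degrees is the same as 0 degrees
            hist[bin] += mag;                       // vote weighted by magnitude
        }
    }
    return hist;
}
```

For an image whose brightness increases only from left to right, all votes land in the first bin (around 0°); for one that increases only from top to bottom, they land in the middle bin (around 90°).
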
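A brief note on some of the parameters that OpenCV's HOG source code uses by default:
    Detection window size: 128*64 (height*width);
    Block size: 16*16;
    Cell size: 8*8;
    Block stride within the detection window: 8*8;
    The gradient histogram of one cell has 9 bins;
    Stride of the sliding window across the image being searched: 8*8.
In the code, one HOG descriptor corresponds to one detection window. A detection window contains 105 = ((128-16)/8+1)*((64-16)/8+1) blocks, each block contains 4 cells, and each cell contributes a histogram vector of length 9. The HOG vector of one detection window therefore has 105*4*9 = 3780 dimensions. Each block position outputs the gradient information of its four cells and appends it to the feature vector, until the scan is complete and the final 3780-dimensional vector is output.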
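
The arithmetic above generalizes to other window sizes. The following sketch recomputes the descriptor length from the layout parameters; the struct `HogLayout` and both function names are hypothetical, and square blocks, cells, and strides are assumed for brevity.

```cpp
#include <cassert>

// Hypothetical container for the layout parameters (all sizes in pixels).
struct HogLayout {
    int win_w, win_h;  // detection window
    int block;         // block side length
    int stride;        // block stride
    int cell;          // cell side length
    int nbins;         // histogram bins per cell
};

// Number of block positions along one axis of the window.
int blocks_along(int win, int block, int stride) {
    assert(win >= block && stride > 0 && (win - block) % stride == 0);
    return (win - block) / stride + 1;
}

// Total descriptor length: blocks * cells per block * bins per cell.
int descriptor_length(const HogLayout& p) {
    assert(p.cell > 0 && p.block % p.cell == 0);
    const int cells_per_block = (p.block / p.cell) * (p.block / p.cell);
    return blocks_along(p.win_w, p.block, p.stride) *
           blocks_along(p.win_h, p.block, p.stride) *
           cells_per_block * p.nbins;
}
```

With a 64*128 (width*height) window this gives 7*15*4*9 = 3780, and with the 80*128 window used in the visualization example it gives 9*15*4*9 = 4860.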

3. Understanding HOG Through Visualization [3]

// HOG visualization code
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>
#include <vector>

using namespace std;
using namespace cv;

// HOGDescriptor visualizer
// adapted for arbitrary size of feature sets and training images
Mat get_hogdescriptor_visual_image(Mat& origImg,
                                   vector<float>& descriptorValues, // HOG feature vector
                                   Size winSize,                    // detection window size
                                   Size cellSize,
                                   int scaleFactor,                 // scale factor for the background image
                                   double viz_factor)               // scale factor for the length of the HOG lines
{
    Mat visual_image; // the final visualization image
    resize(origImg, visual_image, Size(origImg.cols*scaleFactor, origImg.rows*scaleFactor));

    int gradientBinSize = 9;
    // dividing 180° into 9 bins, how large (in rad) is one bin?
    float radRangeForOneBin = (float)CV_PI / (float)gradientBinSize; // pi corresponds to 180°

    // prepare data structure: 9 orientation / gradient strengths for each cell
    int cells_in_x_dir = winSize.width / cellSize.width;   // number of cells in x direction
    int cells_in_y_dir = winSize.height / cellSize.height; // number of cells in y direction
    int totalnrofcells = cells_in_x_dir * cells_in_y_dir;  // total number of cells

    // Layout: gradientStrengths[cells_in_y_dir][cells_in_x_dir][9]
    // gradientStrengths accumulates the gradient information of each cell;
    // cellUpdateCounter records how many times each cell was accumulated.
    float*** gradientStrengths = new float**[cells_in_y_dir];
    int** cellUpdateCounter   = new int*[cells_in_y_dir];
    for (int y=0; y<cells_in_y_dir; y++)
    {
        gradientStrengths[y] = new float*[cells_in_x_dir];
        cellUpdateCounter[y] = new int[cells_in_x_dir];
        for (int x=0; x<cells_in_x_dir; x++)
        {
            // 9 bins per cell
            gradientStrengths[y][x] = new float[gradientBinSize];
            cellUpdateCounter[y][x] = 0;

            for (int bin=0; bin<gradientBinSize; bin++)
                gradientStrengths[y][x][bin] = 0.0; // initialize all 9 bin strengths of the cell to 0
        }
    }

    // nr of blocks = nr of cells - 1
    // since there is a new block on each cell (overlapping blocks!) but the last one
    // This corresponds to blockStride = (8,8): blocks slide one cell at a time,
    // and a block is twice the cell size in both directions.
    int blocks_in_x_dir = cells_in_x_dir - 1;
    int blocks_in_y_dir = cells_in_y_dir - 1;

    // compute gradient strengths per cell
    int descriptorDataIdx = 0;

    for (int blockx=0; blockx<blocks_in_x_dir; blockx++)
    {
        for (int blocky=0; blocky<blocks_in_y_dir; blocky++)
        {
            // 4 cells per block ...
            for (int cellNr=0; cellNr<4; cellNr++)
            {
                // compute corresponding cell nr
                // cells are numbered 0,1,2,3; cell 0 has the block's own coordinates,
                // the others are offset by one cell accordingly
                int cellx = blockx;
                int celly = blocky;
                if (cellNr==1) celly++;
                if (cellNr==2) cellx++;
                if (cellNr==3)
                {
                    cellx++;
                    celly++;
                }

                for (int bin=0; bin<gradientBinSize; bin++)
                {
                    float gradientStrength = descriptorValues[ descriptorDataIdx ];
                    descriptorDataIdx++;

                    // Every block position contributes new values for its cells.
                    // The values a cell receives from all overlapping blocks are
                    // summed here and averaged later for display.
                    gradientStrengths[celly][cellx][bin] += gradientStrength; // [row][col], C arrays are row-major
                } // for (all bins)

                // note: overlapping blocks lead to multiple updates of this sum!
                // we therefore keep track how often a cell was updated,
                // to compute average gradient strengths
                cellUpdateCounter[celly][cellx]++;
            } // for (all cells)
        } // for (all block y pos)
    } // for (all block x pos)

    // compute average gradient strengths
    for (int celly=0; celly<cells_in_y_dir; celly++)
    {
        for (int cellx=0; cellx<cells_in_x_dir; cellx++)
        {
            float NrUpdatesForThisCell = (float)cellUpdateCounter[celly][cellx];

            // compute average gradient strengths for each gradient bin direction
            for (int bin=0; bin<gradientBinSize; bin++)
            {
                gradientStrengths[celly][cellx][bin] /= NrUpdatesForThisCell;
            }
        }
    }

    cout << "winSize = " << winSize << endl;
    cout << "cellSize = " << cellSize << endl;
    cout << "blockSize = " << cellSize*2 << endl;
    cout << "blockNum = " << blocks_in_x_dir << "×" << blocks_in_y_dir << endl;
    cout << "totalnrofcells = " << totalnrofcells << endl;
    cout << "descriptorDataIdx = " << descriptorDataIdx << endl;

    // draw cells
    for (int celly=0; celly<cells_in_y_dir; celly++)
    {
        for (int cellx=0; cellx<cells_in_x_dir; cellx++)
        {
            int drawX = cellx * cellSize.width;
            int drawY = celly * cellSize.height;

            int mx = drawX + cellSize.width/2;
            int my = drawY + cellSize.height/2;

            rectangle(visual_image,
                      Point(drawX*scaleFactor, drawY*scaleFactor),
                      Point((drawX+cellSize.width)*scaleFactor, (drawY+cellSize.height)*scaleFactor),
                      CV_RGB(0,0,0), // color of the cell border
                      1);

            // draw in each cell all 9 gradient strengths
            for (int bin=0; bin<gradientBinSize; bin++)
            {
                float currentGradStrength = gradientStrengths[celly][cellx][bin];

                // no line to draw?
                if (currentGradStrength==0)
                    continue;

                // use the center of each bin: 10°, 30°, ..., 170°
                float currRad = bin * radRangeForOneBin + radRangeForOneBin/2;

                float dirVecX = cos( currRad );
                float dirVecY = sin( currRad );
                float maxVecLen = cellSize.width/2;
                float scale = viz_factor; // just a visualization scale, to see the lines better

                // compute line coordinates
                float x1 = mx - dirVecX * currentGradStrength * maxVecLen * scale;
                float y1 = my - dirVecY * currentGradStrength * maxVecLen * scale;
                float x2 = mx + dirVecX * currentGradStrength * maxVecLen * scale;
                float y2 = my + dirVecY * currentGradStrength * maxVecLen * scale;

                // draw gradient visualization
                line(visual_image,
                     Point(x1*scaleFactor, y1*scaleFactor),
                     Point(x2*scaleFactor, y2*scaleFactor),
                     CV_RGB(255,255,255), // color of the HOG lines
                     1);
            } // for (all bins)
        } // for (cellx)
    } // for (celly)

    // don't forget to free memory allocated by helper data structures!
    for (int y=0; y<cells_in_y_dir; y++)
    {
        for (int x=0; x<cells_in_x_dir; x++)
        {
            delete[] gradientStrengths[y][x];
        }
        delete[] gradientStrengths[y];
        delete[] cellUpdateCounter[y];
    }
    delete[] gradientStrengths;
    delete[] cellUpdateCounter;

    return visual_image; // the final HOG visualization image
}

int main()
{
    HOGDescriptor hog; // uses the default HOG parameters
    /*
    HOGDescriptor(Size win_size=Size(64, 128), Size block_size=Size(16, 16),
                  Size block_stride=Size(8, 8), Size cell_size=Size(8, 8), int nbins=9,
                  double win_sigma=DEFAULT_WIN_SIGMA(DEFAULT_WIN_SIGMA=-1),
                  double threshold_L2hys=0.2, bool gamma_correction=true,
                  int nlevels=DEFAULT_NLEVELS)
    Parameters:
        win_size: Detection window size. Align to block size and block stride.
        block_size: Block size in pixels. Align to cell size. Only (16,16) is supported for now.
        block_stride: Block stride. It must be a multiple of cell size.
        cell_size: Cell size. Only (8, 8) is supported for now.
        nbins: Number of bins. Only 9 bins per cell are supported for now.
        win_sigma: Gaussian smoothing window parameter.
        threshold_L2hys: L2-Hys normalization method shrinkage.
        gamma_correction: Flag to specify whether the gamma correction preprocessing is required or not.
        nlevels: Maximum number of detection window increases.
    */

    // For a 128*80 (height*width) image with block stride 8 there are 15*9 blocks,
    // so the descriptor has 2*2*9*15*9 = 4860 values.
    int width = 80;
    int height = 128;
    hog.winSize = Size(width, height);
    vector<float> des; // HOG feature vector

    Mat src = imread("objimg.jpg");
    if (src.empty())
    {
        cerr << "could not read objimg.jpg" << endl;
        return 1;
    }

    Mat dst;
    resize(src, dst, Size(width, height)); // normalize the image size so that the whole image is exactly one winSize window
    imshow("src", src);

    hog.compute(dst, des); // compute the HOG features

    Mat background = Mat::zeros(Size(width, height), CV_8UC1); // black background, since the HOG features are drawn in white
    Mat d = get_hogdescriptor_visual_image(background, des, hog.winSize, hog.cellSize, 3, 2.5);
    imshow("dst", d);
    imwrite("hogvisualize.jpg", d);
    waitKey();

    return 0;
}
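
The nested loops in the visualization code read the descriptor sequentially, which implies a particular ordering of its values: block column outermost, then block row, then the four cells of the block, then the bins. The sketch below makes that ordering explicit. The helper `descriptor_index` is hypothetical, and the ordering is the one assumed by the listing above rather than something verified here against OpenCV's source.

```cpp
#include <cassert>

// Hypothetical helper: position of one bin value inside the flat descriptor,
// following the loop order blockx -> blocky -> cell_nr -> bin.
int descriptor_index(int blockx, int blocky, int cell_nr, int bin,
                     int blocks_in_y, int cells_per_block = 4, int nbins = 9) {
    assert(blockx >= 0 && blocky >= 0 && blocky < blocks_in_y);
    assert(cell_nr >= 0 && cell_nr < cells_per_block);
    assert(bin >= 0 && bin < nbins);
    return ((blockx * blocks_in_y + blocky) * cells_per_block + cell_nr) * nbins + bin;
}
```

For the default 64*128 window (7 block columns, 15 block rows), moving one block down advances the index by 4*9 = 36, moving one block to the right advances it by 15*36 = 540, and the last value sits at index 3779.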

Figure: original image

Figure: HOG feature image

Translation of the Original Paper: Histograms of Oriented Gradients for Human Detection
