If you’ve been paying attention to my Twitter account lately, you’ve probably noticed one or two teasers of what I’ve been working on — a Python framework/package to rapidly construct object detectors using Histogram of Oriented Gradients and Linear Support Vector Machines.

Honestly, I really can’t stand using the Haar cascade classifiers provided by OpenCV (i.e. the Viola-Jones detectors) — which is why I’m working on my own suite of classifiers. While cascade methods are extremely fast, they leave much to be desired. If you’ve ever used OpenCV to detect faces you’ll know exactly what I’m talking about.

Figure 1: Example of falsely detecting a face in an image. This is a common problem when using cv2.detectMultiScale.

In order to detect faces/humans/objects/whatever in OpenCV (and remove the false positives), you’ll spend a lot of time tuning the cv2.detectMultiScale parameters. And there is no guarantee that the exact same parameters will work from image to image. This makes batch-processing large datasets for face detection a tedious task, since you’ll be constantly worried about either (1) falsely detecting faces or (2) missing faces entirely, simply due to poor parameter choices on a per-image basis.
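
To make that concrete, here is a minimal sketch of the kind of parameter fiddling I’m talking about. The cascade file, input image, and parameter values below are placeholders, and the “right” values tend to change from image to image, which is exactly the problem:

```python
# Minimal sketch of tuning cv2.detectMultiScale. The cascade path, image
# path, and parameter values are placeholders, not recommendations.
import cv2

image = cv2.imread("example.jpg")                 # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

# Small changes to scaleFactor and minNeighbors can flip the result between
# a false positive and a missed face on the very same image.
rects = detector.detectMultiScale(
    gray,
    scaleFactor=1.1,     # how much the image shrinks at each pyramid level
    minNeighbors=5,      # how many neighboring detections are required
    minSize=(30, 30))    # smallest face we are willing to accept

for (x, y, w, h) in rects:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
```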

There is also the problem that the Viola-Jones detectors are nearing 15 years old. If this detector were a nice bottle of Cabernet Sauvignon I might be pretty stoked right now. But the field has advanced substantially since then.  Back in 2001 the Viola-Jones detectors were state-of-the-art and they were certainly a huge motivating force behind the incredible new advances we have in object detection today.

Now, the Viola-Jones detector isn’t our only choice for object detection. We have object detection using keypoints, local invariant descriptors, and bag-of-visual-words models. We have Histogram of Oriented Gradients. We have deformable parts models. Exemplar models. And we are now utilizing Deep Learning with pyramids to recognize objects at different scales!

All that said, even though the Histogram of Oriented Gradients descriptor for object recognition is nearly a decade old, it is still heavily used today — and with fantastic results. In their seminal 2005 paper, Histograms of Oriented Gradients for Human Detection, Dalal and Triggs demonstrated that the Histogram of Oriented Gradients (HOG) image descriptor combined with a Linear Support Vector Machine (SVM) could be used to train highly accurate object classifiers — or, in their particular study, human detectors.

Histogram of Oriented Gradients and Object Detection

I’m not going to review the entire detailed process of training an object detector using Histogram of Oriented Gradients (yet), simply because each step can be fairly involved. But I wanted to take a minute and outline the general algorithm for training an object detector using Histogram of Oriented Gradients. It goes a little something like this:

Step 1:

Sample P positive samples from your training data of the object(s) you want to detect and extract HOG descriptors from these samples.
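
In Python, a natural way to prototype this step is with the hog function from scikit-image. The sketch below assumes a directory of pre-cropped positive samples and uses the usual Dalal-Triggs style parameters (9 orientation bins, 8x8 pixel cells, 2x2 cell blocks, a 64x128 window); treat the paths and parameter values as placeholders to tune for your own objects:

```python
# Sketch of Step 1: extract HOG descriptors from the positive samples.
from glob import glob
import cv2
from skimage.feature import hog

def describe(gray_window):
    # 9 orientation bins, 8x8 pixel cells, 2x2 cell blocks -- the usual
    # Dalal-Triggs style configuration; tune these for your own objects.
    return hog(gray_window, orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

positive_features = []
for path in glob("positives/*.png"):            # hypothetical directory layout
    window = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    window = cv2.resize(window, (64, 128))      # fixed detection window size
    positive_features.append(describe(window))
```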

Step 2:

Sample N negative samples from a negative training set that does not contain any of the objects you want to detect and extract HOG descriptors from these samples as well. In practice N >> P.
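
For the negatives, one simple approach is to sample random windows from images known not to contain the object and describe them with the exact same HOG parameters. A rough sketch, reusing the hypothetical describe helper from the previous snippet and assuming every negative image is larger than the detection window:

```python
# Sketch of Step 2: sample random patches from object-free images.
from glob import glob
import cv2
import numpy as np

WIN_W, WIN_H = 64, 128          # must match the positive window size
PATCHES_PER_IMAGE = 10          # in practice N >> P, so sample generously

negative_features = []
for path in glob("negatives/*.png"):            # hypothetical directory layout
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    h, w = gray.shape
    for _ in range(PATCHES_PER_IMAGE):
        x = np.random.randint(0, w - WIN_W)     # assumes w > WIN_W, h > WIN_H
        y = np.random.randint(0, h - WIN_H)
        patch = gray[y:y + WIN_H, x:x + WIN_W]
        negative_features.append(describe(patch))   # describe() from Step 1
```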

Step 3:

Train a Linear Support Vector Machine on your positive and negative samples.
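
With the two feature sets in hand, training the Linear SVM is only a few lines with scikit-learn. The C value below is a placeholder you would cross-validate. Note that LinearSVC exposes a decision_function rather than true probabilities, which works fine as the confidence score used in the later steps:

```python
# Sketch of Step 3: train a Linear SVM on positive and negative HOG features.
import numpy as np
from sklearn.svm import LinearSVC

X = np.vstack(positive_features + negative_features)
y = np.hstack([np.ones(len(positive_features)),
               np.zeros(len(negative_features))])

model = LinearSVC(C=0.01)       # placeholder C; cross-validate in practice
model.fit(X, y)
```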

Step 4:

Figure 2: Example of the sliding window approach, where we slide a window from left-to-right and top-to-bottom. Note: Only a single scale is shown. In practice this window would be applied to multiple scales of the image.

Apply hard-negative mining. For each image and each possible scale of each image in your negative training set, apply the sliding window technique and slide your window across the image. At each window compute your HOG descriptors and apply your classifier. If your classifier (incorrectly) classifies a given window as an object (and it will, there will absolutely be false-positives), record the feature vector associated with the false-positive patch along with the probability of the classification. This approach is called hard-negative mining.
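
Here is a rough sketch of what that loop might look like, reusing the hypothetical describe helper and trained model from the earlier snippets. The window step size and pyramid scale factor are arbitrary placeholders:

```python
# Sketch of Step 4: sliding window + hard-negative mining over an image pyramid.
from glob import glob
import cv2

def sliding_windows(gray, step=8, win=(64, 128)):
    # Yield (x, y, window) left-to-right, top-to-bottom.
    for y in range(0, gray.shape[0] - win[1] + 1, step):
        for x in range(0, gray.shape[1] - win[0] + 1, step):
            yield x, y, gray[y:y + win[1], x:x + win[0]]

hard_negatives = []
for path in glob("negatives/*.png"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Crude image pyramid: repeatedly shrink by a fixed factor.
    while gray.shape[0] >= 128 and gray.shape[1] >= 64:
        for x, y, window in sliding_windows(gray):
            features = describe(window)                    # describe() from Step 1
            score = model.decision_function([features])[0]
            if score > 0:   # a false positive: no object exists in these images
                hard_negatives.append((score, features))
        gray = cv2.resize(gray, (int(gray.shape[1] / 1.5),
                                 int(gray.shape[0] / 1.5)))
```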

Step 5:

Take the false-positive samples found during the hard-negative mining stage, sort them by their confidence (i.e. probability), and re-train your classifier using these hard-negative samples. (Note: You can iteratively apply Steps 4-5, but in practice one stage of hard-negative mining usually [though not always] tends to be enough. The gains in accuracy on subsequent runs of hard-negative mining tend to be minimal.)
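
Continuing the sketch, the re-training step simply folds the collected hard negatives back into the training set:

```python
# Sketch of Step 5: sort the hard negatives by confidence and re-train.
import numpy as np

hard_negatives.sort(key=lambda pair: pair[0], reverse=True)
hard_features = [features for _, features in hard_negatives]

X = np.vstack(positive_features + negative_features + hard_features)
y = np.hstack([np.ones(len(positive_features)),
               np.zeros(len(negative_features) + len(hard_features))])
model.fit(X, y)
```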

Step 6:

Your classifier is now trained and can be applied to your test dataset. Again, just like in Step 4, for each image in your test set, and for each scale of the image, apply the sliding window technique. At each window extract HOG descriptors and apply your classifier. If your classifier detects an object with sufficiently large probability, record the bounding box of the window. After you have finished scanning the image, apply non-maximum suppression to remove redundant and overlapping bounding boxes.
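
A sketch of that detection loop, again reusing the hypothetical helpers from the earlier snippets. The confidence threshold and pyramid scale are placeholders, and the boxes collected here are exactly what we will feed into non-maximum suppression:

```python
# Sketch of Step 6: run the trained classifier over each scale of a test image.
import cv2

THRESH = 0.5                    # placeholder confidence threshold
boxes, scores = [], []

gray = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical test image
scale = 1.0
while gray.shape[0] >= 128 and gray.shape[1] >= 64:
    for x, y, window in sliding_windows(gray):        # from the Step 4 sketch
        score = model.decision_function([describe(window)])[0]
        if score > THRESH:
            # Map the box back into the coordinates of the original image.
            boxes.append([int(x * scale), int(y * scale),
                          int((x + 64) * scale), int((y + 128) * scale)])
            scores.append(score)
    gray = cv2.resize(gray, (int(gray.shape[1] / 1.5),
                             int(gray.shape[0] / 1.5)))
    scale *= 1.5

# boxes now holds overlapping detections; apply non-maximum suppression next.
```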

These are the bare minimum steps required, but by using this 6-step process you can train and build object detection classifiers of your own! Extensions to this approach include a deformable parts model and Exemplar SVMs, where you train a classifier for each positive instance rather than a collection of them.

However, if you’ve ever worked with object detection in images you’ve likely run into the problem of detecting multiple bounding boxes around the object you want to detect in the image.

Here’s an example of this overlapping bounding box problem:

Figure 3: (Left) Detecting multiple overlapping bounding boxes around the face we want to detect. (Right) Applying non-maximum suppression to remove the redundant bounding boxes.

Notice on the left we have 6 overlapping bounding boxes that have correctly detected Audrey Hepburn’s face. However, these 6 bounding boxes all refer to the same face — we need a method to suppress the 5 smallest bounding boxes in the region, keeping only the largest one, as seen on the right.

This is a common problem, no matter if you are using the Viola-Jones based method or following the Dalal-Triggs paper.

There are multiple ways to remedy this problem. Triggs et al. suggest using the Mean-Shift algorithm to detect multiple modes in the bounding box space by utilizing the (x, y) coordinates of the bounding box as well as the logarithm of the current scale of the image.

I’ve personally tried this method and wasn’t satisfied with the results. Instead, you’re much better off relying on a strong classifier with higher accuracy (meaning there are very few false positives) and then applying non-maximum suppression to the bounding boxes.
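
To give a rough idea of what non-maximum suppression does, here is a minimal greedy sketch: keep the highest-scoring box, discard anything that overlaps it too heavily, and repeat. This is not the implementation I’ll be covering in the upcoming posts, and the overlap criterion and 0.3 threshold are arbitrary placeholder choices:

```python
# Minimal greedy non-maximum suppression sketch (placeholder overlap criterion).
import numpy as np

def nms_sketch(boxes, scores, overlap_thresh=0.3):
    # boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidences.
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the kept box with every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                (boxes[order[1:], 3] - boxes[order[1:], 1]))
        overlap = inter / area
        # Drop boxes that overlap the kept box too much.
        order = order[1:][overlap <= overlap_thresh]
    return boxes[keep].astype(int)
```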

I spent some time looking for a good non-maximum suppression (sometimes called non-maxima suppression) implementation in Python. When I couldn’t find one, I chatted with my friend Dr. Tomasz Malisiewicz, who has spent his entire career working with object detector algorithms and the HOG descriptor. There is literally no one that I know who has more experience in this area than Tomasz. And if you’ve ever read any of his papers, you’ll know why. His work is fantastic.

Anyway, after chatting with him, he pointed me to two MATLAB implementations. The first is based on the work by Felzenszwalb et al. and their deformable parts model.

The second method is implemented by Tomasz himself for his Exemplar SVM project which he used for his dissertation and his ICCV 2011 paper, Ensemble of Exemplar-SVMs for Object Detection and Beyond. It’s important to note that Tomasz’s method is over 100x faster than the Felzenszwalb et al. method. And when you’re executing your non-maximum suppression function millions of times, that 100x speedup really matters.

I’ve implemented both the Felzenszwalb et al. and Tomasz et al. methods, porting them from MATLAB to Python. Next week we’ll start with the Felzenszwalb method, then the following week I’ll cover Tomasz’s method. While Tomasz’s method is substantially faster, I think it’s important to see both implementations so we can understand exactly why his method obtains such drastic speedups.

Be sure to stick around and check out these posts! These are absolutely critical steps to building object detectors of your own!

Summary

In this blog post we had a little bit of a history lesson regarding object detectors. We also had a sneak peek into a Python framework that I am working on for object detection in images.

From there we had a quick review of how the Histogram of Oriented Gradients method is used in conjunction with a Linear SVM to train a robust object detector.

However, no matter what method of object detection you use, you will likely end up with multiple bounding boxes surrounding the object you want to detect. In order to remove these redundant boxes you’ll need to apply Non-Maximum Suppression.

Over the next two weeks I’ll show you two implementations of Non-Maximum Suppression that you can use in your own object detection projects.

Be sure to enter your email address in the form below to receive an announcement when these posts go live! Non-Maximum Suppression is absolutely critical to obtaining an accurate and robust object detection system using HOG, so you definitely don’t want to miss these posts!

from: http://www.pyimagesearch.com/2014/11/10/histogram-oriented-gradients-object-detection/
