SIFT算法

文章目录

- SIFT算法
一、特征点，关键点，角点？
二、前置知识
- 1.尺度
- 2.卷积
- 3.高斯函数
- 4.高斯卷积（模糊）
三、SIFT算法的引入
- Harris算法缺陷：
- - 1.尺度问题
  - 2.旋转问题
- SIFT引入：
四、SIFT算法步骤
- 1. Scale-space Extrema Detection——尺度空间极值检测
- 2. Keypoint Localization——关键点定位
- 3. Orientation Assignment——确定主方向
- 4. Keypoint Descriptor——生成描述符
- 5. Keypoint Matching——描述符匹配
五、尺度空间极值检测
- 1. 图像初始化
- 2. 计算高斯金字塔层数
- 3. 生成高斯核
- 4. 生成高斯金字塔
- 5. 生成高斯差分金字塔
- 6. 极值检测
- 7.小结
六、关键点定位与舍弃
- 1. 关键点定位
- 2. 关键点舍弃
- - 1）去除对比度低的点
  - 2）边缘效应
- 3. 小结
七、确定主方向
八、生成描述符
- 1. 前置知识
- - 1）线性插值
  - 2）双线性插值
  - 3）三线性插值
  - 4）插值的应用
- 2. 生成描述符
- - 1）旋转至主方向
  - 2）生成描述符
  - 3）限制
- 3.小结
总结
代码
参考资料

一、特征点，关键点，角点？

特征点：图像中具有某些明显特征的点。一个特征可能是一个像素点，也可能一个特征占据了多个像素点
关键点：图像中具有某些明显特征的点，多指一个像素点
角点：同关键点

这里可能与别处表达有些不同，此处只是想区分一下特征点有时并不等于关键点和角点。

二、前置知识

1.尺度

尺度：用来模拟观察者距离物体的远近的程度，可以大致理解为距离物体远近的距离。
尺度空间：尺度空间就是在多个尺度下观察目标，指的是图像的模糊程度（于不同距离观察物体会有模糊之分）

2.卷积

假设图像为灰度图，仅有灰度值，左侧表格为原始图像中各个像素点的灰度值，中间表格为卷积核，右侧表格为卷积过后的图像。卷积即卷积核与原始图像对应位置相乘再相加，得到新的值。例如左上角的3x3与卷积核进行相乘得到数值139，同理其他也如此。

3.高斯函数

即正态分布函数
一维情况

二维情况

分布情况由中心向外侧分散

4.高斯卷积（模糊）

高斯卷积即用高斯核进行图像卷积操作
高斯核

对图像进行高斯卷积后，图像会变得更加平滑。

解释：卷积前的像素点，在卷积时要根据其中心开始周围的几x几的点进行权值求和，故每一个点的特征都与其周围的点的特征相关，也就是说该点的像素值与周围点的像素值有关，从而使得图像中像素点的变化不那么跳动。

三、SIFT算法的引入

我们知道Harris角点检测算法是用来提取图像中的角点，即图像中相对周围变化比较明显的像素点。假设在相同尺度（相同分辨率下），相同旋转方向（两幅图图像未进行旋转）的情况下，我们利用Harris角点检测算法来探测出两幅图像中的所有角点，再进行比对，即可确定两幅图中角点的匹配，从而达到图像匹配的效果。

Harris算法缺陷：

1.尺度问题

但事实上，在进行图像匹配时，会出现两幅图间尺度（分辨率）不匹配的情况，如下图所示：

在对左侧小图像进行探测时，我们采用了小窗口进行Harris角点检测，从而能够提取中其中的角点，但当我们对右侧放大过后（分辨率不同）的图像进行同样窗口的Harris角点检测时，会发现其不能成功提取到角点。Harris在进行角点检测时，其窗口大小一定的固定的，若根据分辨率进行变动，太过于繁琐。

2.旋转问题

若对两幅图像其中之一进行旋转，则其方向已发生改变，根本无法进行对比，也过于繁琐。

SIFT引入：

因此，2004 年，不列颠哥伦比亚大学的 D.Lowe 在论文Distinctive Image Features from Scale-Invariant Keypoints中提出了一种新算法，即SIFT (Scale-Invariant Feature Transform) ，算法提取关键点并计算其描述符，通过对比描述符来实现图像的匹配，并且具有尺度不变性与旋转不变性。

四、SIFT算法步骤

1. Scale-space Extrema Detection——尺度空间极值检测

2. Keypoint Localization——关键点定位

3. Orientation Assignment——确定主方向

4. Keypoint Descriptor——生成描述符

5. Keypoint Matching——描述符匹配

五、尺度空间极值检测

1. 图像初始化

将图像进行上采样（分辨率扩大一倍），进行一次高斯模糊形成初始图像，作为后续处理的第一幅图

2. 计算高斯金字塔层数

根据图像的分辨率进行计算每一组该生成多少层图像，从而形成金字塔。
将图像的长和宽分别记作m和n，故此图像的分辨率为m*n，则该金字塔的层数N为：

为什么－2的原因后续再谈

3. 生成高斯核

由于要形成不同尺度以及不同模糊程度的图像，所以我们需要不同的高斯核。

可以看出从初始的sigma到后续的s*sigma等，从而形成不同层不同模糊程度的图像。

4. 生成高斯金字塔

一个图像的尺度空间L(x,y,σ)，定义为原始图像I(x,y)与一个可变尺度的2维高斯函数G(x,y,σ)卷积运算，即：

σ是尺度空间因子，它决定了图像的模糊的程度

中间带有色彩的图像为原始图像，进行图像初始化后，对其进行高斯模糊。
由图所示，每一行代表每一组，其中每一列代表一层，从左到右每一层图像模糊程度递增，选取此组倒数第三层图像下采样后的图像，作为下一组的初始图像，在进行高斯模糊，重复此操作，从而生成不同尺度，不同模糊程度的高斯金字塔

5. 生成高斯差分金字塔

本应该使用LOG算子进行高斯金字塔的后续处理，但其处理复杂，DOG算子可以进行替代，在公式推导过后，DOG与LOG算子是非常近似的，推导见大神写的此文https://zhuanlan.zhihu.com/p/442484545

高斯差分金字塔的形成即高斯金字塔的不同层进行作差操作。

注意：因为进行了作差操作，所以高斯差分金字塔的层数相比高斯金字塔少1层

6. 极值检测

SIFT算法中初步判断该像素点是否为关键点的方法，即将此点与其所在层和上下两层的3x3区域的26点进行对比，若此点像素值大于或小于其他点，则认定此点为关键点。通过遍历不同层与不同位置，从而初步找到所有的关键点。

提取后形成如下带有关键点信息的图像

注意：每一个关键点的比较必须与当前层和上下两层进行比较，所以第一层与最后一层无法拥有关键点，故此后形成的带有初始关键点信息的图像的高斯差分金字塔，会比之前少2层。所以综上，高斯金字塔必须至少拥有4层，从而至少3层高斯差分金字塔，进而在提取关键点时，可以在中间一层图像提取关键点吗信息。（这里有可能有错误，但意思是这个意思）

7.小结

根据以上步骤，我们已经做到在不同尺度下（分辨率下）提取关键点，故此算法拥有了尺度不变形
以下是我忘了出处的两幅生成高斯金字塔的图，被我用到了PPT中（嘻嘻）。再看以下的图，尺度空间极值检测这个过程就比较好理解了。

六、关键点定位与舍弃

1. 关键点定位

此前我们提取到了图像中的关键点，即知道关键点所在像素区域的坐标–像素坐标，但由于像素坐标都是离散的，例如（0,0），（1,1），我们想知道关键点的具体位置–亚像素位置，例如（1.1,2）。举例如图所示（可能图文无关 ^ ^ 但意思是这意思）：

那么怎么找这个具体位置呢，在检测到的关键点处，对其进行三元二阶泰勒展开

解出其具体位置。

2. 关键点舍弃

1）去除对比度低的点

对比度低的点对噪声敏感，高对比度的图像有较好的清晰度和细节，故在处理时，会对比该点的灰度值是否大于某个门限值contrast_threshold，若大于，则留下进一步处理，否则舍弃该点。

2）边缘效应

边缘效应是什么并不好解释，请参考B站大神“会飞的吴克”，其中有边缘效应的解释，在此只是参照大神基础的一些自己的理解，并不一定正确。
我们在上面讲过高斯卷积用到的高斯核具有平滑的作用，即将通过高斯卷积可以将图片进行“模糊”处理，使得我们得到的每一个新的像素点的值，都与其周围像素点的值相关，从而使得图像信息不跳动，更加平缓。
低通滤波器
滤波器在信号处理中很常见，而低频滤波器的作用就是在信号传输中通过低频信号，阻断高频信号，而在图像领域，我们可以透过它来理解边缘效应。图像中比较暗的点，我们称为低频点，比较亮的点，称为高频点，那么下图红点即为高频点，即亮度相比周围有明显变化的像素点，在经过高斯卷积后，图像变为灰色，即高频（亮度变化明显）点被去除，所以高斯核可近似理解为低通滤波器。

原因：高斯核使得周围点的信息影响到中间点的信息，所以高频点被平滑处理了

高通滤波器
高通滤波器即通高频，阻低频。
在卷积操作中，不同的核可以有不同的卷积结果，下面讲sobel核：

sobel核的特点很明显，即中间一行，一列均为0，这在进行卷积时候就会导致其周围3x3区域内，同行和同列的权重为0，所以就会相当于是下面一行减上面一行，右边一列减左边一列，从而形成了一个作差的操作，也就是dx和dy的操作，有什么作用呢？

在Harris角点检测时我们提到角点检测有3种情况：
① 该点的各个方向上梯度变化都很明显–角点
② 该点在某个方向上梯度变化比较明显，其余无变化–边
③ 该点在各个方向上梯度均无明显变化–平坦区域
所以，如若sobel形成了某种作差操作，那么我们就可以很明显的知道该关键点的周围方向是否有明显变化，从而可以用于边缘检测，Canny Edge Detection中就是用到了sobel核来进行的边缘检测。
若运用到下图中，我们可以清楚的看到，平坦区域（灰色区域）在用sobel核进行卷积过后，都变成了黑色，即作差后值均为0，而高频点在作差过后得到了保留，因为其与周围点的像素值不同，所以sobel核起到了近似高通滤波器的作用，保留了高频点，即保留了相对周围变化比较明显的点。

原因：当前点与水平，垂直方向的变化相减，故低频区域卷积后均为0，高频有变化，仅留下高频
是不是所有具有作差性质的核都可作为此处的高通滤波器呢？不清楚

带通滤波器Band Pass

既然高斯卷积过后只留下了低频区域，sobel核卷积之后只留下了高频区域，那么也就是说高斯卷积过后得到平坦区域得到保留，sobel核卷积后关键点得到保留，那么剩下的不就只剩下边了？那么这就是边缘效应（……），什么意思呢？意思就是我们在上一步检测到的关键点，有可能是真正的关键点，即高频区域，但还有一些关键点可能是一些边上的关键点，因为SIFT算法不同于Harris算法，SIFT算法只是根据了其上下两层和本层的26个点进行了关键点的比较，从而确定出其为关键点，但是Harris确定关键点的方式是比较其各个方向的梯度变化。
那么既然如此，SIFT算法就会存在我找到的某个关键点，它其实不是关键点，而只是某一条边上的像素值相比周围26点的像素值比较大或比较小的点，即不是我们想要的点，要去除，所以我们要去除边缘效应

PS:此处与滤波器等的类比可能会有误，但是感觉理解的应该没什么问题，想表达的意思就是加粗的这一句话

那么现在我们已经知道了SIFT算法中用到的DOG算子存在边缘效应，为什么存在边缘效应呢？
因为我们在尺度空间极值检测时，构造过两个金字塔，一个是高斯金字塔，使用不同尺度以及不同的高斯核进行的高斯卷积；另一个是高斯差分金字塔，是通过高斯金字塔相邻层的作差操作得来的（这里理解可能会有误），那么既有卷积又有作差，岂不是会产生边缘效应？

好的，我们已经知道了为什么存在边缘效应，那么边缘效应怎么去除呢？
直接放我的PPT吧不想打字了因为我没彻底完全理解 ^ ^

3. 小结

通过了关键点定位得到关键点的亚像素位置，以及通过去除对比度低的点和去除边缘效应，从而得到我们最终的关键点。

注：此时的关键点即为最终的关键点

七、确定主方向

在前面我们已经彻底解决了尺度不变性，以及关键点的具体位置，那么接下来要解决的就是旋转不变性。我们采用梯度方向直方图的方法来取确定该关键点的主方向。

用一个半径为r的圆圈，圈住的像素范围对其进行梯度方向直方图的统计，梯度方向的统计分为8个方向，即将360分为8份，45度为1份，统计以该关键点为中心的，半径为r的圆锁圈住的像素点的梯度幅值，通过高斯加权到对应的梯度方向上，从而统计出下图右侧的直方图，直方图的值最大的方向即为该关键点的主方向，而若有其他直方图的幅值大于最大幅值的80%（自己设的）时，则认为此方向为该关键点的辅方向。

问题1：半径r的大小怎么确定？

问题2：关键点辅方向的意思？
答：一个关键点可能只有一个主方向，辅方向可以有可以没有。若只有一个主方向没有辅方向，那么该关键点的位置与主方向就都确定了，那如果有辅方向呢？即此刻该关键点的方向就有两个或多个了，我们一般对于有辅方向的关键点采取以下措施处：

即当做多个关键点进行处理

八、生成描述符

1. 前置知识

1）线性插值

如下图，已知两点，求直线上任意一点，很简单我们可以化出右侧红框的一个式子，观察这个式子很容易发现，y0是左侧的点，y1是右侧的点，但y0所乘的值为x到x1之间占x0到x1之间的权重，而y1所乘的值为x0到x之间占x0到x1之间的权重，即乘的是离自己较远的一边的权重。

如下图，记作A，B，P点，A与P之间的权重值记作P。

那么我们可以得出 P = A *（1-t）+ B * t ，在图像处理中，插值的存在很普遍，我们把线性插值记作lerp。
lerp函数的实现很简单，如下

template<typename T>
inline T lerp(const T &lo, const T &hi, float t)
{ return lo * (1 - t) + hi * t; }

故我们可以记作P = lerp（A，B，t）

2）双线性插值

如图所示，求P点。

我们可以利用三次线性插值来进行求取，如下：
nx0 = lerp（c00，c10，tx）
nx1 = lerp（c01，c11，tx）
P = lerp（nx0，nx1，ty）

那么可以很轻易的得出：

P = c00（1-tx） + c10tx（1-ty） + c01（1-tx）ty + c11txty

3）三线性插值

c00 = lerp(c000,c100,tx)
c10 = lerp(c010,c110,tx)
c01 = lerp(c001,c101,tx)
c11 = lerp(c011,c111,tx)
c0 = lerp(c00,c10b, ty)
c1 = lerp(c01, c11, ty)
c = lerp（c0，c1，tz）

4）插值的应用

插值在SIFT中的处理更类似于权重的关系，我们先来看看三线性插值在HOG算法中的应用，进行对比即可。
HOG算法将梯度方向由0到π以20度分为9份，负值分配到正值上
0 20 40 60 80 100 120 140 160

假设P点为蓝框所在像素点，其梯度方向为85度，梯度幅值为100
设其距离左右上下的距离分别为2、6、2、6

则其在x，y方向的梯度插值，根据距离，可知x，y方向上的权重分配系数均为6/8 2/8，故左上角的幅值分配为：100 * 6/8*6/8 = 56.2
梯度方向分配：85介于80与100之间，分配到85的梯度值为 56.2 * 15/20 = 42.1875，风波路带100的梯度值为56.2 * 5/20 = 14.0625

即插值其实用于权重的分配，比如线性插值，我已知了P点，想去求A点？那么P点分配到A点的权重就是P点的赋值*（1-t），
同理，双线性插值时，去求P点分配到左上角C00的值时，就是P点的赋值乘（1-tx）（1-ty）
那么三线性插值的应用其实就是双线性插值x，y方向分配的基础上多了一个方向的分配，不知理解了没有，反正我是理解的透透的了。

那么到此我们就理解了三线性插值在HOG算法中的应用了，其在SIFT算法中其实同理，在下面讲。

2. 生成描述符

在前述我们已经解决了关键点的尺度不变性以及旋转不变性，那么接下来就要生成描述符，首先我们要理解生成描述符是为了什么？当然是为了后续进行图像的匹配比较，仅需对比描述符即可确定两幅图之间关键点的匹配。
类似于主方向的确定，生成关键点的描述符也需要在某个区域内生成，在原论文中给出的半径r为：

将关键点周围的区域，分成44的16个子区域，每个子区域由88 64个像素构成，在每个子区域内的子区域（22），统计其在8个（36）个区域上的梯度（方向，赋值，经过高斯加权的），故每个子区域有8个数，448 = 128个向量，按顺序写出来，即为描述符。
d：4，44的大格
m：3
mσ：4*4的小格
求中每一个最小的方格代表一个像素。

1）旋转至主方向

在上一步我们求得关键点的主方向和辅方向，为什么要进行旋转呢？因为在两幅图之间进行匹配时候，旋转会导致其坐标发生变化，所以我们要将每一个关键点，都要旋转到其主方向，作为其参考坐标系，后续进行匹配的时候就不会受到旋转的影响。

旋转后的坐标记作x’ y’，右侧为一个旋转矩阵和原坐标。

2）生成描述符

那么其梯度方向与幅值怎么进行统计呢？即用到我们之前讲的三线性插值来进行统计，我们已经知道HOG算法中三线性插值的应用，此处的应用与HOG算法相同，只不过方向只有8个，在关键点周围的16个区域内，以各个区域中心为原点（我们记作种子点），计算周围区域分配到种子点的梯度方向与梯度权值，从而加权得出各个方向的梯度权值，形成8个方向的向量。那么中共16个格子，加起来即为128个向量。

注意：此时我们会用一个三维数组来存取此128个向量，行、列、8个方向值，即4x4x8的三维数组。

3）限制

在进行实际应用时，我们会将三维数组转为一维向量，为了排除图像收到光照的影响（会有噪声），我们要对128个向量进行归一化处理。其中描述子向量为H=（h1，h2…，h128），归一化后的向量为L=（L1，L2…，L128）。

此外，为了防止相机饱和度变化对某些方向的梯度值影响过大，而对方向值的影响微弱，因此设置门限值（归一化后），一般取0.2，梯度值大于0.2的令它等于0.2，小于0.2的不变。

3.小结

此刻我们完整的讲述了SIFT算法，后续要进行的即为图像之间的匹配了，而匹配只需要比较其描述符即可，在此就不进行后续讲解了。

总结

理解SIFT算法需要查阅很多很多资料，其中细节之处更是越挖越深，看了将近一个月才得以总结出此文，而SIFT算法如今的应用却没有那么广泛，毕竟它实现起来速度并不快（也可能是我菜），后续的SURF，ORB算法都比SIFT容易理解，SIFT算法的Python代码详解放在下面，有注释！！！！！！

代码

pysift.py

from numpy import all, any, array, arctan2, cos, sin, exp, dot, log, logical_and, roll, sqrt, stack, trace, unravel_index, pi, deg2rad, rad2deg, where, zeros, floor, full, nan, isnan, round, float32
from numpy.linalg import det, lstsq, norm
from cv2 import resize, GaussianBlur, subtract, KeyPoint, INTER_LINEAR, INTER_NEAREST
from functools import cmp_to_key
import logging####################
# Global variables #
####################logger = logging.getLogger(__name__)
float_tolerance = 1e-7#################
# Main function #
#################def computeKeypointsAndDescriptors(image, sigma=1.6, num_intervals=3, assumed_blur=0.5, image_border_width=5):"""Compute SIFT keypoints and descriptors for an input image"""image = image.astype('float32') # 转换为float类型base_image = generateBaseImage(image, sigma, assumed_blur)  # 原图上采样并模糊后的初始图像num_octaves = computeNumberOfOctaves(base_image.shape)  # 计算高斯金字塔层数gaussian_kernels = generateGaussianKernels(sigma, num_intervals)  # 生成高斯核gaussian_images = generateGaussianImages(base_image, num_octaves, gaussian_kernels)  # 生成高斯金字塔dog_images = generateDoGImages(gaussian_images)  # 生成高斯差分金字塔keypoints = findScaleSpaceExtrema(gaussian_images, dog_images, num_intervals, sigma, image_border_width)  # 找尺度空间极值点keypoints = removeDuplicateKeypoints(keypoints)  # 去除重复的关键点keypoints = convertKeypointsToInputImageSize(keypoints)  # 将关键点转换到原图的位置descriptors = generateDescriptors(keypoints, gaussian_images)  # 生成描述符return keypoints, descriptors#########################
# Image pyramid related #
#########################def generateBaseImage(image, sigma, assumed_blur):"""Generate base image from input image by upsampling by 2 in both directions and blurring"""logger.debug('Generating base image...')image = resize(image, (0, 0), fx=2, fy=2, interpolation=INTER_LINEAR) # 上采样放大一倍sigma_diff = sqrt(max((sigma ** 2) - ((2 * assumed_blur) ** 2), 0.01))return GaussianBlur(image, (0, 0), sigmaX=sigma_diff, sigmaY=sigma_diff)  # the image blur is now sigma instead of assumed_blurdef computeNumberOfOctaves(image_shape):"""Compute number of octaves in image pyramid as function of base image shape (OpenCV default)"""return int(round(log(min(image_shape)) / log(2) - 1)) # 计算层数def generateGaussianKernels(sigma, num_intervals):"""Generate list of gaussian kernels at which to blur the input image. Default values of sigma, intervals, and octaves follow section 3 of Lowe's paper."""logger.debug('Generating scales...')num_images_per_octave = num_intervals + 3  # 保证每组的层数大于3k = 2 ** (1. / num_intervals)gaussian_kernels = zeros(num_images_per_octave)  # scale of gaussian blur necessary to go from one blur scale to the next within an octavegaussian_kernels[0] = sigmafor image_index in range(1, num_images_per_octave):sigma_previous = (k ** (image_index - 1)) * sigmasigma_total = k * sigma_previousgaussian_kernels[image_index] = sqrt(sigma_total ** 2 - sigma_previous ** 2)return gaussian_kernelsdef generateGaussianImages(image, num_octaves, gaussian_kernels):"""Generate scale-space pyramid of Gaussian images"""logger.debug('Generating Gaussian images...')gaussian_images = []for octave_index in range(num_octaves):gaussian_images_in_octave = []  # 存每一组中的不同层的图像gaussian_images_in_octave.append(image)  # first image in octave already has the correct blurfor gaussian_kernel in gaussian_kernels[1:]:  # # 循环高斯模糊，形成不同层，加入数组image = GaussianBlur(image, (0, 0), sigmaX=gaussian_kernel, sigmaY=gaussian_kernel)gaussian_images_in_octave.append(image)gaussian_images.append(gaussian_images_in_octave)  # 以一组加入数组octave_base = gaussian_images_in_octave[-3]  # 找倒数第三层 再下采样作为下一层的基层图像image = resize(octave_base, (int(octave_base.shape[1] / 2), int(octave_base.shape[0] / 2)), interpolation=INTER_NEAREST)return array(gaussian_images, dtype=object)  # 返回高斯金字塔def generateDoGImages(gaussian_images):"""Generate Difference-of-Gaussians image pyramid"""logger.debug('Generating Difference-of-Gaussian images...')dog_images = []for gaussian_images_in_octave in gaussian_images:dog_images_in_octave = []for first_image, second_image in zip(gaussian_images_in_octave, gaussian_images_in_octave[1:]):dog_images_in_octave.append(subtract(second_image, first_image))  # ordinary subtraction will not work because the images are unsigned integersdog_images.append(dog_images_in_octave)return array(dog_images, dtype=object)###############################
# Scale-space extrema related #
###############################def findScaleSpaceExtrema(gaussian_images, dog_images, num_intervals, sigma, image_border_width, contrast_threshold=0.04):"""Find pixel positions of all scale-space extrema in the image pyramid"""# 参数  高斯金字塔  高斯差分金字塔 层数 sigma borderlogger.debug('Finding scale-space extrema...')threshold = floor(0.5 * contrast_threshold / num_intervals * 255)  # from OpenCV implementationkeypoints = []for octave_index, dog_images_in_octave in enumerate(dog_images):#  遍历差分金字塔dog_images_in_octave代表一组for image_index, (first_image, second_image, third_image) in enumerate(zip(dog_images_in_octave, dog_images_in_octave[1:], dog_images_in_octave[2:])):# (i, j) is the center of the 3x3 array# 此时 first second third代表第一二三层，for i in range(image_border_width, first_image.shape[0] - image_border_width):for j in range(image_border_width, first_image.shape[1] - image_border_width):# 定下3x3位置# i,j为3x3中心位置if isPixelAnExtremum(first_image[i-1:i+2, j-1:j+2], second_image[i-1:i+2, j-1:j+2], third_image[i-1:i+2, j-1:j+2], threshold):# 确定下了极值点 second_image[i+1,j+1]# 定位极值点（亚像素位置） 泰勒localization_result = localizeExtremumViaQuadraticFit(i, j, image_index + 1, octave_index, num_intervals, dog_images_in_octave, sigma, contrast_threshold, image_border_width)# 此时得到关键点信息if localization_result is not None:keypoint, localized_image_index = localization_resultkeypoints_with_orientations = computeKeypointsWithOrientations(keypoint, octave_index, gaussian_images[octave_index][localized_image_index])# 此时得到关键点的主方向以及辅方向（可没有）for keypoint_with_orientation in keypoints_with_orientations:keypoints.append(keypoint_with_orientation)return keypointsdef isPixelAnExtremum(first_subimage, second_subimage, third_subimage, threshold):"""Return True if the center element of the 3x3x3 input array is strictly greater than or less than all its neighbors, False otherwise"""center_pixel_value = second_subimage[1, 1]if abs(center_pixel_value) > threshold:# 阈值判断if center_pixel_value > 0:return all(center_pixel_value >= first_subimage) and \all(center_pixel_value >= third_subimage) and \all(center_pixel_value >= second_subimage[0, :]) and \all(center_pixel_value >= second_subimage[2, :]) and \center_pixel_value >= second_subimage[1, 0] and \center_pixel_value >= second_subimage[1, 2]elif center_pixel_value < 0:return all(center_pixel_value <= first_subimage) and \all(center_pixel_value <= third_subimage) and \all(center_pixel_value <= second_subimage[0, :]) and \all(center_pixel_value <= second_subimage[2, :]) and \center_pixel_value <= second_subimage[1, 0] and \center_pixel_value <= second_subimage[1, 2]return Falsedef localizeExtremumViaQuadraticFit(i, j, image_index, octave_index, num_intervals, dog_images_in_octave, sigma, contrast_threshold, image_border_width, eigenvalue_ratio=10, num_attempts_until_convergence=5):"""Iteratively refine pixel positions of scale-space extrema via quadratic fit around each extremum's neighbors""""""细化极值点到亚像素级别i，j：3x3中心位置image_index：高斯差分金字塔中每组的图像索引（此时是中间层）octave_index:高斯差分金字塔的组索引num_intervals:层数dog_images_in_octave:高斯差分金字塔sigma：高斯标准差contrast_threshold:对比度阈值image_border_width:检测区域远离图像边缘5个像素图像eigenvalue_ratio:主曲率阈值num_attempts_until_convergence:最大尝试次数"""logger.debug('Localizing scale-space extrema...')extremum_is_outside_image = Falseimage_shape = dog_images_in_octave[0].shapefor attempt_index in range(num_attempts_until_convergence):# need to convert from uint8 to float32 to compute derivatives and need to rescale pixel values to [0, 1] to apply Lowe's thresholdsfirst_image, second_image, third_image = dog_images_in_octave[image_index-1:image_index+2]pixel_cube = stack([first_image[i-1:i+2, j-1:j+2],second_image[i-1:i+2, j-1:j+2],third_image[i-1:i+2, j-1:j+2]]).astype('float32') / 255.# 求关键点的梯度gradient = computeGradientAtCenterPixel(pixel_cube)# hessian矩阵hessian = computeHessianAtCenterPixel(pixel_cube)# 最小二乘拟合求偏移量extremum_update = -lstsq(hessian, gradient, rcond=None)[0]if abs(extremum_update[0]) < 0.5 and abs(extremum_update[1]) < 0.5 and abs(extremum_update[2]) < 0.5:breakj += int(round(extremum_update[0]))i += int(round(extremum_update[1]))image_index += int(round(extremum_update[2]))# 此时i,j,image_index为具体位置？# make sure the new pixel_cube will lie entirely within the image# 若超出限制(边界限制)if i < image_border_width or i >= image_shape[0] - image_border_width or j < image_border_width or j >= image_shape[1] - image_border_width or image_index < 1 or image_index > num_intervals:extremum_is_outside_image = Truebreakif extremum_is_outside_image:logger.debug('Updated extremum moved outside of image before reaching convergence. Skipping...')return Noneif attempt_index >= num_attempts_until_convergence - 1:logger.debug('Exceeded maximum number of attempts without reaching convergence for this extremum. Skipping...')return None"""消除边缘响应：在边缘梯度的方向上主曲率比较大，而沿着边缘方向的主曲率值比较小。DoG算子会产生较强的边缘效应，需要剔除不稳定的边缘响应点。主曲率通过特征点处的海森矩阵求出。如果一个点不在边缘，那么该点在x，y方向上的曲率差不多。海森矩阵的特征值和D（x）曲率是成正比的，但是这样太麻烦了，我们就通过矩阵的迹和行列式来代替。首先，当行列式小于零时舍去该点，再者当不满足下式时，也舍去该点。tr**2 / det < (r+1)**2 / r**2"""functionValueAtUpdatedExtremum = pixel_cube[1, 1, 1] + 0.5 * dot(gradient, extremum_update)# 首先去除对比度低的点（对比度低，对噪声敏感----高对比度有好的图像的清晰度、细节表现、灰度层次） contrast_thresholdif abs(functionValueAtUpdatedExtremum) * num_intervals >= contrast_threshold:xy_hessian = hessian[:2, :2]xy_hessian_trace = trace(xy_hessian)xy_hessian_det = det(xy_hessian)# 判断主曲率阈值是否在预设阈值之下if xy_hessian_det > 0 and eigenvalue_ratio * (xy_hessian_trace ** 2) < ((eigenvalue_ratio + 1) ** 2) * xy_hessian_det:# Contrast check passed -- construct and return OpenCV KeyPoint objectkeypoint = KeyPoint()# 坐标 组数 邻域直径 关键点响应强度# pt为Point2f类, x ykeypoint.pt = ((j + extremum_update[0]) * (2 ** octave_index), (i + extremum_update[1]) * (2 ** octave_index))  # *2意思是我下采样后的坐标要乘回去keypoint.octave = octave_index + image_index * (2 ** 8) + int(round((extremum_update[2] + 0.5) * 255)) * (2 ** 16)keypoint.size = sigma * (2 ** ((image_index + extremum_update[2]) / float32(num_intervals))) * (2 ** (octave_index + 1))  # octave_index + 1 because the input image was doubledkeypoint.response = abs(functionValueAtUpdatedExtremum)return keypoint, image_indexreturn Nonedef computeGradientAtCenterPixel(pixel_array):"""Approximate gradient at center pixel [1, 1, 1] of 3x3x3 array using central difference formula of order O(h^2), where h is the step size"""# 求梯度# With step size h, the central difference formula of order O(h^2) for f'(x) is (f(x + h) - f(x - h)) / (2 * h)# Here h = 1, so the formula simplifies to f'(x) = (f(x + 1) - f(x - 1)) / 2# NOTE: x corresponds to second array axis, y corresponds to first array axis, and s (scale) corresponds to third array axisdx = 0.5 * (pixel_array[1, 1, 2] - pixel_array[1, 1, 0])dy = 0.5 * (pixel_array[1, 2, 1] - pixel_array[1, 0, 1])ds = 0.5 * (pixel_array[2, 1, 1] - pixel_array[0, 1, 1])return array([dx, dy, ds])def computeHessianAtCenterPixel(pixel_array):"""Approximate Hessian at center pixel [1, 1, 1] of 3x3x3 array using central difference formula of order O(h^2), where h is the step size"""# With step size h, the central difference formula of order O(h^2) for f''(x) is (f(x + h) - 2 * f(x) + f(x - h)) / (h ^ 2)# Here h = 1, so the formula simplifies to f''(x) = f(x + 1) - 2 * f(x) + f(x - 1)# With step size h, the central difference formula of order O(h^2) for (d^2) f(x, y) / (dx dy) = (f(x + h, y + h) - f(x + h, y - h) - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ^ 2)# Here h = 1, so the formula simplifies to (d^2) f(x, y) / (dx dy) = (f(x + 1, y + 1) - f(x + 1, y - 1) - f(x - 1, y + 1) + f(x - 1, y - 1)) / 4# NOTE: x corresponds to second array axis, y corresponds to first array axis, and s (scale) corresponds to third array axiscenter_pixel_value = pixel_array[1, 1, 1]dxx = pixel_array[1, 1, 2] - 2 * center_pixel_value + pixel_array[1, 1, 0]dyy = pixel_array[1, 2, 1] - 2 * center_pixel_value + pixel_array[1, 0, 1]dss = pixel_array[2, 1, 1] - 2 * center_pixel_value + pixel_array[0, 1, 1]dxy = 0.25 * (pixel_array[1, 2, 2] - pixel_array[1, 2, 0] - pixel_array[1, 0, 2] + pixel_array[1, 0, 0])dxs = 0.25 * (pixel_array[2, 1, 2] - pixel_array[2, 1, 0] - pixel_array[0, 1, 2] + pixel_array[0, 1, 0])dys = 0.25 * (pixel_array[2, 2, 1] - pixel_array[2, 0, 1] - pixel_array[0, 2, 1] + pixel_array[0, 0, 1])return array([[dxx, dxy, dxs], [dxy, dyy, dys],[dxs, dys, dss]])#########################
# Keypoint orientations #
#########################def computeKeypointsWithOrientations(keypoint, octave_index, gaussian_image, radius_factor=3, num_bins=36, peak_ratio=0.8, scale_factor=1.5):"""Compute orientations for each keypoint""""""keypoint: 关键点信息octave_index: 组数gaussian_image: 对应组数对应层的图像radius_factor: 半径因子num_bins: 直方图柱的数目peak_ratio: 保留辅方向的百分比scale_factor: 尺度因子"""logger.debug('Computing keypoint orientations...')keypoints_with_orientations = []image_shape = gaussian_image.shape# 根据scale去确定花圈半径以及权重值scale = scale_factor * keypoint.size / float32(2 ** (octave_index + 1))  # compare with keypoint.size computation in localizeExtremumViaQuadraticFit()radius = int(round(radius_factor * scale))weight_factor = -0.5 / (scale ** 2)raw_histogram = zeros(num_bins)smooth_histogram = zeros(num_bins)# 采集在3σ邻域窗口内像素的梯度和方向分布特征，从而确定方向for i in range(-radius, radius + 1):region_y = int(round(keypoint.pt[1] / float32(2 ** octave_index))) + iif region_y > 0 and region_y < image_shape[0] - 1:for j in range(-radius, radius + 1):region_x = int(round(keypoint.pt[0] / float32(2 ** octave_index))) + jif region_x > 0 and region_x < image_shape[1] - 1:# 中心差分求偏导dx = gaussian_image[region_y, region_x + 1] - gaussian_image[region_y, region_x - 1]dy = gaussian_image[region_y - 1, region_x] - gaussian_image[region_y + 1, region_x]# 梯度幅值gradient_magnitude = sqrt(dx * dx + dy * dy)# 梯度方向gradient_orientation = rad2deg(arctan2(dy, dx))# 权重占比weight = exp(weight_factor * (i ** 2 + j ** 2))  # constant in front of exponential can be dropped because we will find peaks later# 直方柱histogram_index = int(round(gradient_orientation * num_bins / 360.))raw_histogram[histogram_index % num_bins] += weight * gradient_magnitudefor n in range(num_bins):# 平滑smooth_histogram[n] = (6 * raw_histogram[n] + 4 * (raw_histogram[n - 1] + raw_histogram[(n + 1) % num_bins]) + raw_histogram[n - 2] + raw_histogram[(n + 2) % num_bins]) / 16.# 主方向 最大值orientation_max = max(smooth_histogram)# 找其余的大值     大值>右移1位 且 大值>左移一位  1234  4123orientation_peaks = where(logical_and(smooth_histogram > roll(smooth_histogram, 1), smooth_histogram > roll(smooth_histogram, -1)))[0]# 找辅方向for peak_index in orientation_peaks:peak_value = smooth_histogram[peak_index]if peak_value >= peak_ratio * orientation_max:# Quadratic peak interpolation# The interpolation update is given by equation (6.30) in https://ccrma.stanford.edu/~jos/sasp/Quadratic_Interpolation_Spectral_Peaks.htmlleft_value = smooth_histogram[(peak_index - 1) % num_bins]right_value = smooth_histogram[(peak_index + 1) % num_bins]interpolated_peak_index = (peak_index + 0.5 * (left_value - right_value) / (left_value - 2 * peak_value + right_value)) % num_binsorientation = 360. - interpolated_peak_index * 360. / num_binsif abs(orientation - 360.) < float_tolerance:orientation = 0new_keypoint = KeyPoint(*keypoint.pt, keypoint.size, orientation, keypoint.response, keypoint.octave)keypoints_with_orientations.append(new_keypoint)return keypoints_with_orientations##############################
# Duplicate keypoint removal #
##############################def compareKeypoints(keypoint1, keypoint2):"""Return True if keypoint1 is less than keypoint2"""if keypoint1.pt[0] != keypoint2.pt[0]:return keypoint1.pt[0] - keypoint2.pt[0]if keypoint1.pt[1] != keypoint2.pt[1]:return keypoint1.pt[1] - keypoint2.pt[1]if keypoint1.size != keypoint2.size:return keypoint2.size - keypoint1.sizeif keypoint1.angle != keypoint2.angle:return keypoint1.angle - keypoint2.angleif keypoint1.response != keypoint2.response:return keypoint2.response - keypoint1.responseif keypoint1.octave != keypoint2.octave:return keypoint2.octave - keypoint1.octavereturn keypoint2.class_id - keypoint1.class_iddef removeDuplicateKeypoints(keypoints):"""Sort keypoints and remove duplicate keypoints"""# 对关键点进行排序，去除重复关键点if len(keypoints) < 2:return keypointskeypoints.sort(key=cmp_to_key(compareKeypoints))unique_keypoints = [keypoints[0]]for next_keypoint in keypoints[1:]:last_unique_keypoint = unique_keypoints[-1]if last_unique_keypoint.pt[0] != next_keypoint.pt[0] or \last_unique_keypoint.pt[1] != next_keypoint.pt[1] or \last_unique_keypoint.size != next_keypoint.size or \last_unique_keypoint.angle != next_keypoint.angle:unique_keypoints.append(next_keypoint)return unique_keypoints#############################
# Keypoint scale conversion #
#############################def convertKeypointsToInputImageSize(keypoints):"""Convert keypoint point, size, and octave to input image size"""converted_keypoints = []for keypoint in keypoints:keypoint.pt = tuple(0.5 * array(keypoint.pt))keypoint.size *= 0.5keypoint.octave = (keypoint.octave & ~255) | ((keypoint.octave - 1) & 255)converted_keypoints.append(keypoint)return converted_keypoints#########################
# Descriptor generation #
#########################def unpackOctave(keypoint):"""Compute octave, layer, and scale from a keypoint"""octave = keypoint.octave & 255layer = (keypoint.octave >> 8) & 255if octave >= 128:octave = octave | -128scale = 1 / float32(1 << octave) if octave >= 0 else float32(1 << -octave)return octave, layer, scaledef generateDescriptors(keypoints, gaussian_images, window_width=4, num_bins=8, scale_multiplier=3, descriptor_max_value=0.2):"""Generate descriptors for each keypoint""""""keypoints: 关键点gaussian_images: 高斯金字塔图像window_width: 关键点附近区域长度num_bins: 8个方向的梯度直方图scale_multiplier: 求取极值的尺度多少  缩放倍数descriptor_max_value: 描述符最大值"""logger.debug('Generating descriptors...')descriptors = []for keypoint in keypoints:# 组 层 尺度octave, layer, scale = unpackOctave(keypoint)gaussian_image = gaussian_images[octave + 1, layer]num_rows, num_cols = gaussian_image.shape# round四舍五入point = round(scale * array(keypoint.pt)).astype('int')bins_per_degree = num_bins / 360.angle = 360. - keypoint.angle# 角度转弧度 用来旋转主方向cos_angle = cos(deg2rad(angle))sin_angle = sin(deg2rad(angle))weight_multiplier = -0.5 / ((0.5 * window_width) ** 2)  # 高斯加权运算row_bin_list = []  # 存放每个邻域点对应4x4小窗口的哪一个（行）col_bin_list = []  # 存放每个邻域点对应4x4小窗口的哪一个（列）magnitude_list = []  # 存放每个邻域点的梯度幅值orientation_bin_list = []  # 存放每个邻域点的梯度方向角所处的方向组# 三维数组，存储最后的128维向量   长  宽 8个方向histogram_tensor = zeros((window_width + 2, window_width + 2, num_bins))   # first two dimensions are increased by 2 to account for border effects# Descriptor window size (described by half_width) follows OpenCV conventionhist_width = scale_multiplier * 0.5 * scale * keypoint.size  # 3×sigma，每个小窗口的边长 即  m sigma 4*3/2 ？half_width = int(round(hist_width * sqrt(2) * (window_width + 1) * 0.5))   # sqrt(2) corresponds to diagonal length of a pixelhalf_width = int(min(half_width, sqrt(num_rows ** 2 + num_cols ** 2)))     # ensure half_width lies within image# 中心点的半宽# 坐标轴旋转至主方向# 关键点为中心的4*4=16个子区域内进行遍历像素for row in range(-half_width, half_width + 1):for col in range(-half_width, half_width + 1):# 旋转过后的坐标row_rot = col * sin_angle + row * cos_anglecol_rot = col * cos_angle - row * sin_angle# 对应4×4子区域的下标（行）（列）row_bin = (row_rot / hist_width) + 0.5 * window_width - 0.5col_bin = (col_rot / hist_width) + 0.5 * window_width - 0.5# 保证邻域的点在旋转后，仍然处于4×4的区域内if row_bin > -1 and row_bin < window_width and col_bin > -1 and col_bin < window_width:# 计算对应原图的row colwindow_row = int(round(point[1] + row))window_col = int(round(point[0] + col))if window_row > 0 and window_row < num_rows - 1 and window_col > 0 and window_col < num_cols - 1:# 直接在旋转前的图上计算梯度，因为旋转时，都旋转了，不影响大小dx = gaussian_image[window_row, window_col + 1] - gaussian_image[window_row, window_col - 1]dy = gaussian_image[window_row - 1, window_col] - gaussian_image[window_row + 1, window_col]# 梯度幅值gradient_magnitude = sqrt(dx * dx + dy * dy)# 梯度方向gradient_orientation = rad2deg(arctan2(dy, dx)) % 360weight = exp(weight_multiplier * ((row_rot / hist_width) ** 2 + (col_rot / hist_width) ** 2))row_bin_list.append(row_bin)col_bin_list.append(col_bin)magnitude_list.append(weight * gradient_magnitude)orientation_bin_list.append((gradient_orientation - angle) * bins_per_degree)# 将magnitude分配到4*4*8（d*d*num_bins）的各区域中，即分配到histogram_tensor数组中# 子行 子列 权重 子方向for row_bin, col_bin, magnitude, orientation_bin in zip(row_bin_list, col_bin_list, magnitude_list, orientation_bin_list):# Smoothing via trilinear interpolation# Notations follows https://en.wikipedia.org/wiki/Trilinear_interpolation# Note that we are really doing the inverse of trilinear interpolation here (we take the center value of the cube and distribute it among its eight neighbors)# 三线性插值row_bin_floor, col_bin_floor, orientation_bin_floor = floor([row_bin, col_bin, orientation_bin]).astype(int)row_fraction, col_fraction, orientation_fraction = row_bin - row_bin_floor, col_bin - col_bin_floor, orientation_bin - orientation_bin_floorif orientation_bin_floor < 0:orientation_bin_floor += num_binsif orientation_bin_floor >= num_bins:orientation_bin_floor -= num_bins# 先按行进行线性插值c1 = magnitude * row_fraction  # 幅值 * 插值c0 = magnitude * (1 - row_fraction)  # 幅值 * (1 - 插值)# 对每一个行向量，进行y方向的线性插值c11 = c1 * col_fractionc10 = c1 * (1 - col_fraction)c01 = c0 * col_fractionc00 = c0 * (1 - col_fraction)# 再分配到对应的梯度方向上c111 = c11 * orientation_fractionc110 = c11 * (1 - orientation_fraction)c101 = c10 * orientation_fractionc100 = c10 * (1 - orientation_fraction)c011 = c01 * orientation_fractionc010 = c01 * (1 - orientation_fraction)c001 = c00 * orientation_fractionc000 = c00 * (1 - orientation_fraction)# 所有的数值存到4*4*8的128维数组，形成128维向量histogram_tensor[row_bin_floor + 1, col_bin_floor + 1, orientation_bin_floor] += c000histogram_tensor[row_bin_floor + 1, col_bin_floor + 1, (orientation_bin_floor + 1) % num_bins] += c001histogram_tensor[row_bin_floor + 1, col_bin_floor + 2, orientation_bin_floor] += c010histogram_tensor[row_bin_floor + 1, col_bin_floor + 2, (orientation_bin_floor + 1) % num_bins] += c011histogram_tensor[row_bin_floor + 2, col_bin_floor + 1, orientation_bin_floor] += c100histogram_tensor[row_bin_floor + 2, col_bin_floor + 1, (orientation_bin_floor + 1) % num_bins] += c101histogram_tensor[row_bin_floor + 2, col_bin_floor + 2, orientation_bin_floor] += c110histogram_tensor[row_bin_floor + 2, col_bin_floor + 2, (orientation_bin_floor + 1) % num_bins] += c111# histogram_tensor为当前关键点的128维向量，要对其进行归一化处理，下述化成了一维descriptor_vector = histogram_tensor[1:-1, 1:-1, :].flatten()  # Remove histogram borders# 归一化描述符# Threshold and normalize descriptor_vector# 描述子的门限化，大于0.2就等于0.2，小于0.2的保持不变threshold = norm(descriptor_vector) * descriptor_max_valuedescriptor_vector[descriptor_vector > threshold] = threshold# 归一化 norm范数 模 即h1 + ..... hj  归一化处理descriptor_vector /= max(norm(descriptor_vector), float_tolerance)# Multiply by 512, round, and saturate between 0 and 255 to convert from float32 to unsigned char (OpenCV convention)descriptor_vector = round(512 * descriptor_vector)descriptor_vector[descriptor_vector < 0] = 0descriptor_vector[descriptor_vector > 255] = 255descriptors.append(descriptor_vector)return array(descriptors, dtype='float32')

template_matching_demo.py – 读取图片和匹配的代码

import numpy as np
import cv2
import pysift
from matplotlib import pyplot as plt
import logging
logger = logging.getLogger(__name__)MIN_MATCH_COUNT = 10img1 = cv2.imread('box.png', 0)
img2 = cv2.imread('box_in_scene.png', 0)# Compute SIFT keypoints and descriptors
kp1, des1 = pysift.computeKeypointsAndDescriptors(img1)
kp2, des2 = pysift.computeKeypointsAndDescriptors(img2)# Initialize and use FLANN
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)flann = cv2.FlannBasedMatcher(index_params, search_params)
"""返回的俩个DMatch数据结构类型，queryIdx，trainIdx，distancequeryIdx：测试图像的特征点描述符的下标（第几个特征点描述符），同时也是描述符对应特征点的下标。trainIdx：样本图像的特征点描述符下标,同时也是描述符对应特征点的下标。distance：代表这一次匹配的特征点描述符的欧式距离，数值越小也就说明俩个特征点越相近。----相似性的度量这俩个DMatch数据类型是俩个与原图像特征点最接近的俩个特征点（match返回的是最匹配的）只有这俩个特征点的欧式距离小于一定值的时候才会认为匹配成功。
"""
matches = flann.knnMatch(des1, des2, k=2)# Lowe's ratio test
good = []
for m, n in matches:if m.distance < 0.7 * n.distance:# 匹配成功放入goodgood.append(m)if len(good) > MIN_MATCH_COUNT:# Estimate homography between template and scenesrc_pts = np.float32([ kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)M = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)[0]# Draw detected template in scene imageh, w = img1.shapepts = np.float32([[0, 0],[0, h - 1],[w - 1, h - 1],[w - 1, 0]]).reshape(-1, 1, 2)dst = cv2.perspectiveTransform(pts, M)img2 = cv2.polylines(img2, [np.int32(dst)], True, 255, 3, cv2.LINE_AA)# 组合两幅图h1, w1 = img1.shapeh2, w2 = img2.shapenWidth = w1 + w2nHeight = max(h1, h2)hdif = int((h2 - h1) / 2)newimg = np.zeros((nHeight, nWidth, 3), np.uint8)for i in range(3):newimg[hdif:hdif + h1, :w1, i] = img1newimg[:h2, w1:w1 + w2, i] = img2# Draw SIFT keypoint matchesfor m in good:pt1 = (int(kp1[m.queryIdx].pt[0]), int(kp1[m.queryIdx].pt[1] + hdif))pt2 = (int(kp2[m.trainIdx].pt[0] + w1), int(kp2[m.trainIdx].pt[1]))cv2.line(newimg, pt1, pt2, (255, 0, 0))plt.imshow(newimg)plt.show()
else:print("Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT))

参考资料

全网的所有资料我几乎都查阅过，肯定会有遗漏，若未注到请私信。

OpenCV官方介绍
Lowe D . Distinctive image features from scale-invariant key points[J]. International Journal of Computer Vision, 2003, 20:91-110.
Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05). Ieee, 2005, 1: 886-893.
https://www.cnblogs.com/fcfc940503/p/11492540.html
https://max.book118.com/html/2015/1221/31774000.shtm
https://blog.csdn.net/qq_33328642/article/details/109660609
https://blog.csdn.net/carrine/article/details/123697849
B站大神：会飞的吴克–BV1Qb411W7cK BV1KK4y1t7qs
B站大神：飛鳥的日常记录–BV1cu411z75M
B站大神：十万伏特丘比特–BV1oh411x7cf

SLAM-Visual Navigation学习之SIFT算法与代码详解相关推荐

《STM32从零开始学习历程》——CAN通讯代码详解
<STM32从零开始学习历程>@EnzoReventon CAN通讯代码详解相关链接: <STM32从零开始学习历程>--CAN通讯协议物理层 CAN-bus规范 V2.0版 ...
强化学习PPO从理论到代码详解(1)--- 策略梯度Policy gradient
第0章闲聊吹水 Proximal Policy Optimization(PPO) 近端策略优化,可以说是目前最稳定,最强的强化学习算法之一了,也是openAI默认的强化学习算法,有多叼不用我说了吧 ...
《机器学习实战》第二章学习笔记：K-近邻算法（代码详解）
<机器学习实战>数据资料以及总代码可以去GitHub中下载: GitHub代码地址:https://github.com/yangshangqi/Machine-Learning-in-A ...
特征点检测 FAST算法及代码详解
本文着重介绍了用于图像特征点检测的算法,FAST算法,以及使用matlab的实现. FAST算法是一种拐点检测算法,其主要应用于提取图像中的特征点,在动态成像的一系列图像中追踪定位对象.众所周知,我们 ...
鲸鱼算法matlab代码详解（一）
主函数 clear all clc SearchAgents_no=30; %此处为搜索代理的数量,也就是种群的数量 Function_name='F1'; %此处为调用目标函数的信息编号 Max_ ...
MeanTeacher文章解读+算法流程+核心代码详解
MeanTeacher 本博客仅做算法流程疏导,具体细节请参见原文原文原文链接点这里 Github 代码 Github代码点这里解读论文解读点这里算法流程代码详解 train_transf ...
c语言实现sha1算法注解,【密码学】SHA1算法实现及详解
1 SHA1算法简介安全哈希算法(Secure Hash Algorithm)主要适用于数字签名标准(Digital Signature Standard DSS)里面定义的数字签名算法(Digit ...
DL之AlexNet：AlexNet算法的架构详解、损失函数、网络训练和学习之详细攻略
DL之AlexNet:AlexNet算法的架构详解.损失函数.网络训练和学习之详细攻略相关文章 Dataset:数据集集合(CV方向数据集)--常见的计算机视觉图像数据集大集合(建议收藏,持续更新) ...
【图像分类】【深度学习】ViT算法Pytorch代码讲解
[图像分类][深度学习]ViT算法Pytorch代码讲解文章目录 [图像分类][深度学习]ViT算法Pytorch代码讲解前言 ViT(Vision Transformer)讲解 patch em ...

SLAM-Visual Navigation学习之SIFT算法与代码详解