Generating anchor box sizes and aspect ratios by clustering

Preface:

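"Anchor" literally means a ship's anchor, the big chunk of iron that holds a ship in place. In object detection, an anchor box is a preset reference box of fixed size. Object detection has to answer which object sits at which position in an image. Traditional algorithms solve this with a sliding window: they traverse the whole image and decide at each position whether an object is present, which is slow and inefficient.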

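The anchor box concept first appeared in Faster RCNN. A set of 9 manually preset boxes of fixed size is slid over the feature map extracted by the backbone network, so that every point of the feature map gets these 9 prior boxes. For each box, convolutions then perform classification (whether the box contains an object) and regression (the anchor's coordinate offsets and scaling factors). Non-maximum suppression finally removes overlapping boxes, which yields the location and class of each object. Boxes of different sizes and ratios are used so that objects of different scales can be matched.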
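As a rough illustration of how such a set of 9 boxes can be built, the sketch below combines three sizes with three aspect ratios. The sizes (128, 256, 512 pixels) and ratios (1:2, 1:1, 2:1) are the defaults reported for Faster RCNN; the function name `make_base_anchors` and the convention that a ratio means height divided by width are assumptions made for this example only.

```python
import numpy as np

def make_base_anchors(sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return len(sizes) * len(ratios) anchors as (width, height) pairs in pixels.

    Assumption for this sketch: ratio = height / width, and every anchor of a
    given size keeps the same area, size ** 2.
    """
    anchors = []
    for size in sizes:
        area = float(size) ** 2
        for ratio in ratios:
            w = np.sqrt(area / ratio)   # w * h == area and h / w == ratio
            h = w * ratio
            anchors.append((w, h))
    return np.array(anchors)

base_anchors = make_base_anchors()
print(base_anchors.shape)  # (9, 2)
print(np.round(base_anchors, 1))
```

The same 9 (width, height) pairs are then placed at every feature-map position, so the only thing that has to be chosen by hand is the list of sizes and ratios.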

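The prior boxes are specified by hand, so how well they are chosen affects the training and convergence of the model. In Faster RCNN, an object is localized by computing offsets relative to a box. Here is the code: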

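import numpy as np

def regression_box_shift(p, g):
    """Compute the offsets t that transform proposal p into ground truth g.

    :param p: proposal box as [x_min, y_min, x_max, y_max]
    :param g: ground truth box in the same format
    :return: t = [tx, ty, tw, th]
    """
    w_p = p[2] - p[0]
    h_p = p[3] - p[1]
    w_g = g[2] - g[0]
    h_g = g[3] - g[1]
    # Faster RCNN measures the shift between box centers, normalized by the proposal size
    tx = ((g[0] + g[2]) - (p[0] + p[2])) / (2 * w_p)
    ty = ((g[1] + g[3]) - (p[1] + p[3])) / (2 * h_p)
    # width and height are regressed in log space
    tw = np.log(w_g / w_p)
    th = np.log(h_g / h_p)
    t = [tx, ty, tw, th]
    return t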

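For example, if the training samples contain only small objects while the prior boxes are very large, the bounding box regression stage needs more adjustments, and larger ones, to bring the proposal boxes close to the ground truth boxes. This slows down the convergence of the model.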
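To put numbers on this, the sketch below compares the size targets (tw, th) that a small ground truth box produces against a very large prior and against a well matched one. The helper `size_targets` and the example box sizes are made up for illustration; only the log-ratio form of the targets comes from the offset code above.

```python
import numpy as np

def size_targets(anchor_wh, gt_wh):
    """Return (tw, th) = log(gt / anchor) for boxes given as (width, height)."""
    anchor_wh = np.asarray(anchor_wh, dtype=float)
    gt_wh = np.asarray(gt_wh, dtype=float)
    return np.log(gt_wh / anchor_wh)

gt = (32, 32)               # small ground truth object, in pixels (example value)
large_anchor = (512, 512)   # badly matched prior (example value)
close_anchor = (30, 36)     # well matched prior (example value)

t_large = size_targets(large_anchor, gt)
t_close = size_targets(close_anchor, gt)
print("large anchor:", t_large)  # both entries equal log(1/16), about -2.77
print("close anchor:", t_close)  # both entries stay within about 0.12 of zero
```

With the large prior the network has to learn targets far from zero for every object, whereas a prior close to the object size leaves only a small correction.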

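The prior boxes should therefore fit the object sizes found in the detection samples. In other words, for each object in the samples, at least one of the 9 anchor boxes should be close to the object's size, rather than all anchors being far from it.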

Generating prior box sizes by clustering:

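YOLOv2 first proposed using K-means clustering to generate the anchor sizes automatically, which removes the subjectivity of choosing anchors by hand. With only 5 anchors, the clustered priors fit the ground truth boxes about as well (measured by average IOU) as the 9 hand-picked anchors of Faster RCNN.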

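In K-means clustering the key concepts are the distance metric and the cluster centers. Anchor clustering differs from standard K-means only in the distance metric, which is defined as:

Distance = 1 - IOU

Here Distance is the distance between a ground truth box and an anchor (a cluster center), and IOU is the intersection over union of the two. The two boxes are aligned at their centers before the IOU is computed, so only their widths and heights matter. The larger the IOU, the smaller the Distance, and the closer the two boxes are in size.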
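A small worked example of this metric may help. When two boxes share the same center, their intersection is simply min(w1, w2) * min(h1, h2). The function name `iou_wh` and the sample boxes below are assumptions chosen for illustration.

```python
import numpy as np

def iou_wh(box, centroids):
    """IOU between one (w, h) box and an array of (w, h) centroids.

    All boxes are treated as sharing the same center, so the intersection
    is the overlap of the smaller width and the smaller height.
    """
    box = np.asarray(box, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    inter = np.minimum(box[0], centroids[:, 0]) * np.minimum(box[1], centroids[:, 1])
    union = box[0] * box[1] + centroids[:, 0] * centroids[:, 1] - inter
    return inter / union

centroids = np.array([[0.1, 0.1], [0.2, 0.4], [0.5, 0.5]])  # example anchors
box = [0.2, 0.2]                                             # example ground truth box
distance = 1.0 - iou_wh(box, centroids)
print(distance)  # approximately [0.75, 0.5, 0.84]
```

The box is closest to the second centroid (Distance 0.5), even though the third centroid has the same aspect ratio, because the metric compares overlapping area rather than shape alone.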

The algorithm, in code:

def kmeans(boxes, k, dist=np.median):
    # number of boxes
    box_num = len(boxes)
    # cluster id currently assigned to each box; -1 means "not assigned yet"
    nearest_id = np.full(box_num, -1)
    np.random.seed(42)
    # initialize the cluster centers with k distinct, randomly chosen boxes
    clusters = boxes[np.random.choice(box_num, k, replace=False)]
    while True:
        # IOU distance between every box and every cluster center, shape (box_num, k)
        distance = []
        for i in range(box_num):
            ious = compute_iou(boxes[i], clusters)
            dis = [1 - iou for iou in ious]
            distance.append(dis)
        distance = np.array(distance)
        # assign each box to its nearest cluster center
        new_nearest_id = np.argmin(distance, axis=1)
        # stop once the assignment no longer changes
        if (new_nearest_id == nearest_id).all():
            break
        # update each center with dist (median by default) of its member boxes
        for j in range(k):
            members = boxes[new_nearest_id == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                clusters[j] = dist(members, axis=0)
        nearest_id = new_nearest_id
    return clusters

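Here boxes holds the ground truth annotation boxes; only the width and height of each box need to be passed in. k is the number of cluster centers, that is, the number of anchor priors wanted. dist is the strategy used to update the cluster centers; this article uses the median.

The algorithm proceeds as follows. Initialize the cluster id of every box, and pick k boxes at random as the initial cluster centers. Then start clustering: compute the distance between every box and the k cluster centers, which gives a distance array of size m x k, where m is the number of boxes. From it, determine the cluster id of every box, and update each cluster center with the median of the boxes assigned to it. Repeat until the cluster ids no longer change between two iterations, at which point clustering is complete.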
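One iteration of this loop can also be written with NumPy broadcasting instead of a Python loop over boxes. The sketch below is one possible vectorized formulation, not the article's implementation; the function name `pairwise_iou_distance` and the four toy boxes are assumptions for illustration.

```python
import numpy as np

def pairwise_iou_distance(boxes, clusters):
    """Return the (m, k) matrix of 1 - IOU between m boxes and k centers.

    Boxes and centers are (w, h) pairs aligned at a common center.
    """
    b = np.asarray(boxes, dtype=float)[:, None, :]     # shape (m, 1, 2)
    c = np.asarray(clusters, dtype=float)[None, :, :]  # shape (1, k, 2)
    inter = np.minimum(b[..., 0], c[..., 0]) * np.minimum(b[..., 1], c[..., 1])
    union = b[..., 0] * b[..., 1] + c[..., 0] * c[..., 1] - inter
    return 1.0 - inter / union

# toy data: two small boxes and two large boxes (normalized w, h)
boxes = np.array([[0.10, 0.12], [0.11, 0.10], [0.50, 0.40], [0.45, 0.55]])
clusters = np.array([[0.1, 0.1], [0.5, 0.5]])

distance = pairwise_iou_distance(boxes, clusters)  # m x k distance array
nearest_id = np.argmin(distance, axis=1)           # assignment step
new_clusters = np.array(                           # median update step
    [np.median(boxes[nearest_id == j], axis=0) for j in range(len(clusters))]
)
print(nearest_id)    # [0 0 1 1]
print(new_clusters)  # approximately [[0.105, 0.11], [0.475, 0.475]]
```

The update line assumes every cluster has at least one member, which holds for this toy data; a real run should guard against empty clusters.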

Full code:

import numpy as np
from glob import glob

input_dim = 1024


def compute_iou(box, anchors):
    # IOU between one box and every anchor, all given as (w, h) and aligned at the center
    ious = []
    for anchor in anchors:
        w_min = np.min([box[0], anchor[0]])
        h_min = np.min([box[1], anchor[1]])
        intersection = w_min * h_min
        union = box[0] * box[1] + anchor[0] * anchor[1]
        iou = intersection / (union - intersection)
        ious.append(iou)
    return ious


def kmeans(boxes, k, dist=np.median):
    # number of boxes
    box_num = len(boxes)
    # cluster id currently assigned to each box; -1 means "not assigned yet"
    nearest_id = np.full(box_num, -1)
    np.random.seed(42)
    # initialize the cluster centers with k distinct, randomly chosen boxes
    clusters = boxes[np.random.choice(box_num, k, replace=False)]
    while True:
        # IOU distance between every box and every cluster center, shape (box_num, k)
        distance = []
        for i in range(box_num):
            ious = compute_iou(boxes[i], clusters)
            dis = [1 - iou for iou in ious]
            distance.append(dis)
        distance = np.array(distance)
        # assign each box to its nearest cluster center
        new_nearest_id = np.argmin(distance, axis=1)
        # stop once the assignment no longer changes
        if (new_nearest_id == nearest_id).all():
            break
        # update each center with dist (median by default) of its member boxes
        for j in range(k):
            members = boxes[new_nearest_id == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                clusters[j] = dist(members, axis=0)
        nearest_id = new_nearest_id
    return clusters


def load_dataset(path):
    # load the normalized width and height of every box from YOLO-format label files
    path = path + '/*.txt'
    txt_list = glob(path)
    data_set = []
    for txt in txt_list:
        with open(txt, 'r') as f:
            lines = f.readlines()
        for line in lines:
            # each line: class x_center y_center width height
            coordinate = line.split()
            if len(coordinate) < 5:  # skip blank or malformed lines
                continue
            w, h = np.array(coordinate[3:5], dtype=np.float64)
            data_set.append([w, h])
    data_set = np.array(data_set)
    return data_set


def main():
    txt_path = 'C:\\Users\\XQ\\Desktop\\labels'
    data = load_dataset(txt_path)
    # number of cluster centers
    clusters = kmeans(data, 9)
    print('cluster center:*************')
    # convert normalized sizes to pixels
    print(clusters * input_dim)
    # mean IOU between each box and its best matching anchor
    accuracy = np.mean([np.max(compute_iou(box, clusters)) for box in data]) * 100
    print('Accuracy(Average iou): %.4f%%' % accuracy)
    # width-to-height ratio of each anchor
    anchor_ratio = np.around(clusters[:, 0] / clusters[:, 1], decimals=2)
    anchor_ratio = list(anchor_ratio)
    print('Final anchor_ratio: ', anchor_ratio)
    print('Sorted anchor ratio: ', sorted(anchor_ratio))


if __name__ == "__main__":
    main()

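This code expects labels in the YOLO format (one object per line: class, x_center, y_center, width, height, with coordinates normalized to the image size). Other annotation formats work too; just convert them first.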
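As one example of such a conversion, the sketch below turns pixel boxes given as corner coordinates into the normalized (width, height) pairs that the clustering needs. The function name `corners_to_normalized_wh`, the [x_min, y_min, x_max, y_max] layout, and the image size are assumptions for illustration; adapt them to the actual annotation format.

```python
import numpy as np

def corners_to_normalized_wh(boxes_xyxy, image_width, image_height):
    """Convert [x_min, y_min, x_max, y_max] pixel boxes to normalized (w, h)."""
    boxes_xyxy = np.asarray(boxes_xyxy, dtype=np.float64)
    w = (boxes_xyxy[:, 2] - boxes_xyxy[:, 0]) / image_width
    h = (boxes_xyxy[:, 3] - boxes_xyxy[:, 1]) / image_height
    return np.stack([w, h], axis=1)

# two example boxes from a hypothetical 640 x 512 image
pixel_boxes = [[100, 50, 300, 250], [0, 0, 64, 128]]
wh = corners_to_normalized_wh(pixel_boxes, image_width=640, image_height=512)
print(wh)  # [[0.3125, 0.390625], [0.1, 0.25]]
```

If the images in a dataset have different sizes, each box should be normalized by the size of its own image before all boxes are stacked into one array.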

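Output:

The script prints four items. The first is the size of the k cluster-center anchors; note that they are multiplied by input_dim to convert the normalized sizes to pixels. The second is Accuracy, which is really the mean IOU between every box and its best matching cluster-center anchor. The larger this value, the more annotation boxes the k anchors fit, and the better the result. The last two items are the width-to-height ratios of the anchors, first in cluster order and then sorted.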
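The sketch below reproduces these quantities on a tiny hand-made example so the numbers can be checked by hand. The function name `average_best_iou`, the two anchors, and the three boxes are assumptions for illustration; sorting the anchors by area is an extra convenience that the script above does not do.

```python
import numpy as np

def average_best_iou(boxes, anchors):
    """Mean over boxes of the IOU with their best matching anchor ((w, h) pairs)."""
    b = np.asarray(boxes, dtype=float)[:, None, :]
    a = np.asarray(anchors, dtype=float)[None, :, :]
    inter = np.minimum(b[..., 0], a[..., 0]) * np.minimum(b[..., 1], a[..., 1])
    union = b[..., 0] * b[..., 1] + a[..., 0] * a[..., 1] - inter
    return float(np.mean(np.max(inter / union, axis=1)))

input_dim = 1024                                   # network input size in pixels
anchors = np.array([[0.5, 0.5], [0.1, 0.2]])       # example normalized anchors
boxes = np.array([[0.1, 0.2], [0.5, 0.5], [0.25, 0.5]])

score = average_best_iou(boxes, anchors)
order = np.argsort(anchors.prod(axis=1))           # smallest area first
anchors_px = np.round(anchors[order] * input_dim).astype(int)
ratios = np.around(anchors[order, 0] / anchors[order, 1], decimals=2)

print("Average IOU: %.4f%%" % (score * 100))  # 83.3333%
print(anchors_px)                              # [[102 205] [512 512]]
print(ratios)                                  # [0.5 1. ]
```

Two of the boxes coincide with an anchor (IOU 1.0) and the third reaches 0.5 with its best anchor, which gives the mean of 2.5 / 3.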

My knowledge is limited, so corrections are welcome!
