Reading the COCO dataset API code to understand how segmentation is handled
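COCO is a dataset created by a team at Microsoft. With it we can train neural networks for image detection, classification, segmentation, and captioning. See the official website for a detailed introduction.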
- Annotation format
- How masks are stored
- Code walkthrough
- An example
Annotation format
// Copied from the official website
{
    "info": info,
    "images": [image],
    "annotations": [annotation],
    "licenses": [license],
}

info{
    "year": int,
    "version": str,
    "description": str,
    "contributor": str,
    "url": str,
    "date_created": datetime,
}

image{
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
    "license": int,
    "flickr_url": str,
    "coco_url": str,
    "date_captured": datetime,
}

license{
    "id": int,
    "name": str,
    "url": str,
}
----------
Object Instance Annotations
Each instance annotation contains a series of fields, including the category id and segmentation mask of the object. The segmentation format depends on whether the instance represents a single object (iscrowd=0 in which case polygons are used) or a collection of objects (iscrowd=1 in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names.
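In other words: every instance annotation carries a category id and a segmentation mask. The segmentation field is a list of polygons when the instance is a single object (iscrowd=0), and RLE (explained in the next section) when it is a group of objects (iscrowd=1), such as a crowd of people. A single object may need several polygons, for example when it is partly occluded. Each object also has an enclosing bounding box whose coordinates are measured from the top-left corner of the image and start at 0. Finally, the categories field maps each category id to its category name and supercategory name.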
annotation{
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

categories[{
    "id": int,
    "name": str,
    "supercategory": str,
}]
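To make the two segmentation formats concrete, here is a minimal sketch that looks at an annotation dict and reports which format it uses. The two sample annotations are made up for illustration (they are not taken from the dataset), and `segmentation_kind` is a hypothetical helper, not part of the COCO API.

```python
def segmentation_kind(ann):
    """Return 'polygon' or 'rle' for a COCO-style annotation dict."""
    seg = ann["segmentation"]
    if isinstance(seg, list):
        # A list of polygons, each a flat [x1, y1, x2, y2, ...] list.
        return "polygon"
    if isinstance(seg, dict) and "counts" in seg and "size" in seg:
        # RLE: run lengths in 'counts' plus the mask size [height, width].
        return "rle"
    raise ValueError("unrecognized segmentation format")


# Made-up annotations, for illustration only.
single_object = {
    "id": 1, "image_id": 10, "category_id": 18, "iscrowd": 0,
    "segmentation": [[10.0, 10.0, 50.0, 10.0, 50.0, 40.0, 10.0, 40.0]],
    "area": 1200.0, "bbox": [10.0, 10.0, 40.0, 30.0],
}
crowd = {
    "id": 2, "image_id": 10, "category_id": 1, "iscrowd": 1,
    "segmentation": {"counts": [5, 3, 4], "size": [3, 4]},
    "area": 3.0, "bbox": [1.0, 0.0, 2.0, 3.0],
}

kinds = [segmentation_kind(single_object), segmentation_kind(crowd)]
print(kinds)  # ['polygon', 'rle']
```

Checking the type of the segmentation field is the same test the example code later in this post uses to tell polygons from RLE.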
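How masks are stored
As mentioned above, COCO stores masks in two ways: polygons and RLE. Polygons are easy to understand, but what is RLE?
Put simply, RLE (run-length encoding) is a compression method, and probably the first one anyone would think of.
For example, M = [0,0,0,1,1,1,1,1,1,0,0] has the RLE encoding [3,6,2]. This is the encoding for binary data, which is what COCO uses. RLE in general goes well beyond this, but it is not the focus here, so look it up if you want more detail.
The comment in the code puts it this way: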
# RLE is a simple yet efficient format for storing binary masks. RLE
# first divides a vector (or vectorized image) into a series of piecewise
# constant regions and then for each piece simply stores the length of
# that piece. For example, given M=[0 0 1 1 1 0 1] the RLE counts would
# be [2 3 1 1], or for M=[1 1 1 1 1 1 0] the counts would be [0 6 1]
# (note that the odd counts are always the numbers of zeros). Instead of
# storing the counts directly, additional compression is achieved with a
# variable bitrate representation based on a common scheme called LEB128.
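To unpack that: RLE splits a binary vector into runs of constant value and stores only the length of each run. For M=[0 0 1 1 1 0 1] the counts are [2 3 1 1]; for M=[1 1 1 1 1 1 0] they are [0 6 1]. The odd-numbered counts (the first, third, and so on) are always counts of zeros, which is why a vector that starts with 1 gets a leading 0. On top of that, the counts are compressed further with a variable-bitrate representation based on the common LEB128 scheme.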
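The counting rule described above can be sketched in a few lines of plain Python. This is only an illustration of the uncompressed counts: `rle_encode` and `rle_decode` are hypothetical helper names, the sketch works on an already flattened 0/1 vector, and it does not attempt the LEB128-style string compression. The real implementation is written in C and flattens the 2-D mask column by column before encoding.

```python
def rle_encode(bits):
    """Return the run lengths of a 0/1 sequence; the first count is for zeros."""
    counts = []
    current = 0  # by convention the first run counts zeros
    run = 0
    for b in bits:
        if b == current:
            run += 1
        else:
            counts.append(run)  # close the current run (may be 0 at the start)
            current = b
            run = 1
    counts.append(run)
    return counts


def rle_decode(counts):
    """Rebuild the 0/1 sequence from run lengths, starting with zeros."""
    bits = []
    value = 0
    for n in counts:
        bits.extend([value] * n)
        value = 1 - value  # runs alternate between 0 and 1
    return bits


print(rle_encode([0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]))  # [3, 6, 2]
print(rle_encode([0, 0, 1, 1, 1, 0, 1]))              # [2, 3, 1, 1]
print(rle_encode([1, 1, 1, 1, 1, 1, 0]))              # [0, 6, 1]
print(rle_decode([0, 6, 1]))                          # [1, 1, 1, 1, 1, 1, 0]
```

Always starting with a zero run means the decoder never has to store which value a run holds: the position of a count in the list is enough.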
Code walkthrough
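COCO is the API provided by the dataset's authors. Concretely, it is a toolkit written in Python and C, and the mask-related parts are written in C.
// The code below is taken from FastMaskRCNN
1. Convert images and annotations to TFRecord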
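from pycocotools.coco import COCO
import math

# Load the annotation file (annFile and split_name are supplied by the caller)
coco = COCO(annFile)
# Load the category information
cats = coco.loadCats(coco.getCatIds())
print('%s has %d images' % (split_name, len(coco.imgs)))
# Collect the image records as (img_id, info) pairs
imgs = [(img_id, coco.imgs[img_id]) for img_id in coco.imgs]
# Work out the sharding, roughly 2500 images per shard.
# Note: with fewer than 2500 images num_shards is 0 and the next line divides by zero.
num_shards = int(len(imgs) / 2500)
num_per_shard = int(math.ceil(len(imgs) / float(num_shards)))

2. Get the mask and bbox information from COCO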
def _get_coco_masks(coco, img_id, height, width, img_name):
    """ get the masks for all the instances
    Note: some images are not annotated
    Return:
      masks, mxhxw numpy array
      classes, mx1
      bboxes, mx4
    """
    annIds = coco.getAnnIds(imgIds=[img_id], iscrowd=None)
    # assert annIds is not None and annIds > 0, 'No annotaion for %s' % str(img_id)
    anns = coco.loadAnns(annIds)
    coco.showAnns(anns)
    # assert len(anns) > 0, 'No annotaion for %s' % str(img_id)
    masks = []
    classes = []
    bboxes = []
    mask = np.zeros((height, width), dtype=np.float32)
    segmentations = []
    for ann in anns:
        m = coco.annToMask(ann)  # zero one mask
        assert m.shape[0] == height and m.shape[1] == width, \
            'image %s and ann %s dont match' % (img_id, ann)
        masks.append(m)
        cat_id = _cat_id_to_real_id(ann['category_id'])
        classes.append(cat_id)
        bboxes.append(ann['bbox'])
        m = m.astype(np.float32) * cat_id
        mask[m > 0] = m[m > 0]
    masks = np.asarray(masks)
    classes = np.asarray(classes)
    bboxes = np.asarray(bboxes)
    # to x1, y1, x2, y2
    if bboxes.shape[0] <= 0:
        bboxes = np.zeros([0, 4], dtype=np.float32)
        classes = np.zeros([0], dtype=np.float32)
        print ('None Annotations %s' % img_name)
        LOG('None Annotations %s' % img_name)
    bboxes[:, 2] = bboxes[:, 0] + bboxes[:, 2]
    bboxes[:, 3] = bboxes[:, 1] + bboxes[:, 3]
    gt_boxes = np.hstack((bboxes, classes[:, np.newaxis]))
    gt_boxes = gt_boxes.astype(np.float32)
    masks = masks.astype(np.uint8)
    mask = mask.astype(np.uint8)
    assert masks.shape[0] == gt_boxes.shape[0], 'Shape Error'
    return gt_boxes, masks, mask
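Two steps of this function are easy to miss: the bbox conversion from [x, y, width, height] to [x1, y1, x2, y2] with the class id appended, and the merging of the per-instance masks into one class-id mask where later instances overwrite earlier ones. The sketch below replays both steps in plain Python on tiny made-up inputs. `xywh_to_gt_boxes` and `merge_class_mask` are hypothetical names used only here; the original code does the same thing with NumPy arrays.

```python
def xywh_to_gt_boxes(bboxes, classes):
    """Convert [x, y, w, h] boxes to [x1, y1, x2, y2, class] rows."""
    gt_boxes = []
    for (x, y, w, h), cls in zip(bboxes, classes):
        gt_boxes.append([x, y, x + w, y + h, float(cls)])
    return gt_boxes


def merge_class_mask(instance_masks, classes, height, width):
    """Paint each instance's class id into one mask; later instances win."""
    merged = [[0] * width for _ in range(height)]
    for m, cls in zip(instance_masks, classes):
        for r in range(height):
            for c in range(width):
                if m[r][c]:
                    merged[r][c] = cls
    return merged


# Made-up data: two instances on a 2x3 image that overlap in the middle column.
boxes = [[10.0, 20.0, 30.0, 40.0], [0.0, 0.0, 5.0, 5.0]]
classes = [3, 7]
masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 0, 0]],
]

gt = xywh_to_gt_boxes(boxes, classes)
merged = merge_class_mask(masks, classes, height=2, width=3)
print(gt)      # [[10.0, 20.0, 40.0, 60.0, 3.0], [0.0, 0.0, 5.0, 5.0, 7.0]]
print(merged)  # [[3, 7, 7], [0, 0, 0]]
```

The overlapping pixel ends up with class 7 because the second instance is painted last, which matches the `mask[m > 0] = m[m > 0]` line in the loop above.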
An example
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt

# get all images containing given categories, select one at random
# (coco is a COCO object loaded from an annotation file)
# note: 'animal' is a supercategory; catNms matches category names,
# so use supNms=['animal'] to filter by supercategory
catIds = coco.getCatIds(catNms=['animal'])
imgIds = coco.getImgIds(catIds=catIds)
imgIds = coco.getImgIds(imgIds=[324139])
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]
print(imgIds)
print(img['coco_url'])
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()  # Figure 1
plt.imshow(I); plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
print(len(anns))
masks = []
showonce = True
for ann in anns:
    # print the first polygon segmentation once
    if isinstance(ann['segmentation'], list) and showonce:
        print(ann['segmentation'])
        showonce = False
    # print every RLE (non-list) segmentation
    if not isinstance(ann['segmentation'], list):
        print(ann['segmentation'])
    m = coco.annToMask(ann)
    masks.append(m)
print(len(masks))
coco.showAnns(anns)  # Figure 2
Output (from the original Python 2 run):
15
[[151.06, 113.6, 168.95, 102.49, 182.53, 92.62, 193.64, 80.28, 203.51, 70.4, 208.45, 61.76, 206.6, 53.74, 209.68, 49.42, 220.17, 50.04, 220.79, 55.59, 222.64, 59.3, 222.64, 59.91, 227.58, 66.7, 228.2, 77.19, 228.2, 83.98, 228.2, 87.06, 228.2, 92.0, 227.58, 96.32, 220.79, 101.87, 213.39, 104.96, 205.36, 111.75, 202.9, 113.6, 201.04, 114.22, 202.9, 123.47, 200.43, 129.64, 200.43, 125.94, 198.58, 113.6, 190.55, 111.75, 181.91, 113.6, 168.95, 114.83, 168.95, 114.83, 162.17, 120.39, 157.23, 119.15, 146.74, 117.92, 142.42, 115.45]]
{u'counts': [113441, 1, 423, 6, 427, 7, 3, 1, 422, 8, 3, 1, 421, 9, 2, 2, 420, 10, 2, 1, 421, 10, 1, 2, 419, 12, 1, 2, 418, 15, 419, 15, 418, 16, 418, 16, 418, 15, 419, 15, 418, 16, 417, 16, 418, 16, 418, 15, 419, 14, 419, 14, 419, 14, 420, 13, 420, 13, 421, 11, 422, 11, 423, 10, 423, 10, 424, 9, 425, 8, 426, 7, 427, 6, 427, 5, 429, 5, 429, 5, 429, 5, 429, 5, 428, 5, 429, 5, 428, 6, 428, 6, 428, 6, 428, 5, 429, 5, 429, 5, 408, 7, 14, 5, 407, 9, 13, 5, 406, 11, 11, 5, 407, 12, 9, 6, 406, 13, 9, 6, 404, 15, 8, 7, 402, 18, 7, 7, 401, 20, 6, 6, 401, 21, 6, 6, 400, 34, 399, 35, 399, 12, 1, 22, 398, 13, 3, 19, 399, 12, 6, 17, 399, 12, 8, 15, 399, 12, 10, 13, 400, 10, 13, 10, 401, 10, 15, 8, 400, 11, 17, 5, 401, 10, 20, 2, 402, 11, 423, 11, 423, 11, 423, 11, 423, 12, 422, 13, 422, 13, 421, 14, 421, 14, 420, 15, 420, 14, 420, 15, 420, 15, 419, 16, 419, 16, 418, 17, 418, 17, 418, 17, 418, 17, 417, 18, 4, 4, 409, 24, 411, 23, 412, 22, 412, 22, 411, 23, 411, 22, 412, 22, 412, 22, 413, 21, 413, 20, 414, 20, 414, 19, 416, 18, 416, 17, 417, 17, 417, 17, 417, 16, 419, 14, 420, 14, 421, 13, 422, 11, 424, 10, 425, 8, 428, 4, 112407, 8, 422, 16, 416, 19, 414, 21, 2, 6, 405, 30, 403, 32, 402, 33, 401, 34, 400, 35, 400, 34, 400, 34, 401, 33, 402, 32, 277], u'size': [434, 640]}
15
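Figure 1: the image loaded from coco_url
Figure 2: the same image with the annotations drawn by coco.showAnns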
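The two printed segmentations can be read as follows: the polygon is a flat list of alternating x and y coordinates, and the RLE dict holds alternating run lengths of zeros and ones together with the mask size [height, width]. The sketch below shows how to work with both shapes in plain Python. The helper names are hypothetical, the rectangle and the small RLE dict are made up for illustration, and the polygon area from the shoelace formula is a geometric area that need not equal the annotation's "area" field.

```python
def polygon_points(flat):
    """Group a flat [x1, y1, x2, y2, ...] list into (x, y) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(flat):
    """Area enclosed by the polygon, via the shoelace formula."""
    pts = polygon_points(flat)
    total = 0.0
    for i, (x1, y1) in enumerate(pts):
        x2, y2 = pts[(i + 1) % len(pts)]  # wrap around to the first point
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def rle_area(rle):
    """Foreground pixel count: counts at positions 1, 3, 5, ... are runs of ones."""
    return sum(rle["counts"][1::2])


def rle_is_consistent(rle):
    """The run lengths must add up to height * width."""
    height, width = rle["size"]
    return sum(rle["counts"]) == height * width


# First three points of the polygon printed above.
print(polygon_points([151.06, 113.6, 168.95, 102.49, 182.53, 92.62]))

# Made-up inputs for the remaining helpers.
rect = [0, 0, 4, 0, 4, 3, 0, 3]
small_rle = {"counts": [5, 3, 4], "size": [3, 4]}
print(polygon_area(rect))            # 12.0
print(rle_area(small_rle))           # 3
print(rle_is_consistent(small_rle))  # True
```

The same consistency check can be applied to the RLE printed above, whose size is [434, 640].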