Understanding how segmentation is handled by reading the COCO dataset API code

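COCO is a dataset created by a team at Microsoft. With it we can train neural networks for image detection, classification, segmentation, and captioning. See the official website for a full introduction.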

  • The annotation format
  • A brief look at how masks are stored
  • Analysis of the related code
  • An example

The annotation format

Copied from the official website:

{
    "info": info,
    "images": [image],
    "annotations": [annotation],
    "licenses": [license],
}

info{
    "year": int,
    "version": str,
    "description": str,
    "contributor": str,
    "url": str,
    "date_created": datetime,
}

image{
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
    "license": int,
    "flickr_url": str,
    "coco_url": str,
    "date_captured": datetime,
}

license{
    "id": int,
    "name": str,
    "url": str,
}
----------

Object Instance Annotations

Each instance annotation contains a series of fields, including the category id and segmentation mask of the object. The segmentation format depends on whether the instance represents a single object (iscrowd=0 in which case polygons are used) or a collection of objects (iscrowd=1 in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names.

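In other words: each instance annotation carries a set of fields, including the category id and the segmentation mask. The format of the segmentation field depends on whether the instance is a single object (iscrowd=0, stored as polygons) or a group of objects (iscrowd=1, stored as RLE, which is explained below). A single object may need several polygons, for example when it is occluded. Crowd annotations label groups of objects, such as a crowd of people. Every object also has an enclosing bounding box; its coordinates are measured from the top-left corner of the image and are indexed from 0. Finally, the categories field maps each category id to its category name and supercategory name.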

annotation{
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

categories[{
    "id": int,
    "name": str,
    "supercategory": str,
}]
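To make the schema concrete, here is a minimal sketch that builds the category lookup and tells the two segmentation formats apart. The records are hand-written, hypothetical values that only follow the field layout above; `build_category_index` and `segmentation_kind` are illustrative helper names, not part of the COCO API.

```python
# Hypothetical records shaped like the "categories" and "annotation" schema above.
categories = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
]
annotations = [
    {   # single object: iscrowd=0, segmentation is a list of polygons
        "id": 101, "image_id": 7, "category_id": 18,
        "segmentation": [[10.0, 10.0, 30.0, 10.0, 30.0, 25.0, 10.0, 25.0]],
        "area": 300.0, "bbox": [10.0, 10.0, 20.0, 15.0], "iscrowd": 0,
    },
    {   # crowd: iscrowd=1, segmentation is an RLE dict with counts and size
        "id": 102, "image_id": 7, "category_id": 1,
        "segmentation": {"counts": [3, 3, 6], "size": [3, 4]},
        "area": 3.0, "bbox": [1.0, 0.0, 1.0, 3.0], "iscrowd": 1,
    },
]


def build_category_index(cats):
    """Map category id -> (name, supercategory)."""
    return {c["id"]: (c["name"], c["supercategory"]) for c in cats}


def segmentation_kind(ann):
    """Return 'polygon' or 'rle' depending on how the mask is stored."""
    seg = ann["segmentation"]
    if isinstance(seg, list):
        return "polygon"
    if isinstance(seg, dict) and "counts" in seg:
        return "rle"
    raise TypeError("unrecognized segmentation format")


cat_index = build_category_index(categories)
for ann in annotations:
    name, supercategory = cat_index[ann["category_id"]]
    print(ann["id"], name, supercategory, segmentation_kind(ann))
```

Checking the type of the segmentation field, rather than trusting iscrowd alone, keeps the helper usable on files where the two do not line up.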

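A brief look at how masks are stored

As mentioned above, the COCO dataset stores masks in two ways: as polygons and as RLE. Polygons are easy to understand: they are just polygons! But what is RLE?

Put simply, RLE (run-length encoding) is a compression method, and probably the first one anyone would think of.

For example, for M = [0,0,0,1,1,1,1,1,1,0,0] the RLE encoding is [3,6,2]. This is the encoding for binary data, which is the variant COCO uses. RLE in general goes well beyond this, but it is not the focus here, so look it up if you want more.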
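The example above can be reproduced with a few lines of plain Python. This is a minimal sketch of binary run-length encoding under the convention that the first count is always a run of zeros; `rle_encode` is an illustrative name, not the function the COCO API uses internally.

```python
def rle_encode(bits):
    """Run-length encode a flat sequence of 0/1 values.

    The counts alternate between runs of zeros and runs of ones and
    always start with the zeros, so a sequence that begins with 1
    gets a leading count of 0.
    """
    counts = []
    current = 0  # value of the run being counted; runs start with zeros
    run = 0
    for b in bits:
        if b == current:
            run += 1
        else:
            counts.append(run)
            current = b
            run = 1
    counts.append(run)
    return counts


print(rle_encode([0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]))  # [3, 6, 2]
```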

The comment in the code says:

# RLE is a simple yet efficient format for storing binary masks. RLE
# first divides a vector (or vectorized image) into a series of piecewise
# constant regions and then for each piece simply stores the length of
# that piece. For example, given M=[0 0 1 1 1 0 1] the RLE counts would
# be [2 3 1 1], or for M=[1 1 1 1 1 1 0] the counts would be [0 6 1]
# (note that the odd counts are always the numbers of zeros). Instead of
# storing the counts directly, additional compression is achieved with a
# variable bitrate representation based on a common scheme called LEB128.

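To explain: RLE splits a binary vector into a series of runs, each with a constant value, and stores only the length of each run. For example, for M=[0 0 1 1 1 0 1] the RLE is [2 3 1 1], and for M=[1 1 1 1 1 1 0] it is [0 6 1]. Note that the counts in odd positions (1st, 3rd, ...) are always numbers of zeros. On top of that, the counts are not stored directly: a variable-bitrate representation based on the common LEB128 scheme gives additional compression.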
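Going the other way, the counts can be expanded back into a mask. The sketch below assumes uncompressed counts (a plain list of integers, not the LEB128-style string) and the column-major pixel order that the COCO mask API uses; `rle_decode` and `rle_to_mask` are illustrative names.

```python
def rle_decode(counts):
    """Expand run lengths into a flat list of 0/1 values (first run is zeros)."""
    bits = []
    value = 0
    for n in counts:
        bits.extend([value] * n)
        value = 1 - value
    return bits


def rle_to_mask(counts, height, width):
    """Rebuild a height x width mask, reading the flat pixels column by column."""
    flat = rle_decode(counts)
    if len(flat) != height * width:
        raise ValueError("counts do not add up to height * width")
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


print(rle_decode([2, 3, 1, 1]))  # [0, 0, 1, 1, 1, 0, 1]
print(rle_decode([0, 6, 1]))     # [1, 1, 1, 1, 1, 1, 0]
for row in rle_to_mask([3, 3, 6], height=3, width=4):
    print(row)                   # each row is [0, 1, 0, 0]
```

The last call shows why the pixel order matters: three consecutive ones fill one column of the 3x4 mask, not part of a row.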

Analysis of the related code

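The COCO API is the official interface: utility code written in Python and C. The mask-related parts are written in C. The code below comes from FastMaskRCNN.

1. Convert img and annotation to TFRecord

import math
from pycocotools.coco import COCO

# annFile (path to the annotation JSON) and split_name are assumed to be
# defined by the surrounding code
# load the annotation file
coco = COCO(annFile)
# load the category information
cats = coco.loadCats(coco.getCatIds())
print('%s has %d images' % (split_name, len(coco.imgs)))
# collect (image id, image info) pairs
imgs = [(img_id, coco.imgs[img_id]) for img_id in coco.imgs]
# work out the sharding, about 2500 images per shard
# (num_shards is 0, and the next line fails, when there are fewer than 2500 images)
num_shards = int(len(imgs) / 2500)
num_per_shard = int(math.ceil(len(imgs) / float(num_shards)))

2. Get the mask and bbox information from COCO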
def _get_coco_masks(coco, img_id, height, width, img_name):
    """ get the masks for all the instances
    Note: some images are not annotated
    Return:
      masks, mxhxw numpy array
      classes, mx1
      bboxes, mx4
    """
    annIds = coco.getAnnIds(imgIds=[img_id], iscrowd=None)
    # assert  annIds is not None and annIds > 0, 'No annotaion for %s' % str(img_id)
    anns = coco.loadAnns(annIds)
    coco.showAnns(anns)
    # assert len(anns) > 0, 'No annotaion for %s' % str(img_id)
    masks = []
    classes = []
    bboxes = []
    mask = np.zeros((height, width), dtype=np.float32)
    segmentations = []
    for ann in anns:
        m = coco.annToMask(ann) # zero one mask
        assert m.shape[0] == height and m.shape[1] == width, \
            'image %s and ann %s dont match' % (img_id, ann)
        masks.append(m)
        cat_id = _cat_id_to_real_id(ann['category_id'])
        classes.append(cat_id)
        bboxes.append(ann['bbox'])
        m = m.astype(np.float32) * cat_id
        mask[m > 0] = m[m > 0]
    masks = np.asarray(masks)
    classes = np.asarray(classes)
    bboxes = np.asarray(bboxes)
    # to x1, y1, x2, y2
    if bboxes.shape[0] <= 0:
        bboxes = np.zeros([0, 4], dtype=np.float32)
        classes = np.zeros([0], dtype=np.float32)
        print ('None Annotations %s' % img_name)
        LOG('None Annotations %s' % img_name)
    bboxes[:, 2] = bboxes[:, 0] + bboxes[:, 2]
    bboxes[:, 3] = bboxes[:, 1] + bboxes[:, 3]
    gt_boxes = np.hstack((bboxes, classes[:, np.newaxis]))
    gt_boxes = gt_boxes.astype(np.float32)
    masks = masks.astype(np.uint8)
    mask = mask.astype(np.uint8)
    assert masks.shape[0] == gt_boxes.shape[0], 'Shape Error'
    return gt_boxes, masks, mask
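Two steps in this function are easy to miss: the bbox goes from [x, y, width, height] to [x1, y1, x2, y2], and the per-instance binary masks are also merged into one label map in which each pixel holds a class id. The sketch below redoes both with plain Python lists on tiny hand-made inputs; `xywh_to_xyxy` and `merge_instance_masks` are illustrative names, and the class ids are assumed to be positive integers.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO-style [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def merge_instance_masks(masks, class_ids, height, width):
    """Paint each binary instance mask into one label map.

    Instances are painted in order, so where two instances overlap the
    later one wins, which is the effect of `mask[m > 0] = m[m > 0]`.
    """
    merged = [[0] * width for _ in range(height)]
    for m, cid in zip(masks, class_ids):
        for r in range(height):
            for c in range(width):
                if m[r][c] and cid > 0:
                    merged[r][c] = cid
    return merged


# Two hypothetical 2x3 instance masks with class ids 5 and 9.
mask_a = [[1, 1, 0],
          [0, 0, 0]]
mask_b = [[0, 1, 1],
          [0, 0, 1]]
label_map = merge_instance_masks([mask_a, mask_b], [5, 9], height=2, width=3)
print(label_map)                         # [[5, 9, 9], [0, 0, 9]]
print(xywh_to_xyxy([10, 20, 30, 40]))    # [10, 20, 40, 60]
```

The merged map loses instance identity wherever instances overlap, which is presumably why the function returns the stacked per-instance masks alongside it.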

An example

import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt

# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['animal'])
imgIds = coco.getImgIds(catIds=catIds)
imgIds = coco.getImgIds(imgIds=[324139])
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]
print(imgIds)
print(img['coco_url'])
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()  # Figure 1
plt.imshow(I); plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
print(len(anns))
masks = []
showonce = True
for ann in anns:
    # print the first polygon segmentation once
    if isinstance(ann['segmentation'], list) and showonce:
        print(ann['segmentation'])
        showonce = False
    # print every segmentation that is not a polygon list (RLE)
    if not isinstance(ann['segmentation'], list):
        print(ann['segmentation'])
    m = coco.annToMask(ann)
    masks.append(m)
print(len(masks))
coco.showAnns(anns)  # Figure 2

15
[[151.06, 113.6, 168.95, 102.49, 182.53, 92.62, 193.64, 80.28, 203.51, 70.4, 208.45, 61.76, 206.6, 53.74, 209.68, 49.42, 220.17, 50.04, 220.79, 55.59, 222.64, 59.3, 222.64, 59.91, 227.58, 66.7, 228.2, 77.19, 228.2, 83.98, 228.2, 87.06, 228.2, 92.0, 227.58, 96.32, 220.79, 101.87, 213.39, 104.96, 205.36, 111.75, 202.9, 113.6, 201.04, 114.22, 202.9, 123.47, 200.43, 129.64, 200.43, 125.94, 198.58, 113.6, 190.55, 111.75, 181.91, 113.6, 168.95, 114.83, 168.95, 114.83, 162.17, 120.39, 157.23, 119.15, 146.74, 117.92, 142.42, 115.45]]
{u'counts': [113441, 1, 423, 6, 427, 7, 3, 1, 422, 8, 3, 1, 421, 9, 2, 2, 420, 10, 2, 1, 421, 10, 1, 2, 419, 12, 1, 2, 418, 15, 419, 15, 418, 16, 418, 16, 418, 15, 419, 15, 418, 16, 417, 16, 418, 16, 418, 15, 419, 14, 419, 14, 419, 14, 420, 13, 420, 13, 421, 11, 422, 11, 423, 10, 423, 10, 424, 9, 425, 8, 426, 7, 427, 6, 427, 5, 429, 5, 429, 5, 429, 5, 429, 5, 428, 5, 429, 5, 428, 6, 428, 6, 428, 6, 428, 5, 429, 5, 429, 5, 408, 7, 14, 5, 407, 9, 13, 5, 406, 11, 11, 5, 407, 12, 9, 6, 406, 13, 9, 6, 404, 15, 8, 7, 402, 18, 7, 7, 401, 20, 6, 6, 401, 21, 6, 6, 400, 34, 399, 35, 399, 12, 1, 22, 398, 13, 3, 19, 399, 12, 6, 17, 399, 12, 8, 15, 399, 12, 10, 13, 400, 10, 13, 10, 401, 10, 15, 8, 400, 11, 17, 5, 401, 10, 20, 2, 402, 11, 423, 11, 423, 11, 423, 11, 423, 12, 422, 13, 422, 13, 421, 14, 421, 14, 420, 15, 420, 14, 420, 15, 420, 15, 419, 16, 419, 16, 418, 17, 418, 17, 418, 17, 418, 17, 417, 18, 4, 4, 409, 24, 411, 23, 412, 22, 412, 22, 411, 23, 411, 22, 412, 22, 412, 22, 413, 21, 413, 20, 414, 20, 414, 19, 416, 18, 416, 17, 417, 17, 417, 17, 417, 16, 419, 14, 420, 14, 421, 13, 422, 11, 424, 10, 425, 8, 428, 4, 112407, 8, 422, 16, 416, 19, 414, 21, 2, 6, 405, 30, 403, 32, 402, 33, 401, 34, 400, 35, 400, 34, 400, 34, 401, 33, 402, 32, 277], u'size': [434, 640]}
15
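The two printed segmentations are read differently. A polygon is a flat list [x1, y1, x2, y2, ...], while the RLE dict stores run lengths in counts and [height, width] in size. The sketch below shows one way to unpack each; the function names are illustrative, the polygon uses the first three points of the output above, and the RLE is a tiny hypothetical one.

```python
def polygon_to_points(flat):
    """Turn a flat [x1, y1, x2, y2, ...] polygon into a list of (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon needs an even number of coordinates")
    return list(zip(flat[0::2], flat[1::2]))


def rle_summary(rle):
    """Check an uncompressed RLE dict against its size and count its pixels.

    Runs alternate starting with zeros, so the counts at index 1, 3, 5, ...
    (0-based) are the foreground runs.
    """
    height, width = rle["size"]
    counts = rle["counts"]
    return {
        "consistent": sum(counts) == height * width,
        "foreground_pixels": sum(counts[1::2]),
    }


points = polygon_to_points([151.06, 113.6, 168.95, 102.49, 182.53, 92.62])
print(points)   # [(151.06, 113.6), (168.95, 102.49), (182.53, 92.62)]

summary = rle_summary({"counts": [3, 3, 6], "size": [3, 4]})
print(summary)  # {'consistent': True, 'foreground_pixels': 3}
```

The same consistency check applies to a real crowd annotation: its counts should add up to height times width.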


Figure 1

Figure 2
