Object Detection: An Improved VOC-to-COCO Converter


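My usual workflow for training an object detector: 1. annotate with labelme; 2. convert the labelme output to VOC and augment both the images and the VOC annotations; 3. convert VOC to COCO; 4. compute the per-channel RGB mean and standard deviation of the images for normalization. The VOC-to-COCO step is especially handy when running open-source projects, so I am recording a script I found online, together with my improvements, for future use.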
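Step 4 of the workflow can be sketched roughly as follows. This is a minimal sketch and not the code I actually use: it assumes the images are already loaded as H x W x 3 uint8 NumPy arrays in RGB order, and `channel_mean_std` is a hypothetical helper name.

```python
import numpy as np


def channel_mean_std(images):
    """Return per-channel (mean, std) over all pixels of all images, scaled to [0, 1]."""
    pixel_count = 0
    channel_sum = np.zeros(3, dtype=np.float64)
    channel_sq_sum = np.zeros(3, dtype=np.float64)
    for img in images:
        # Flatten to (num_pixels, 3) and scale 0..255 to 0..1.
        pixels = img.reshape(-1, 3).astype(np.float64) / 255.0
        pixel_count += pixels.shape[0]
        channel_sum += pixels.sum(axis=0)
        channel_sq_sum += (pixels ** 2).sum(axis=0)
    mean = channel_sum / pixel_count
    # Var[x] = E[x^2] - E[x]^2; clip tiny negative values caused by rounding.
    variance = np.maximum(channel_sq_sum / pixel_count - mean ** 2, 0.0)
    return mean, np.sqrt(variance)


# Two toy "images": one all black, one all white.
black = np.zeros((2, 2, 3), dtype=np.uint8)
white = np.full((2, 2, 3), 255, dtype=np.uint8)
mean, std = channel_mean_std([black, white])
```

Accumulating sums image by image keeps memory use flat, so the whole dataset never has to be held in memory at once.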

Key Words: object detection, VOC to COCO

Beijing, 2020

Author: RaySue

Agile Pioneer

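The script I found online had the following problems, all of which are now fixed:

get_filename_as_int required the file name to be a number, which it used to generate a unique id
No warning was given when the filename in a VOC file lacked an image extension
The generated JSON file was written as one unreadable blob with no indentation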
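The three fixes can be illustrated in isolation. The snippet below is only a sketch of the ideas, not an excerpt from the script: `ensure_image_suffix` and the sample file names are hypothetical.

```python
import json
import os
import warnings


def ensure_image_suffix(filename, default_suffix=".jpg"):
    """Append default_suffix (with a warning) when filename has no .jpg/.png extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in (".jpg", ".png"):
        warnings.warn("%s has no image suffix, assuming %s" % (filename, default_suffix))
        return filename + default_suffix
    return filename


# A running counter gives unique ids for any file name, numeric or not.
names = ["cat_01", "000123.jpg", "street.png"]
images = [
    {"id": image_id, "file_name": ensure_image_suffix(name)}
    for image_id, name in enumerate(names)
]

# indent=4 writes one key per line instead of a single long line.
pretty = json.dumps({"images": images}, indent=4)
print(pretty)
```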

import os
import json
import warnings
import numpy as np
import xml.etree.ElementTree as ET
import glob

START_BOUNDING_BOX_ID = 1
# category_id values are generated from the categories you define here.
# In COCO-style datasets, id 0 is conventionally the background class.
# CenterNet expects categories to start from 0; otherwise heatmap generation raises an error.
PRE_DEFINE_CATEGORIES = {"person": 1, "car": 2}
START_IMAGE_ID = 0

# If necessary, pre-define category and its id
#  PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
#  "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
#  "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
#  "motorbike": 14, "person": 15, "pottedplant": 16,
#  "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}


def get(root, name):
    return root.findall(name)


def get_and_check(root, name, length):
    elems = root.findall(name)
    if len(elems) == 0:
        raise ValueError("Can not find %s in %s." % (name, root.tag))
    if length > 0 and len(elems) != length:
        raise ValueError(
            "The size of %s is supposed to be %d, but is %d."
            % (name, length, len(elems))
        )
    if length == 1:
        elems = elems[0]
    return elems


def get_filename_as_int(filename):
    filename = filename.replace("\\", "/")
    filename = os.path.splitext(os.path.basename(filename))[0]
    try:
        return int(filename)
    except ValueError:
        # Non-numeric name: fall back to a sum of character codes.
        # This is not guaranteed to be unique, which is why convert()
        # uses a running counter for image ids instead of this function.
        image_id = np.array([ord(char) % 10000 for char in filename], dtype=np.int32).sum()
        return int(image_id)


def get_categories(xml_files):
    """Generate category name to id mapping from a list of xml files.

    Arguments:
        xml_files {list} -- A list of xml file paths.

    Returns:
        dict -- category name to id mapping.
    """
    classes_names = []
    for xml_file in xml_files:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall("object"):
            # Look the name up by tag rather than assuming it is the first child.
            classes_names.append(get_and_check(member, "name", 1).text)
    classes_names = list(set(classes_names))
    classes_names.sort()
    return {name: i for i, name in enumerate(classes_names)}


def convert(xml_files, json_file):
    json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
    if PRE_DEFINE_CATEGORIES is not None:
        categories = PRE_DEFINE_CATEGORIES
    else:
        categories = get_categories(xml_files)
    bnd_id = START_BOUNDING_BOX_ID
    image_id = START_IMAGE_ID
    for xml_file in xml_files:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        path = get(root, "path")
        if len(path) == 1:
            filename = os.path.basename(path[0].text)
        elif len(path) == 0:
            filename = get_and_check(root, "filename", 1).text
        else:
            raise ValueError("%d paths found in %s" % (len(path), xml_file))
        ## The filename must be a number
        # image_id = get_filename_as_int(filename)
        size = get_and_check(root, "size", 1)
        width = int(get_and_check(size, "width", 1).text)
        height = int(get_and_check(size, "height", 1).text)
        # Warn and fall back to .jpg when the name has no image suffix.
        if not filename.lower().endswith((".jpg", ".png")):
            filename = filename + ".jpg"
            warnings.warn("filename's default suffix is jpg")
        images = {
            "file_name": filename,  # image file name
            "height": height,
            "width": width,
            "id": image_id,  # image id (unique per image)
        }
        json_dict["images"].append(images)
        ## Currently we do not support segmentation.
        #  segmented = get_and_check(root, 'segmented', 1).text
        #  assert segmented == '0'
        for obj in get(root, "object"):
            category = get_and_check(obj, "name", 1).text
            if category not in categories:
                # Take the next free id; len(categories) could collide with
                # an existing id when the predefined ids start from 1.
                new_id = max(categories.values(), default=-1) + 1
                categories[category] = new_id
            category_id = categories[category]
            bndbox = get_and_check(obj, "bndbox", 1)
            # VOC coordinates are 1-based, so shift the top-left corner by one.
            xmin = int(get_and_check(bndbox, "xmin", 1).text) - 1
            ymin = int(get_and_check(bndbox, "ymin", 1).text) - 1
            xmax = int(get_and_check(bndbox, "xmax", 1).text)
            ymax = int(get_and_check(bndbox, "ymax", 1).text)
            assert xmax > xmin
            assert ymax > ymin
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {
                "area": o_width * o_height,
                "iscrowd": 0,
                "image_id": image_id,  # id of the image this box belongs to (matches "id" in images)
                "bbox": [xmin, ymin, o_width, o_height],
                "category_id": category_id,
                "id": bnd_id,  # one image may have several annotations
                "ignore": 0,
                "segmentation": [],
            }
            json_dict["annotations"].append(ann)
            bnd_id = bnd_id + 1
        image_id += 1
    for cate, cid in categories.items():
        cat = {"supercategory": "none", "id": cid, "name": cate}
        json_dict["categories"].append(cat)
    # os.makedirs("") raises an error, so only create a directory when one is given.
    out_dir = os.path.dirname(json_file)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(json_file, "w") as f:
        json.dump(json_dict, f, indent=4)


if __name__ == "__main__":
    # import argparse
    # parser = argparse.ArgumentParser(
    #     description="Convert Pascal VOC annotation to COCO format."
    # )
    # parser.add_argument("xml_dir", help="Directory path to xml files.", type=str)
    # parser.add_argument("json_file", help="Output COCO format json file.", type=str)
    # args = parser.parse_args()
    # args.xml_dir
    # args.json_file
    xml_dir = "/voc_aug"
    json_file = "./train.json"  # output json
    xml_files = glob.glob(os.path.join(xml_dir, "*.xml"))
    # If you want to do train/test split, you can pass a subset of xml files to convert function.
    print("Number of xml files: {}".format(len(xml_files)))
    convert(xml_files, json_file)
    print("Success: {}".format(json_file))
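After a conversion it can be worth checking that the ids in the output are consistent. The following is a rough sketch of such a check, relying only on the field names the script writes; `check_coco` and the sample dictionary are hypothetical and not part of the script.

```python
import json


def check_coco(coco):
    """Raise ValueError on duplicate ids, dangling references or empty boxes."""
    image_ids = [img["id"] for img in coco["images"]]
    ann_ids = [ann["id"] for ann in coco["annotations"]]
    category_ids = {cat["id"] for cat in coco["categories"]}
    if len(image_ids) != len(set(image_ids)):
        raise ValueError("duplicate image id")
    if len(ann_ids) != len(set(ann_ids)):
        raise ValueError("duplicate annotation id")
    known_images = set(image_ids)
    for ann in coco["annotations"]:
        if ann["image_id"] not in known_images:
            raise ValueError("annotation %d points to an unknown image" % ann["id"])
        if ann["category_id"] not in category_ids:
            raise ValueError("annotation %d has an unknown category" % ann["id"])
        _, _, box_w, box_h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if box_w <= 0 or box_h <= 0:
            raise ValueError("annotation %d has an empty box" % ann["id"])
    return len(image_ids), len(ann_ids)


# Tiny in-memory example with the same fields the script writes.
sample = {
    "images": [{"file_name": "a.jpg", "height": 100, "width": 200, "id": 0}],
    "annotations": [
        {"area": 200, "iscrowd": 0, "image_id": 0, "bbox": [0, 0, 10, 20],
         "category_id": 1, "id": 1, "ignore": 0, "segmentation": []}
    ],
    "categories": [{"supercategory": "none", "id": 1, "name": "person"}],
}
counts = check_coco(sample)
print(counts)
```

For a real file, load it first with `json.load` and pass the resulting dictionary to the same function.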

References

https://github.com/shiyemin/voc2coco/blob/master/voc2coco.py

https://www.cnblogs.com/leebxo/p/10607955.html#%E5%B0%86labelme%E7%9A%84json%E8%BD%AC%E6%88%90coco%E6%A0%BC%E5%BC%8Fjson
