Background

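The code below converts the INRIA dataset to VOC format. There are 614 images.
The Pos images and Annotations under Train in the OinginImage folder are used for training, and the Pos images under Test are used for testing.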
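One way to record such a train/test split in a VOC-style layout is a plain text list of image names per split. The following is only a minimal sketch: the helper name write_image_set, the accepted image extensions, and the paths in the usage note are assumptions for illustration and do not come from the original post.

```python
import os

def write_image_set(image_dir, list_path):
    """Write the extension-less names of the images in image_dir, one per line.

    Hypothetical helper: VOC-style tools commonly read such lists from
    ImageSets/Main/train.txt and test.txt.
    """
    names = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.lower().endswith(('.png', '.jpg', '.jpeg'))
    )
    with open(list_path, 'w') as f:
        for name in names:
            f.write(name + '\n')
    return names
```

For example, write_image_set('Train/pos', 'train.txt') and write_image_set('Test/pos', 'test.txt') would produce one list per split, assuming those directories exist under the dataset root.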

# -*- coding: UTF-8 -*-
from xml.dom.minidom import Document
import os
import re

file_list = os.listdir("annotations")
savePath = 'Annotations'

for oldfilename in file_list:
    # Only the .txt annotation files are converted
    if ".txt" not in oldfilename:
        continue
    print(oldfilename)
    fileindex = re.findall(r'\d+', oldfilename)
    print(fileindex)
    print(str(int(fileindex[0])))
    newfilename = os.path.splitext(oldfilename)[0] + ".xml"
    f = open(os.path.join("annotations", oldfilename), "r")
    print('processing:' + f.name)

    # Root <annotation> element
    doc = Document()
    annotation = doc.createElement('annotation')
    doc.appendChild(annotation)

    folder = doc.createElement('folder')
    folder.appendChild(doc.createTextNode('VOC2007'))
    annotation.appendChild(folder)

    # Stores the annotation file name (.txt); VOC tools usually expect
    # the image file name here
    filename = doc.createElement('filename')
    filename.appendChild(doc.createTextNode(oldfilename))
    annotation.appendChild(filename)

    source = doc.createElement('source')
    annotation.appendChild(source)
    database = doc.createElement('database')
    database.appendChild(doc.createTextNode('PASperson Database'))
    source.appendChild(database)
    annotation1 = doc.createElement('annotation')
    annotation1.appendChild(doc.createTextNode('PASperson'))
    source.appendChild(annotation1)

    fr = f.readlines()  # readlines() reads the whole file into a list of lines

    # First pass: image size and number of objects
    for line in fr:
        if "size" in line:
            sizes = re.findall(r'\d+', line)
            size = doc.createElement('size')
            annotation.appendChild(size)
            width = doc.createElement('width')
            width.appendChild(doc.createTextNode(sizes[0]))
            size.appendChild(width)
            height = doc.createElement('height')
            height.appendChild(doc.createTextNode(sizes[1]))
            size.appendChild(height)
            depth = doc.createElement('depth')
            depth.appendChild(doc.createTextNode(sizes[2]))
            size.appendChild(depth)
            segmented = doc.createElement('segmented')
            segmented.appendChild(doc.createTextNode('0'))
            annotation.appendChild(segmented)
        if 'Objects' in line:
            nums = re.findall(r'\d+', line)
            break

    # Second pass: one <object> per bounding box
    for index in range(1, int(nums[0]) + 1):
        # The trailing space keeps object 1 from also matching object 10, 11, ...
        marker = "Bounding box for object " + str(index) + " "
        for line in fr:
            if marker in line:
                coordinate = re.findall(r'\d+', line)
                obj = doc.createElement('object')
                annotation.appendChild(obj)
                name = doc.createElement('name')
                name.appendChild(doc.createTextNode('person'))
                obj.appendChild(name)
                pose = doc.createElement('pose')
                pose.appendChild(doc.createTextNode('Unspecified'))
                obj.appendChild(pose)
                truncated = doc.createElement('truncated')
                truncated.appendChild(doc.createTextNode('0'))
                obj.appendChild(truncated)
                difficult = doc.createElement('difficult')
                difficult.appendChild(doc.createTextNode('0'))
                obj.appendChild(difficult)
                bndbox = doc.createElement('bndbox')
                obj.appendChild(bndbox)
                # The matched numbers include the object index,
                # so the coordinates start at index 1
                xmin = doc.createElement('xmin')
                xmin.appendChild(doc.createTextNode(coordinate[1]))
                bndbox.appendChild(xmin)
                ymin = doc.createElement('ymin')
                ymin.appendChild(doc.createTextNode(coordinate[2]))
                bndbox.appendChild(ymin)
                xmax = doc.createElement('xmax')
                xmax.appendChild(doc.createTextNode(coordinate[3]))
                bndbox.appendChild(xmax)
                ymax = doc.createElement('ymax')
                ymax.appendChild(doc.createTextNode(coordinate[4]))
                bndbox.appendChild(ymax)
    f.close()

    # Write the VOC XML file next to the other converted annotations
    f = open(os.path.join(savePath, newfilename), 'w')
    f.write(doc.toprettyxml(indent="\t"))
    f.close()
    print(str(fileindex) + " complete")

print('process complete')
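The script relies on re.findall returning every run of digits in a line, which is why the bounding-box coordinates start at index 1 (index 0 is the object number). The sketch below isolates that parsing step. It assumes the INRIA annotation lines follow the PASCAL v1.00 text layout; the sample lines and their numbers are made up for illustration, and parse_annotation is a hypothetical helper, not part of the original script.

```python
import re

# Hypothetical sample lines in the INRIA (PASCAL v1.00) annotation style;
# the numbers are invented for illustration only.
SAMPLE = [
    'Image size (X x Y x C) : 594 x 720 x 3',
    'Objects with ground truth : 1 { "PASperson" }',
    'Bounding box for object 1 "PASperson" (Xmin, Ymin) - (Xmax, Ymax) : (194, 127) - (413, 647)',
]

def parse_annotation(lines):
    """Return (size, boxes): size is (width, height, depth),
    boxes is a list of (xmin, ymin, xmax, ymax) tuples."""
    size = None
    boxes = []
    for line in lines:
        digits = re.findall(r'\d+', line)
        if 'Image size' in line:
            size = tuple(int(d) for d in digits[:3])
        elif line.startswith('Bounding box for object'):
            # digits[0] is the object index, so the box is digits[1:5]
            boxes.append(tuple(int(d) for d in digits[1:5]))
    return size, boxes

size, boxes = parse_annotation(SAMPLE)
print(size, boxes)
```

Returning integers rather than strings makes it easy to sanity-check a box (for instance, that xmin < xmax) before writing it into the XML.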
