[Caffe-Ubuntu] Generating Your Own Caffe LMDB Data Files from JSON Labels

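0: Workflow for generating the LMDB

Start from an existing JSON dataset, either annotated with an open-source tool such as labelme or generated by your own script.
Convert the JSON files to VOC2007 format (labelme format to the VOC2007 dataset format).
Set up your own labelmap.prototxt.
Use the create_list.sh script from ssd-caffe to generate the list files that the conversion step reads.
Use the create_data.sh script from ssd-caffe to generate the LMDB files.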

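1: Building your own dataset

A few annotation tools worth recommending:
labelme: easy to install, supports keypoint and segmentation annotation, very pleasant to use, and writes label files in JSON format.
Format: all of the label fields listed below are required; otherwise labelme cannot read the file properly.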

{
  "shapes": [
    {"shape_type": "polygon", "line_color": null, "points": [[634, 276], [703, 275], [705, 312], [635, 313]], "fill_color": null, "label": "traffic-4"},
    {"shape_type": "polygon", "line_color": null, "points": [[715, 275], [785, 274], [786, 313], [716, 312]], "fill_color": null, "label": "traffic-4-occ-largely"}
  ],
  "lineColor": [0, 255, 0, 128],
  "imagePath": "2012-3-23_20-23-25_0.jpg",
  "fillColor": [255, 0, 0, 128],
  "imageData": null
}
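The conversion in the next section only needs each shape's label and the bounding box that encloses its polygon. A minimal sketch of that step, using only the standard library and the first shape from the sample above (the helper names `polygon_to_bbox` and `read_boxes` are made up for this illustration):

```python
import json

# One shape taken from the labelme sample above.
SAMPLE = """
{"shapes": [{"shape_type": "polygon", "line_color": null,
             "points": [[634, 276], [703, 275], [705, 312], [635, 313]],
             "fill_color": null, "label": "traffic-4"}],
 "lineColor": [0, 255, 0, 128],
 "imagePath": "2012-3-23_20-23-25_0.jpg",
 "fillColor": [255, 0, 0, 128],
 "imageData": null}
"""


def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) enclosing a list of [x, y] points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def read_boxes(json_text):
    """Return a list of (label, bbox) pairs, one per shape."""
    annotation = json.loads(json_text)
    return [(shape["label"], polygon_to_bbox(shape["points"]))
            for shape in annotation["shapes"]]


boxes = read_boxes(SAMPLE)
print(boxes)  # [('traffic-4', (634, 275, 705, 313))]
```

A box whose xmax is not greater than xmin (or ymax not greater than ymin) is degenerate and should be skipped before it is written to XML.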

labelImg: easy to install, a very convenient tool for drawing bounding boxes.
It supports XML labels in PASCAL VOC format.
Others: to be continued...

2: JSON to VOC2007

# -*- coding: utf-8 -*-
import os
import json
import cv2
import numpy as np
import codecs
from glob import glob
import shutil
from sklearn.model_selection import train_test_split


def iter_files(data_root_path, saved_path):
    """Walk data_root_path and convert every labelme .json file found."""
    count = 0
    # os.walk already descends into every subdirectory, so no manual recursion is needed.
    for root, dirs, files in os.walk(data_root_path):
        for json_file in files:
            if json_file.endswith(".json"):
                file_name = json_file[:-len(".json")]
                file_path = os.path.join(root, json_file)
                count += 1
                print("=" * 68)
                print(count)
                print(file_path)
                # JSON to VOC2007
                json2voc2007(file_name, root, saved_path)


def json2voc2007(json_file_, labelme_path, saved_path):
    """Write the VOC XML for one labelme file and copy its image."""
    json_filename = os.path.join(labelme_path, json_file_ + ".json")
    with open(json_filename, "r") as f:
        json_file = json.load(f)
    image_path = os.path.join(labelme_path, json_file_ + ".jpg")
    height, width, channels = cv2.imread(image_path).shape
    with codecs.open(saved_path + "Annotations/" + json_file_ + ".xml", "w", "utf-8") as xml:
        xml.write('<annotation>\n')
        xml.write('\t<folder>' + 'TrafficSign' + '</folder>\n')
        xml.write('\t<filename>' + json_file_ + ".jpg" + '</filename>\n')
        xml.write('\t<source>\n')
        xml.write('\t\t<database>The UAV autolanding</database>\n')
        xml.write('\t\t<annotation>UAV AutoLanding</annotation>\n')
        xml.write('\t\t<image>flickr</image>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t</source>\n')
        xml.write('\t<owner>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t\t<name>TrafficSign</name>\n')
        xml.write('\t</owner>\n')
        xml.write('\t<size>\n')
        xml.write('\t\t<width>' + str(width) + '</width>\n')
        xml.write('\t\t<height>' + str(height) + '</height>\n')
        xml.write('\t\t<depth>' + str(channels) + '</depth>\n')
        xml.write('\t</size>\n')
        xml.write('\t<segmented>0</segmented>\n')
        for multi in json_file["shapes"]:
            label = multi["label"]
            # This "if" is a label filter I added; edit it to keep the labels you need.
            if label == "traffic-3" or \
                    label == "traffic-3-occ-partially":
                points = np.array(multi["points"])
                # Cast to int so the XML holds plain integers even if labelme stored floats.
                xmin = int(points[:, 0].min())
                xmax = int(points[:, 0].max())
                ymin = int(points[:, 1].min())
                ymax = int(points[:, 1].max())
                # Skip degenerate boxes with zero width or height.
                if xmax <= xmin or ymax <= ymin:
                    continue
                xml.write('\t<object>\n')
                xml.write('\t\t<name>' + label + '</name>\n')
                xml.write('\t\t<pose>Unspecified</pose>\n')
                xml.write('\t\t<truncated>1</truncated>\n')
                xml.write('\t\t<difficult>0</difficult>\n')
                xml.write('\t\t<bndbox>\n')
                xml.write('\t\t\t<xmin>' + str(xmin) + '</xmin>\n')
                xml.write('\t\t\t<ymin>' + str(ymin) + '</ymin>\n')
                xml.write('\t\t\t<xmax>' + str(xmax) + '</xmax>\n')
                xml.write('\t\t\t<ymax>' + str(ymax) + '</ymax>\n')
                xml.write('\t\t</bndbox>\n')
                xml.write('\t</object>\n')
                print(json_filename, xmin, ymin, xmax, ymax, label)
        xml.write('</annotation>')
    # 5. Copy the image to VOC2007/JPEGImages/
    print("copy image files to VOC2007/JPEGImages/")
    shutil.copyfile(image_path, saved_path + "JPEGImages/" + json_file_ + ".jpg")


def split_dataset(saved_path):
    """6. Write trainval.txt, train.txt, val.txt and an empty test.txt."""
    txtsavepath = saved_path + "ImageSets/Main/"
    # Read the annotations from the same directory they were written to.
    total_files = glob(saved_path + "Annotations/*.xml")
    total_files = [os.path.basename(i)[:-len(".xml")] for i in total_files]
    # test_size sets the train:val split ratio
    train_files, val_files = train_test_split(total_files, test_size=0.10, random_state=42)
    for list_name, names in (("trainval.txt", total_files),
                             ("train.txt", train_files),
                             ("val.txt", val_files),
                             ("test.txt", [])):  # test.txt stays empty: no separate test set here
        with open(txtsavepath + list_name, "w") as f:
            for name in names:
                f.write(name + "\n")


def main():
    # 1. Output path
    saved_path = "./VOC2007/"
    # 2. Create the required folders
    for sub_dir in ("Annotations", "JPEGImages/", "ImageSets/Main/"):
        if not os.path.exists(saved_path + sub_dir):
            os.makedirs(saved_path + sub_dir)
    data_root_path = "./data/"
    # 3. Walk every directory level and convert each JSON file
    iter_files(data_root_path, saved_path)
    # Split once, after all annotations have been written
    split_dataset(saved_path)


if __name__ == '__main__':
    main()
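Before moving on to the LMDB step it is worth reading a few generated annotations back. The following is a rough sketch using only the standard library; the sample XML mimics the layout the script writes, while the file name, image size, and the helper name `check_annotation` are placeholders invented for this example:

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the layout written by the conversion script.
SAMPLE_XML = """<annotation>
  <folder>TrafficSign</folder>
  <filename>example.jpg</filename>
  <size>
    <width>1280</width>
    <height>720</height>
    <depth>3</depth>
  </size>
  <object>
    <name>traffic-3</name>
    <bndbox>
      <xmin>634</xmin>
      <ymin>275</ymin>
      <xmax>705</xmax>
      <ymax>313</ymax>
    </bndbox>
  </object>
</annotation>"""


def check_annotation(xml_text):
    """Return a list of (name, box) for boxes that are empty or outside the image."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    problems = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        xmin, ymin, xmax, ymax = [
            int(obj.findtext("bndbox/" + tag))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            problems.append((name, (xmin, ymin, xmax, ymax)))
    return problems


print(check_annotation(SAMPLE_XML))  # [] means every box looks sane
```

To check real output, read each file under VOC2007/Annotations/ and pass its text to the function.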

3: labelmap.prototxt settings (using two classes, background and target, as an example)

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "face"
  label: 1
  display_name: "face"
}
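With more than a couple of classes, writing the file by hand gets error-prone. Below is a small sketch that produces the same text from a list of class names; `make_labelmap` is a hypothetical helper, and it assumes label 0 is reserved for the background entry as in the example above. As far as I understand, each `name` should match the `<name>` values in the XML annotations, so pass the same label strings the conversion script keeps.

```python
def make_labelmap(class_names):
    """Return labelmap.prototxt text: background as label 0, then one item per class."""
    items = [("none_of_the_above", 0, "background")]
    items += [(name, index, name) for index, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display_name in items:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (name, label, display_name)
        )
    return "".join(blocks)


labelmap_text = make_labelmap(["face"])
print(labelmap_text)
```

Write `labelmap_text` to labelmap.prototxt in the directory that create_data.sh points its mapfile variable at.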

4: create_list.sh

The main thing to watch here: change root_dir to the path of your own VOC2007 directory.

#!/bin/bash
root_dir=$HOME/data/VOC2007
sub_dir=ImageSets/Main
echo $(dirname "${BASH_SOURCE[0]}")
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# List each split that needs a list file; create_data.sh reads train.txt.
for dataset in train val
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  echo "Create list for $dataset..."
  echo $root_dir/$sub_dir/$dataset.txt
  dataset_file=$root_dir/$sub_dir/$dataset.txt

  # Turn each image name into /JPEGImages/<name>.jpg
  img_file=$bash_dir/$dataset"_img.txt"
  cp $dataset_file $img_file
  sed -i "s/^/\/JPEGImages\//g" $img_file
  sed -i "s/$/.jpg/g" $img_file

  # Turn each image name into /Annotations/<name>.xml
  label_file=$bash_dir/$dataset"_label.txt"
  cp $dataset_file $label_file
  sed -i "s/^/\/Annotations\//g" $label_file
  sed -i "s/$/.xml/g" $label_file

  # Join the two columns into "<image path> <annotation path>"
  paste -d' ' $img_file $label_file >> $dst_file

  rm -f $label_file
  rm -f $img_file

  # Generate image name and size information.
  if [ $dataset == "val" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi

  # Shuffle the train file.
  if [ $dataset == "train" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' > $rand_file
    mv $rand_file $dst_file
  fi
done
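The cp, sed, and paste steps in the script boil down to one line per image: the image path and the annotation path, both relative to root_dir and separated by a space. A tiny Python sketch of the same transformation, for checking what the list file should contain (`make_list_lines` is a name invented for this example, and the image names are placeholders):

```python
def make_list_lines(names):
    """Map image names (no extension) to '<image path> <annotation path>' lines."""
    return ["/JPEGImages/%s.jpg /Annotations/%s.xml" % (name, name) for name in names]


# Contents of a hypothetical ImageSets/Main/val.txt, one name per line.
val_txt = "img_0001\nimg_0002\n"
lines = make_list_lines(val_txt.split())
for line in lines:
    print(line)
# /JPEGImages/img_0001.jpg /Annotations/img_0001.xml
# /JPEGImages/img_0002.jpg /Annotations/img_0002.xml
```

If a line in the generated val.txt or train.txt does not look like this, the LMDB step will most likely fail to find the image or its annotation.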

5: create_data.sh

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
echo $cur_dir
root_dir=$cur_dir/../..

cd $root_dir

# Set redo=1 to overwrite an existing LMDB.
redo=1
data_root_dir="$HOME/data/VOC2007"
dataset_name="DataName"
mapfile="$root_dir/data/$dataset_name/labelmap.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ "$redo" = "1" ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in train
do
  sudo python2 $root_dir/scripts/create_annoset.py \
    --anno-type=$anno_type \
    --label-map-file=$mapfile \
    --min-dim=$min_dim \
    --max-dim=$max_dim \
    --resize-width=$width \
    --resize-height=$height \
    --check-label $extra_cmd \
    $data_root_dir \
    $root_dir/data/$dataset_name/$subset.txt \
    $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db \
    examples/$dataset_name
done
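When the script fails it helps to see exactly which paths it passes to create_annoset.py. The sketch below rebuilds the same argument list in Python without running anything; every flag and path pattern is copied from the script above, while the function name `build_annoset_command` and the example directories are assumptions for illustration.

```python
import os


def build_annoset_command(root_dir, data_root_dir, dataset_name, subset,
                          db="lmdb", redo=True):
    """Return the create_annoset.py argument list that create_data.sh would run."""
    extra = ["--encode-type=jpg", "--encoded"]
    if redo:
        extra.append("--redo")
    mapfile = os.path.join(root_dir, "data", dataset_name, "labelmap.prototxt")
    list_file = os.path.join(root_dir, "data", dataset_name, subset + ".txt")
    out_dir = os.path.join(data_root_dir, dataset_name, db,
                           "%s_%s_%s" % (dataset_name, subset, db))
    return [
        "python2", os.path.join(root_dir, "scripts", "create_annoset.py"),
        "--anno-type=detection",
        "--label-map-file=" + mapfile,
        "--min-dim=0", "--max-dim=0",
        "--resize-width=0", "--resize-height=0",
        "--check-label",
    ] + extra + [data_root_dir, list_file, out_dir, "examples/" + dataset_name]


# Example directories are placeholders; substitute your own checkout and data paths.
cmd = build_annoset_command("/opt/caffe", "/home/user/data/VOC2007", "DataName", "train")
print(" ".join(cmd))
```

Checking that the labelmap path, the list file, and the output directory in the printed command all exist (or can be created) usually narrows down a failing run quickly.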
