本博客还有多个超详细综述,感兴趣的朋友可以移步:

卷积神经网络:卷积神经网络超详细介绍

目标检测:目标检测超详细介绍

语义分割:语义分割超详细介绍

NMS:让你一文看懂且看全 NMS 及其变体

数据增强:一文看懂计算机视觉中的数据增强

损失函数:分类检测分割中的损失函数和评价指标

Transformer:A Survey of Visual Transformers

机器学习实战系列:决策树

YOLO 系列:v1、v2、v3、v4、scaled-v4、v5、v6、v7、yolof、yolox、yolos、yolop


文章目录

  • 一、mmsegmentation简介
  • 二、Cityscape数据集简介
    • 2.1 数据结构
    • 2.2 标注样例
  • 三、把自己的数据集变成Cityscape格式
    • 3.1 将用labelme标好的数据转换为训练可用数据
    • 3.2 重命名
    • 3.3 xml转json
  • 四、训练和测试
    • 4.1 改数据集路径名称等
    • 4.2 训练
    • 4.3 测试
    • 4.4 demo
  • 五、训练技巧
    • 5.1 不同类别的 loss 权重设置

一、mmsegmentation简介

github 链接:https://github.com/open-mmlab/mmsegmentation

二、Cityscape数据集简介

2.1 数据结构

1、数据集结构:

  • Images_base: leftImg8bit (5030items, totalling 11.6GB, factually 5000items)
  • Annotations_base: gtFine (30030 items, totalling 1.1GB)

2、图片数量

  • 5000张精细标注

    • 2975张训练图,500张训练图,1525张测试图
  • 20000张粗糙标注(使用多边形覆盖单个对象)

3、图像大小:

  • 1024x2048

4、数据场景:

  • 50个不同城市的街景
  • train/val/test的城市都不同

5、类别定义:

  • *:是针对单个实例进行标注的,如果同一类别的多个物体交叉分布,也即实例边界不明显,这些物体组成一个单一实例的group,如 car/bicycle group.
  • +: 表示的label目前还没有包含在任何的评估项中,treated as void;所以去掉这些label,一般我们说CityScapes包含19类。

6、标注方式

Cityscape有自己的标注方法:cityscapescripts
https://github.com/mcordts/cityscapesScripts/tree/master/cityscapesscripts

注:在训练之前,一定要看一下trainlabelid.png里边是不是19个类,最好用cityscapesscripts/helpers/labels.py重新生成一遍会比较保险。

cityscapesScripts/helpers/labels.py文件中定义了不同类别和Id值的对应方式,颜色等,如下图:

  • 总类别:34类
  • 现可用类别:19类
  • 标注文件类型:CityScapes 的 label 不是集中在一个json,而是每张图片对应了4个label文件。
    • color.png: 每个颜色对应一个类别

    • instanceIds.png:用于实例分割

    • labelIds.png:标注label,共34类(0-33),背景为0

    • TrainLabelId.png:训练用的label,(19个训练的类别+1个ignore),将不参与训练的像素值置为255,其余参与训练的类别像素值为0-18

    • polygons.json

  • _gtFine_polygons.json存储的是各个类和相应的区域(用多边形顶点的位置表示区域的边界);

2.2 标注样例






三、把自己的数据集变成Cityscape格式

3.1 将用labelme标好的数据转换为训练可用数据

from pathlib import Path
import os
import glob
import cv2
import png
import numpy as np
import shutil
from PIL import Image# set add_4 and add_15 folder
seg_folder = Path('./20210311_ground_mask_part1/')seg_folder_TrainID = Path(os.path.join(seg_folder,"TrainID"))
seg_folder_img = Path(os.path.join(seg_folder,"img"))
seg_folder_LabelID = Path(os.path.join(seg_folder,"LabelID"))
seg_folder_color = Path(os.path.join(seg_folder,"color"))if not seg_folder_img.exists():os.mkdir(seg_folder_img)
if not seg_folder_LabelID.exists():os.mkdir(seg_folder_LabelID)
if not seg_folder_TrainID.exists():os.mkdir(seg_folder_TrainID)
if not seg_folder_color.exists():os.mkdir(seg_folder_color)LabelID_glob = glob.glob('./20210311_ground_mask_part1/20210222_TJP_freespace_ss/20210222_TJP_freespace_ss_label/*.png')
TrainID_glob = glob.glob('./20210311_ground_mask_part1/20210222_TJP_freespace_ss/20210222_TJP_freespace_ss_label/*.png')
Img_glob = glob.glob('./20210311_ground_mask_part1/20210222_TJP_freespace_ss/20210222_TJP_freespace_ss_extract/*.jpg')
Color_glob = glob.glob('./20210311_ground_mask_part1/20210222_TJP_freespace_ss/20210222_TJP_freespace_ss_label/*.png')
# assert(len(LabelID_glob)==len(Img_glob))print("len for lable glob",len(LabelID_glob))# ******************* TrainID process ****************************
print("begin to process TrainID")
for k in range(len(LabelID_glob)):transfer_ori = Image.open(TrainID_glob[k])transfer_ground = np.array(transfer_ori)transfer_ground[transfer_ground == 0] = 255  # ignoretransfer_ground[transfer_ground == 1] = 0   # freespacetransfer_ground[transfer_ground == 2] = 1  # white solid lane linetransfer_ground[transfer_ground == 3] = 2  # white dotted lane line# transfer_ground[transfer_ground == 4] = 3  # yellow solid lane line# transfer_ground[transfer_ground == 5] = 4  # yellow dotted lane linetransfer_ground[transfer_ground == 6] = 3  # arrowtransfer_ground[transfer_ground == 7] = 4  # diamond_signtransfer_ground[transfer_ground == 8] = 5  # zebra crossingtransfer_ground[transfer_ground == 9] = 6  # stop linetransfer_ground_img = Image.fromarray(transfer_ground)transfer_ground_img = transfer_ground_img.resize((2048, 1024))transfer_ori_path = os.path.join(seg_folder_TrainID,TrainID_glob[k].split('/')[-1].split('\\')[1])transfer_ground_img.save(transfer_ori_path)print("the {0} th TrainID img has been processed and save in folder".format(k))
#
# # ******************* LableID process ****************************
print("begin to process LableID")
for k in range(len(LabelID_glob)):transfer_ori = Image.open(TrainID_glob[k])transfer_ground = np.array(transfer_ori)transfer_ground[transfer_ground == 0] = 0  # ignoretransfer_ground[transfer_ground == 1] = 1  # freespacetransfer_ground[transfer_ground == 2] = 2  # white solid lane linetransfer_ground[transfer_ground == 3] = 3  # white dotted lane line# transfer_ground[transfer_ground == 4] = 4  # yellow solid lane line# transfer_ground[transfer_ground == 5] = 5  # yellow dotted lane linetransfer_ground[transfer_ground == 6] = 4  # arrowtransfer_ground[transfer_ground == 7] = 5  # diamond_signtransfer_ground[transfer_ground == 8] = 6  # zebra crossingtransfer_ground[transfer_ground == 9] = 7  # stop linetransfer_ground_img = Image.fromarray(transfer_ground)transfer_ground_img = transfer_ground_img.resize((2048, 1024))transfer_ori_path = os.path.join(seg_folder_TrainID, TrainID_glob[k].split('/')[-1].split('\\')[1])transfer_ground_img.save(transfer_ori_path)print("the {0} th LabelID img has been processed and save in folder".format(k))# # ******************** resize img ***********************************
for k in range(len(Img_glob)):print("copy the {0}th img to add img folder".format(k))src_img = Image.open(Img_glob[k])src_img = src_img.resize((2048, 1024))src_img_save_path = os.path.join(seg_folder_img,Img_glob[k].split('/')[-1].split('\\')[1].split('.')[0])src_img.save(src_img_save_path+'.png')
#
# ## ********************* resize color png *****************************
for k in range(len(Color_glob)):print("copy the {0}th img to color folder".format(k))src_img = Image.open(Color_glob[k])src_img = src_img.resize((2048,1024))color_img_save_path = os.path.join(seg_folder_color,Color_glob[k].split('/')[-1].split('\\')[1].split('.')[0])src_img.save(color_img_save_path+'.png')

3.2 重命名

import os
import glob
import shutil
from pathlib import Pathimg_path = './img/'
TrainID_path = './TrainID/'
LabelID_path = './LabelID/'
color_path = './color/'
gtFine_path = './gtFine/'
leftImg8bit_path = './leftImg8bit/'if not Path(gtFine_path).exists():os.mkdir(gtFine_path)
if not Path(leftImg8bit_path).exists():os.mkdir(leftImg8bit_path)img_files = os.listdir(img_path)
TrainID_files = os.listdir(TrainID_path)
LabelID_files = os.listdir(LabelID_path)
color_files = os.listdir(color_path)m = 0
for file in color_files:#import pdb;pdb.set_trace()old = color_path + os.sep + color_files[m]filename = os.path.splitext(file)[0]new = gtFine_path + 'part1_' + filename + '_gtFine_color.png'shutil.move(old, new)print('rename {}th color files'.format(m))m+=1i = 0
for file in img_files:#import pdb;pdb.set_trace()old = img_path + os.sep + img_files[i]filename = os.path.splitext(file)[0]new = leftImg8bit_path+ 'part1_' + filename + '_leftImg8bit.png'shutil.move(old, new)print('rename {}th img files'.format(i))i+=1
j = 0
for file in TrainID_files:# import pdb;pdb.set_trace()old = TrainID_path + os.sep + TrainID_files[j]filename = os.path.splitext(file)[0]new = gtFine_path + 'part1_' + filename + '_gtFine_labelTrainIds.png'shutil.move(old, new)print('rename {}th trainid files'.format(j))j += 1
k = 0
for file in LabelID_files:# import pdb;pdb.set_trace()old = LabelID_path + os.sep + LabelID_files[k]filename = os.path.splitext(file)[0]new = gtFine_path + 'part1_' + filename + '_gtFine_labelIds.png'shutil.move(old, new)print('rename {}th labelid files'.format(k))k += 1
  • color

  • labeltrainid

  • labelid

3.3 xml转json

import json
import xmltodict
import glob
import osxml_list = glob.glob('./20210222_TJP_freespace_ss_xml/*.xml') #xml文件的路径'''json to xml'''
def json_to_xml(json_str):# xmltodict库的unparse()json转xml# 参数pretty 是格式化xmlxml_str = xmltodict.unparse(json_str, pretty=1, root='shapes')return xml_str'''xml to json'''
def xml_to_json(xml_str):# parse是的xml解析器xml_parse = xmltodict.parse(xml_str)# json库dumps()是将dict转化成json格式,loads()是将json转化成dict格式。# dumps()方法的ident=1,格式化jsonjson_str = json.dumps(xml_parse, indent=1)return json_strfor xml_path in xml_list:if os.path.exists(xml_path):with open(xml_path, 'r') as f1:xmlfile = f1.read()print('---------xml文件-----------')print(xmlfile)print('---------json文件----------')print(xml_to_json(xmlfile))with open(xml_path[:-3]+'json','w') as newfile:newfile.write(xml_to_json(xmlfile))print('--------写入json文件--------')print('写入xml.json文件成功')

四、训练和测试

4.1 改数据集路径名称等

注意:首先编译路径 python setup.py develop

1、修改数据集路径

mmseg/datasets/cityscapes.py

2、修改类别数量

config/_base_/ocrnet_hr18.py

mmseg/datasets/cityscapes.py

mmseg/core/class_names.py


3、使用labelid.png作为训练的label,即训练freespace和background两个类别:

如果只训练freespace这一类,而把background按照通常的方法作为ignore不参与训练的话,可视化的时候就会发现背景全部学成了freespace这一个类,因为分割中的忽略其实是不关注的像素点,也就是说没有人关心它们被预测成了什么类别,但如果图像中有大面积的ignore,也就是有大面积的错类别,这样的模型是无法使用的,所以这里使用labelid.png作为训练的label,即训练freespace和background两个类别,这样训练的效果会很好。

mmseg/datasets/cityscapes.py

4、修改训练、测试、验证数据

mmseg/datasets/cityscapes.py

5、修改每个GPU的样本数量(batch)

configs/_base_/datasets/cityscapes.py

6、分布式训练/非分布式训练的 BN 修改

config/_base_/models/ocrnet_hr18.py


7、修改迭代次数和保存模型的间隔

config/_base_/schedules/schedule_160k.py

8、在哪里看每次训练的数据:

mmseg/datasets/builder.py


datasets[0].keys()
>>>
dict_keys(['img_metas', 'img', 'gt_semantic_seg'])

4.2 训练

1、单卡训练

python tools/train.py ${CONFIG_FILE} [optional arguments]

2、多卡训练

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
python -m torch.distributed.launch --nproc_per_node=2 --master_port=29003  tools/train.py --config configs/ocrnet/ocrnet_hr18s_512x1024_40k_cityscapes.py --launcher pytorch --work_dir work_dir

如果报错如下,解决方法:修改master_port 的值即可

subprocess.CalledProcessError: Command '[xxx]' returned non-zero exit status 1.
Optional arguments are:
--no-validate (not suggested): By default, the codebase will perform evaluation at every k iterations during the training. To disable this behavior, use --no-validate.
--work-dir ${WORK_DIR}: Override the working directory specified in the config file.
--resume-from ${CHECKPOINT_FILE}: Resume from a previous checkpoint file (to continue the training process).
--load-from ${CHECKPOINT_FILE}: Load weights from a checkpoint file (to start finetuning for another task).
Difference between resume-from and load-from:
resume-from loads both the model weights and optimizer state including the iteration number.
load-from loads only the model weights, starts the training from iteration 0.

4.3 测试

# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]# save test result at dir
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--show-dir result]# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
Optional arguments:RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., mIoU is available for all dataset. Cityscapes could be evaluated by cityscapes as well as standard mIoU metrics.
--show: If specified, segmentation results will be plotted on the images and shown in a new window. It is only applicable to single GPU testing and used for debugging and visualization. Please make sure that GUI is available in your environment, otherwise you may encounter the error like cannot connect to X server.
--show-dir: If specified, segmentation results will be plotted on the images and saved to the specified directory. It is only applicable to single GPU testing and used for debugging and visualization. You do NOT need a GUI available in your environment for using this option.

Examples:

Assume that you have already downloaded the checkpoints to the directory checkpoints/.

  • Test PSPNet and visualize the results. Press any key for the next image.
python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \--show
  • Test PSPNet and save the painted images for latter visualization.
python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \--show-dir psp_r50_512x1024_40ki_cityscapes_results
  • Test PSPNet on PASCAL VOC (without saving the test results) and evaluate the mIoU.
python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_20k_voc12aug.py \checkpoints/pspnet_r50-d8_512x1024_20k_voc12aug_20200605_003338-c57ef100.pth \--eval mAP
  • Test PSPNet with 4 GPUs, and evaluate the standard mIoU and cityscapes metric.
./tools/dist_test.sh configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \4 --out results.pkl --eval mIoU cityscapes

Note: There is some gap (~0.1%) between cityscapes mIoU and our mIoU. The reason is that cityscapes average each class with class size by default. We use the simple version without average for all datasets.

  • Test PSPNet on cityscapes test split with 4 GPUs, and generate the png files to be submit to the official evaluation server.

First, add following to config file configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py,

data = dict(test=dict(img_dir='leftImg8bit/test',ann_dir='gtFine/test'))

Then run test.

./tools/dist_test.sh configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \4 --format-only --eval-options "imgfile_prefix=./pspnet_test_results"

You will get png files under ./pspnet_test_results directory. You may run zip -r results.zip pspnet_test_results/ and submit the zip file to evaluation server.

4.4 demo

python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${DEVICE_NAME}] [--palette-thr ${PALETTE}]
# example
python demo/image_demo.py demo/demo.jpg configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth --device cuda:0 --palette cityscapes

注:如果用相同的config训练多次,可以通过改config/hr.py的名字来把结果放到不同的 work-dir 中。

五、训练技巧

5.1 不同类别的 loss 权重设置

/configs/_base_/models/ocrnet_hr18.py

【语义分割】3、用mmsegmentation训练自己的分割数据集相关推荐

  1. MMSegmentation训练自己的分割数据集

    首先确保在服务器上正常安装了MMSeg,注意安装完还需建立与自己的数据集之间的软连接,官方安装教程如下: https://github.com/open-mmlab/mmsegmentation/bl ...

  2. 图像语义分割python_图像语义分割 —利用Deeplab v3+训练VOC2012数据集

    原标题:图像语义分割 -利用Deeplab v3+训练VOC2012数据集 前言: 配置:windows10 + Tensorflow1.6.0 + Python3.6.4(笔记本无GPU) 源码: ...

  3. mmsegmentation 训练自制数据集全过程

    1.简介 mmsegmentation是目前比较全面和好用的用于分割模型的平台,原始的github链接 https://github.com/open-mmlab/mmsegmentation 2.G ...

  4. mmsegmentation 训练自己的数据集

    open-mmlab有许多非常实用的框架,其中目标检测的话mmdetection确实很实用.但语义分割的话当属mmsegmentation,这篇博客就是介绍如何用mmsegmentation训练自己的 ...

  5. AI深度学习入门与实战19 语义分割:打造简单高效的人像分割模型

    上一讲,我向你介绍了语义分割的原理.在理解上一课时中 U-Net 语义分割网络的基础上,这一讲,让我们来实际构建一个人像分割模型吧. 语义分割的评估 我们先简单回顾一下语义分割的目的:把一张图中的每一 ...

  6. pascal行人voc_在一个很小的Pascal VOC数据集上训练一个实例分割模型

    只使用1349张图像训练Mask-RCNN,有代码. 代码:https://github.com/kayoyin/tiny-inst-segmentation 介绍 计算机视觉的进步带来了许多有前途的 ...

  7. MMSegmentation 训练测试全流程

    MMSegmentation 训练测试全流程 1.按照执行顺序的流程梳理 Level 0: 运行 Shell 命令: Level 1: 在 tools/train.py 内: Level 2: 转进到 ...

  8. 有空就学学的实例分割1——Tensorflow2搭建Mask R-CNN实例分割平台

    有空就学学的实例分割1--Tensorflow2搭建Mask R-CNN实例分割平台 学习前言 什么是Mask R-CNN 源码下载 Mask R-CNN实现思路 一.预测部分 1.主干网络介绍 2. ...

  9. 荟聚NeurIPS顶会模型、智能标注10倍速神器、人像分割SOTA方案、3D医疗影像分割利器,PaddleSeg重磅升级!

    导读 图像分割是计算机视觉三大任务之一,基于深度学习的图像分割技术也发挥日益重要的作用,广泛应用于智慧医疗.工业质检.自动驾驶.遥感.智能办公等行业. 然而在实际业务中,图像分割依旧面临诸多挑战,比如 ...

  10. DL之Panoptic Segmentation:Panoptic Segmentation(全景分割)的简介(论文介绍)、全景分割挑战简介、案例应用等配图集合之详细攻略

    DL之Panoptic Segmentation:Panoptic Segmentation(全景分割)的简介(论文介绍).全景分割挑战简介.案例应用等配图集合之详细攻略 目录 Panoptic Se ...

最新文章

  1. 用js写一个模板引擎
  2. BZOJ-2748: [HAOI2012]音量调节 (傻逼背包DP)
  3. 三菱plc 与 计算机 通讯,PC与三菱PLC之间的RS232通讯协议
  4. 个人开发者 android内购,【开发者账号】关于内购,协议税务的一些坑
  5. Tomcat 配置文件
  6. 边界信任模型,零信任模型
  7. jq点击更多收起效果
  8. 在shell或bash执行一个bin文件或者脚本的流程
  9. H5唤起APP客户端
  10. 服务器系统开启telnet,Telnet怎么打开 Win7/Win8系统开启Telnet服务方法图解
  11. Photoshop从入门到发疯(一)身份证添加水印
  12. jsp190ssm健身俱乐部会员管理系统
  13. 关于本号,你想看的都在这里
  14. python动画篮球大小_篮球比赛动画直播数据api接口示例
  15. 逻辑运算符及其优先级,C语言逻辑运算符及其优先级详解
  16. 毛利率、净利率和成本利润率的区别是什么 ?
  17. hosts文件如何修改?已解决
  18. 无人驾驶汽车系统入门(十七)——无人驾驶系统基本框架
  19. 最近在逛知乎的时候发现一个有趣的问题:公司规定所有接口都用 post 请求,这是为什么?
  20. pandas--数据预处理

热门文章

  1. js函数传参传入对象
  2. Word如何使用预设样式、自定样式以及生成自动目录教程
  3. 用VBA破解Excel密码
  4. mysql中的分隔符有哪些_MySQL中的分隔符
  5. Excel--单元格格式设置
  6. “学习方法”学习笔记(一)费曼技巧
  7. SpringMVCfrom:form表单标签和input表单标签简介
  8. 利用F12下载网页高清图像
  9. python报错Statements must be separated by newlines or semicolons解决方法
  10. 硬改TP-WR886N v5 路由器刷入源码编译的openWRT/LEDE系统