This post walks step by step through building your own dataset, training a semantic-segmentation model with the SegNet network, and testing the resulting segmentation model.

----------------------------------------------------------------------

Acknowledgements:

1. Source code: the cuDNN 5 build of caffe-segnet, https://github.com/TimoSaemann/caffe-segnet-cudnn5

2. The authors' official tutorial: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

3. A usage guide: https://github.com/TqDavid/SegNet-Tutorial/tree/master/CamVid

4. To try SegNet out first, the official online demo: http://mi.eng.cam.ac.uk/projects/segnet/#demo

5. A pretrained SegNet model:

https://github.com/alexgkendall/SegNet-Tutorial/blob/master/Example_Models/segnet_model_zoo.md, together with the corresponding
segnet_basic_inference.prototxt
segnet_basic_solver.prototxt
segnet_basic_train.prototxt

6. This blog post was also consulted: https://blog.csdn.net/caicai2526/article/details/77170223

7. As was this one: https://blog.csdn.net/hjxu2016/article/details/77994925

8. If you already have masks and images that need resizing, see https://blog.csdn.net/u013249853/article/details/79827469

----------------------------------------------------------------------

Now, down to business.

Download the caffe-segnet-cudnn5 code linked above. Its path is referred to as caffe_root below, and all of the following steps take place inside that caffe_root directory.

The overall directory contains the standard Caffe files and folders (screenshot of the listing omitted here).

I. Building your own dataset

SegNet takes two inputs: the original image, which is three-channel, and the mask label marking the target to be segmented. The mask must be a single-channel 8-bit grayscale image in uint8 format, with pixel value 0 for background and pixel value 1 for the target. The two are prepared separately below.

1. The original images are simply your own photos. Note: if you are not changing SegNet's parameters, keep the same image size as the original author for now, i.e. (360, 480) color images (the network generally upsamples by 2x, so stick to sizes that survive the down/upsampling); you can adjust this for your own situation later. If your images differ, resize them. The blogger's own images are grayscale, so there is a grayscale-to-RGB conversion step before saving; the Python function is below.

# testdir is the path to your own dataset; the blogger's images are grayscale, hence the conversion to RGB before saving.
    import os
    import cv2

    def resize_image_batch(testdir, outdir='zht_test_use/'):
        for item in os.listdir(testdir):
            imname = os.path.join(testdir, item)
            im = cv2.imread(imname, 0)                       # read as single-channel grayscale
            im = cv2.resize(im, (480, 360))                  # cv2.resize takes (width, height)
            im_RGB = cv2.cvtColor(im, cv2.COLOR_GRAY2RGB)    # replicate the channel into 3-channel RGB
            print(imname)
            cv2.imwrite(os.path.join(outdir, item), im_RGB)

2. Prepare the masks.

First, the blogger's masks come from labelme; its usage is not covered again here. In short, labelme gives us a JSON annotation file for every image, and running the labelme_json_to_dataset command on it produces an output folder containing the label mask we ultimately need, namely label.png.

The shell script to run labelme_json_to_dataset over the whole dataset is below (a Python alternative follows it):

for item in $(find ./huaweishouji_20170720_train_360x480_try -iname "*.json");
    do
        echo $item;
        labelme_json_to_dataset $item;
    done
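
If you prefer to stay in Python, here is a minimal sketch of the same batch conversion; it only assumes that the labelme_json_to_dataset command is on your PATH, as above:

    import os
    import subprocess

    root = './huaweishouji_20170720_train_360x480_try'
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith('.json'):
                json_path = os.path.join(dirpath, name)
                print(json_path)
                # invoke the labelme CLI exactly as the shell loop above does
                subprocess.call(['labelme_json_to_dataset', json_path])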

Next, convert label.png into a uint8 mask whose background pixels are 0 and whose target pixels are 1. The convert_to_mask code (together with the check_mask helper) is as follows:

    import os
    import cv2
    import numpy as np
    from skimage import io

    def convert_to_mask(traindir, train_write_dir):
        # traindir holds one labelme output folder per image (e.g. "xxxx_json/"),
        # each containing a label.png; write uint8 masks with background 0, target 1
        for item in os.listdir(traindir):
            for atom in os.listdir(os.path.join(traindir, item)):
                if atom == "label.png":
                    imname = os.path.join(traindir, item, atom)
                    im = io.imread(imname, 1)        # read as grayscale float in [0, 1]
                    img = (im * 255).astype(np.uint8)
                    # every non-zero pixel becomes 1 (target); zero stays 0 (background)
                    _, im_th = cv2.threshold(img, 0, 1, cv2.THRESH_BINARY)
                    # item is "xxxx_json"; strip the "_json" suffix for the mask filename
                    cv2.imwrite(os.path.join(train_write_dir, item[:-5] + '.png'), im_th)

    def check_mask(train_write_dir):
        # verify that each saved mask is uint8 and contains only the labels 0 and 1
        for item in os.listdir(train_write_dir):
            mask = cv2.imread(os.path.join(train_write_dir, item), 0)
            print(item)
            assert mask.dtype == np.uint8
            assert set(np.unique(mask).tolist()) <= {0, 1}

check_mask() is the verification function: it confirms that background pixels are 0, target pixels are 1, and the encoding is uint8.

Split the original images and masks obtained above into a training set and a test set, placed under caffe_root/data/mydata/train and caffe_root/data/mydata/test. The structure under mydata/ is:

    mydata/
        train/
            image/
            mask/
        test/
            image/
            mask/

image/ and mask/ contain the images and masks produced above. Make sure each image and its mask share the same filename.

3. Prepare train.txt and test.txt under mydata/. Each is a text list whose left column is the path to an original image and whose right column is the path to its mask, separated by a single space. Take great care that each mask path really corresponds to its image path. The list looks like this:

/home/hjxu/caffe_examples/segnet_xu/data/test/image/8900_11800.tiff /home/hjxu/caffe_examples/segnet_xu/data/test/mask/8500_10200_ConfidenceMap.png
/home/hjxu/caffe_examples/segnet_xu/data/test/image/10100_9800.tiff /home/hjxu/caffe_examples/segnet_xu/data/test/mask/8900_11800_ConfidenceMap.png
/home/hjxu/caffe_examples/segnet_xu/data/test/image/8900_9000.tiff /home/hjxu/caffe_examples/segnet_xu/data/test/mask/9300_10200_ConfidenceMap.png
/home/hjxu/caffe_examples/segnet_xu/data/test/image/8900_10200.tiff /home/hjxu/caffe_examples/segnet_xu/data/test/mask/8900_9000_ConfidenceMap.png

The shell script:

#!/usr/bin/env sh
    DATA_train=/home/ccf/CCF/Cell_segnet/data/data_train_enhancement/train/image
    MASK_train=/home/ccf/CCF/Cell_segnet/data/data_train_enhancement/train/mask
    DATA_test=/home/ccf/CCF/Cell_segnet/data/data_train_enhancement/test/image
    MASK_test=/home/ccf/CCF/Cell_segnet/data/data_train_enhancement/test/mask
    MY=/home/ccf/CCF/Cell_segnet/data/data_train_enhancement

    ################################################
    rm -rf $MY/train.txt

    echo "Create train.txt"
    # sort both listings so each image lines up with its identically named mask
    find $DATA_train/ -name "*.tif" | sort >> $MY/img.txt
    find $MASK_train/ -name "*.tif" | sort >> $MY/mask.txt
    paste -d " " $MY/img.txt $MY/mask.txt > $MY/train.txt

    rm -rf $MY/img.txt
    rm -rf $MY/mask.txt

    ##################################################
    rm -rf $MY/test.txt

    echo "Create test.txt"
    find $DATA_test/ -name "*.tif" | sort >> $MY/img.txt
    find $MASK_test/ -name "*.tif" | sort >> $MY/mask.txt
    paste -d " " $MY/img.txt $MY/mask.txt > $MY/test.txt

    rm -rf $MY/img.txt
    rm -rf $MY/mask.txt

Adjust the paths (and the *.tif extension patterns) to suit your own data; a filename-matched Python alternative follows.
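
Here is a minimal Python sketch that builds the lists by matching filename stems, so an image can never be paired with the wrong mask (make_pair_list is a name chosen here; it assumes, as required above, that an image and its mask share the same filename stem):

    import os

    def make_pair_list(image_dir, mask_dir, out_txt):
        # map each mask's filename stem to its full filename
        masks = {os.path.splitext(f)[0]: f for f in os.listdir(mask_dir)}
        with open(out_txt, 'w') as f:
            for img in sorted(os.listdir(image_dir)):
                stem = os.path.splitext(img)[0]
                if stem in masks:   # emit "image_path mask_path" only for true pairs
                    f.write(os.path.join(image_dir, img) + ' ' +
                            os.path.join(mask_dir, masks[stem]) + '\n')

    # make_pair_list('data/mydata/train/image', 'data/mydata/train/mask', 'data/mydata/train.txt')
    # make_pair_list('data/mydata/test/image',  'data/mydata/test/mask',  'data/mydata/test.txt')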

II. Training

Before training, adapt the network to the number of classes you want to segment. SegNet originally handles 11 classes, whereas the blogger has only two, so the network must be modified; the input size stays at 360*480 as in the original, and the rest of the network can be tuned to your own needs and results. Change the output layer's num_output to 2, and keep only two class_weighting entries. Pay particular attention to the upsample parameters and to the final class_weighting values: there is one class_weighting entry per output class, and each value is determined from the frequency of that label in the training masks. The calculation is (all_label/class)/label, i.e. (total pixel count / number of classes) divided by the pixel count of that label. The MATLAB code to compute this is below:

clear;
    clc;
    Path = 'C:\\Users\\xxxxx\\Desktop\\mask\\';
    % save_mask_path='/home/ccf/CCF/Cell_segnet/data/mask_change/'
    files = dir(Path);

    % count, image by image, how many pixels carry each label (0 = background, 1 = target)
    for k = 3:length(files)              % entries 1 and 2 are '.' and '..'
        k
        subpath = [Path, files(k).name];
        name = files(k).name;
        I = imread(subpath);
        img = uint8(I == 1);             % vectorized replacement for the original pixel-copy loops (labels are 0 and 1)
    %     imwrite(img,[save_mask_path,name]);
        label_num = double(unique(img));
        element(:,1) = [0; 1];
        if (length(element(:,1)) == length(label_num))
            element(:,1) = label_num;
        end
        for j = 1:length(label_num)
            a = label_num(j);
            e = length(find(img == a));
            element(j, k-1) = e;         % bug fix: the original indexed this column with i-1
        end
    end
    num = element(:, 2:end);
    sum_num = sum(num, 2);                        % total pixel count per class over the whole set
    median = sum(sum_num) / length(sum_num);      % mean class frequency
    class_weighting = median ./ sum_num;          % weight_c = mean_freq / freq_c
    total = [element(:,1), class_weighting];
    save('class_weight.mat', 'total');

Finally, read off the values saved in class_weighting. The blogger has 2 classes, so the resulting class_weighting holds two values; set ignore_label: 2 and num_output: 2. Done.
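
If you would rather compute the weights in Python than MATLAB, here is a minimal sketch of the same calculation over the uint8 masks from Part I (compute_class_weights is a name chosen here):

    import os
    import cv2
    import numpy as np

    def compute_class_weights(mask_dir, num_classes=2):
        # count the pixels of each label over the whole training set
        counts = np.zeros(num_classes, dtype=np.int64)
        for name in os.listdir(mask_dir):
            mask = cv2.imread(os.path.join(mask_dir, name), 0)
            for c in range(num_classes):
                counts[c] += np.sum(mask == c)
        # weight_c = (total pixels / number of classes) / pixels of class c
        return counts.sum() / float(num_classes) / counts

    # e.g. print(compute_class_weights('data/mydata/train/mask/'))
    # paste the resulting values into the class_weighting entries of the loss layer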

What training the model needs:

A pretrained model:

SegNet-Basic model file: segnet_basic_camvid.prototxt, weights: http://mi.eng.cam.ac.uk/~agk34/resources/SegNet/segnet_basic_camvid.caffemodel

solver.prototxt is shown below; train.prototxt is not included here because it is rather long.

net: "examples/segnet/segnet_train.prototxt"          # Change this to the absolute path to your model file
    test_initialization: false
    test_iter: 1
    test_interval: 10000000
    base_lr: 0.01 #0.1
    lr_policy: "step"
    gamma: 1.0
    stepsize: 50 #10000000
    display: 20
    momentum: 0.9
    max_iter: 50000
    weight_decay: 0.0005
    snapshot: 50#1000
    snapshot_prefix: "examples/segnet/segnet_train/segnet_basic/seg"      # Change this to the absolute path to where you wish to output solver snapshots
    solver_mode: GPU

#!/bin/sh
    ./build/tools/caffe train -gpu 2,3 -solver examples/segnet/segnet_solver.prototxt -weights examples/segnet/segnet_train/premodel/segnet_basic_camvid.caffemodel    # This will begin training SegNet-Basic on GPUs 2 and 3

That is the training script. Done.

III. Testing

1. Generate the inference model with merged BN statistics. The script:

#!/bin/sh
    echo "------------------ generating bn statistics: begin -----------------------------"
    python generate_bn_statistics.py examples/segnet/segnet_train.prototxt examples/segnet/segnet_train/segnet_basic/seg_iter_15700.caffemodel models/inference    # compute BN statistics for SegNet
    echo "------------------ generating bn statistics: end -----------------------------"

generate_bn_statistics.py is as follows:

#-*-coding:utf8-*-
    #!/usr/bin/env python
    import os
    import numpy as np
    from skimage.io import ImageCollection
    from argparse import ArgumentParser

    caffe_root = '/data/xxxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/'             # Change this to the absolute directory of SegNet Caffe
    import sys
    sys.path.insert(0, caffe_root + 'python')

    import caffe
    from caffe.proto import caffe_pb2
    from google.protobuf import text_format
     
     
    def extract_dataset(net_message):
        assert net_message.layer[0].type == "DenseImageData"
        source = net_message.layer[0].dense_image_data_param.source
        with open(source) as f:
            data = f.read().split()
        ims = ImageCollection(data[::2])
        labs = ImageCollection(data[1::2])
        assert len(ims) == len(labs) > 0
        return ims, labs
     
     
    def make_testable(train_model_path):
        # load the train net prototxt as a protobuf message
        with open(train_model_path) as f:
            train_str = f.read()
        train_net = caffe_pb2.NetParameter()
        text_format.Merge(train_str, train_net)
     
        # add the mean, var top blobs to all BN layers
        for layer in train_net.layer:
            if layer.type == "BN" and len(layer.top) == 1:
                layer.top.append(layer.top[0] + "-mean")
                layer.top.append(layer.top[0] + "-var")
     
        # remove the test data layer if present
        if train_net.layer[1].name == "data" and train_net.layer[1].include:
            train_net.layer.remove(train_net.layer[1])
            if train_net.layer[0].include:
                # remove the 'include {phase: TRAIN}' layer param
                train_net.layer[0].include.remove(train_net.layer[0].include[0])
        return train_net
     
     
    def make_test_files(testable_net_path, train_weights_path, num_iterations,
                        in_h, in_w):
        # load the train net prototxt as a protobuf message
        with open(testable_net_path) as f:
            testable_str = f.read()
        testable_msg = caffe_pb2.NetParameter()
        text_format.Merge(testable_str, testable_msg)
        
        bn_layers = [l.name for l in testable_msg.layer if l.type == "BN"]
        bn_blobs = [l.top[0] for l in testable_msg.layer if l.type == "BN"]
        bn_means = [l.top[1] for l in testable_msg.layer if l.type == "BN"]
        bn_vars = [l.top[2] for l in testable_msg.layer if l.type == "BN"]
     
        net = caffe.Net(testable_net_path, train_weights_path, caffe.TEST)
        
        # init our blob stores with the first forward pass
        res = net.forward()
        bn_avg_mean = {bn_mean: np.squeeze(res[bn_mean]).copy() for bn_mean in bn_means}
        bn_avg_var = {bn_var: np.squeeze(res[bn_var]).copy() for bn_var in bn_vars}
     
        # iterate over the rest of the training set
        for i in xrange(1, num_iterations):
            res = net.forward()
            for bn_mean in bn_means:
                bn_avg_mean[bn_mean] += np.squeeze(res[bn_mean])
            for bn_var in bn_vars:
                bn_avg_var[bn_var] += np.squeeze(res[bn_var])
            print 'progress: {}/{}'.format(i, num_iterations)
     
        # compute average means and vars
        for bn_mean in bn_means:
            bn_avg_mean[bn_mean] /= num_iterations
        for bn_var in bn_vars:
            bn_avg_var[bn_var] /= num_iterations
     
        for bn_blob, bn_var in zip(bn_blobs, bn_vars):
            m = np.prod(net.blobs[bn_blob].data.shape) / np.prod(bn_avg_var[bn_var].shape)
            bn_avg_var[bn_var] *= (m / (m - 1))
     
        # calculate the new scale and shift blobs for all the BN layers
        scale_data = {bn_layer: np.squeeze(net.params[bn_layer][0].data)
                      for bn_layer in bn_layers}
        shift_data = {bn_layer: np.squeeze(net.params[bn_layer][1].data)
                      for bn_layer in bn_layers}
     
        var_eps = 1e-9
        new_scale_data = {}
        new_shift_data = {}
        for bn_layer, bn_mean, bn_var in zip(bn_layers, bn_means, bn_vars):
            gamma = scale_data[bn_layer]
            beta = shift_data[bn_layer]
            Ex = bn_avg_mean[bn_mean]
            Varx = bn_avg_var[bn_var]
            new_gamma = gamma / np.sqrt(Varx + var_eps)
            new_beta = beta - (gamma * Ex / np.sqrt(Varx + var_eps))
     
            new_scale_data[bn_layer] = new_gamma
            new_shift_data[bn_layer] = new_beta
        print "New data:"
        print new_scale_data.keys()
        print new_shift_data.keys()
     
        # assign computed new scale and shift values to net.params
        for bn_layer in bn_layers:
            net.params[bn_layer][0].data[...] = new_scale_data[bn_layer].reshape(
                net.params[bn_layer][0].data.shape
            )
            net.params[bn_layer][1].data[...] = new_shift_data[bn_layer].reshape(
                net.params[bn_layer][1].data.shape
            )
            
        # build a test net prototxt
        test_msg = testable_msg
        # replace data layers with 'input' net param
        data_layers = [l for l in test_msg.layer if l.type.endswith("Data")]
        for data_layer in data_layers:
            test_msg.layer.remove(data_layer)
        test_msg.input.append("data")
        test_msg.input_dim.append(1)
        test_msg.input_dim.append(3)
        test_msg.input_dim.append(in_h)
        test_msg.input_dim.append(in_w)
        # Set BN layers to INFERENCE so they use the new stat blobs
        # and remove mean, var top blobs.
        for l in test_msg.layer:
            if l.type == "BN":
                if len(l.top) > 1:
                    dead_tops = l.top[1:]
                    for dl in dead_tops:
                        l.top.remove(dl)
                l.bn_param.bn_mode = caffe_pb2.BNParameter.INFERENCE
        # replace output loss, accuracy layers with a softmax
        dead_outputs = [l for l in test_msg.layer if l.type in ["SoftmaxWithLoss", "Accuracy"]]
        out_bottom = dead_outputs[0].bottom[0]
        for dead in dead_outputs:
            test_msg.layer.remove(dead)
        test_msg.layer.add(
            name="prob", type="Softmax", bottom=[out_bottom], top=['prob']
        )
        return net, test_msg
     
     
    def make_parser():
        p = ArgumentParser()
        p.add_argument('train_model')
        p.add_argument('weights')
        p.add_argument('out_dir')
        return p
     
     
    if __name__ == '__main__':
        caffe.set_mode_gpu()
        p = make_parser()
        args = p.parse_args()
     
        # build and save testable net
        if not os.path.exists(args.out_dir):
            os.makedirs(args.out_dir)
        print "Building BN calc net..."
        testable_msg = make_testable(args.train_model)
        BN_calc_path = os.path.join(
            args.out_dir, '__for_calculating_BN_stats_' + os.path.basename(args.train_model)
        )
        with open(BN_calc_path, 'w') as f:
            f.write(text_format.MessageToString(testable_msg))
     
        # use testable net to calculate BN layer stats
        print "Calculate BN stats..."
        train_ims, train_labs = extract_dataset(testable_msg)
        train_size = len(train_ims)
        minibatch_size = testable_msg.layer[0].dense_image_data_param.batch_size
        # ceil(train_size / minibatch_size); the original "+ train_size % minibatch_size" over-counted iterations
        num_iterations = (train_size + minibatch_size - 1) // minibatch_size
        in_h, in_w = (360, 480)   # remember to change this to match your own image size
        test_net, test_msg = make_test_files(BN_calc_path, args.weights, num_iterations,
                                             in_h, in_w)
        
        # save deploy prototxt
        #print "Saving deployment prototext file..."
        #test_path = os.path.join(args.out_dir, "deploy.prototxt")
        #with open(test_path, 'w') as f:
        #    f.write(text_format.MessageToString(test_msg))
        
        print "Saving test net weights..."
        test_net.save(os.path.join(args.out_dir, "test_weights_15750.caffemodel"))   # remember to rename this to match your iteration count
        print "done"

2. Generate the prediction images.

The script:

#!/bin/sh
    echo "------------------- test segmentation: begin ---------------------"
    python test_segmentation.py --model models/inference/segmentation_inference.prototxt --weights models/inference/test_weights_15750.caffemodel --iter 26 #12250 #15750  # Test SegNet
    echo "------------------- test segmentation: end ---------------------"

test_segmentation.py

#-*-coding=utf8-*-
    import numpy as np
    import matplotlib.pyplot as plt
    import os.path
    import scipy
    import argparse
    import cv2

    caffe_root = '/data/xxxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/'             # Change this to the absolute directory of SegNet Caffe
    import sys
    sys.path.insert(0, caffe_root + 'python')

    import caffe
     
    import caffe
     
    # Import arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', type=str, required=True)
    parser.add_argument('--weights', type=str, required=True)
    parser.add_argument('--iter', type=int, required=True)
    args = parser.parse_args()
     
    caffe.set_mode_gpu()
     
    net = caffe.Net(args.model,
                    args.weights,
                    caffe.TEST)
     
     
    for i in range(0, args.iter):
     
        net.forward()
        print(i)
        image = net.blobs['data'].data
        label = net.blobs['label'].data
        predicted = net.blobs['prob'].data    # predicted: float32

        image = np.squeeze(image[0,:,:,:])
        label = np.squeeze(label[0,:,:,:])    # squeeze label to (H, W) so the palette masking below broadcasts correctly
        output = np.squeeze(predicted[0,:,:,:])
        ind = np.argmax(output, axis=0)
        cv2.imwrite(str(i%26) + "predicted.png", ind * 100)    # scale by 100 so the two labels are visible in the saved png
        r = ind.copy()
        g = ind.copy()
        b = ind.copy()
        r_gt = label.copy()
        g_gt = label.copy()
        b_gt = label.copy()
        # two-class palette: background black (BG), target green (M)
        # (the 12-class CamVid palette from the original SegNet demo is not needed here)
        BG = [0,0,0]
        M = [0,255,0]
        label_colours = np.array([BG, M])
        for l in range(0,2):
            r[ind==l] = label_colours[l,0]
            g[ind==l] = label_colours[l,1]
            b[ind==l] = label_colours[l,2]
            r_gt[label==l] = label_colours[l,0]
            g_gt[label==l] = label_colours[l,1]
            b_gt[label==l] = label_colours[l,2]
        # we do not normalize
        rgb = np.zeros((ind.shape[0], ind.shape[1], 3))
        rgb[:,:,0] = r
        rgb[:,:,1] = g
        rgb[:,:,2] = b
        rgb_gt = np.zeros((ind.shape[0], ind.shape[1], 3))
        rgb_gt[:,:,0] = r_gt
        rgb_gt[:,:,1] = g_gt
        rgb_gt[:,:,2] = b_gt
     
        image = np.transpose(image, (1,2,0))
        output = np.transpose(output, (1,2,0))
        image = image[:,:,(2,1,0)]
     
     
        #scipy.misc.toimage(rgb, cmin=0.0, cmax=255).save(IMAGE_FILE+'_segnet.png')    # save to file
     
        cv2.imwrite(str(i%26)+'image.png', image.astype(np.uint8))
        cv2.imwrite(str(i%26)+'rgb_gt.png', rgb_gt.astype(np.uint8))
        cv2.imwrite(str(i%26)+'rgb.png', rgb.astype(np.uint8))
     
        
        #plt.figure()
        #plt.imshow(image,vmin=0, vmax=1)  #显示源文件
        #plt.figure()
        #plt.imshow(rgb_gt,vmin=0, vmax=1) #给的mask图片,如果测试的图片没有mask,可以随便放个图片列表,省的修改代码
        #plt.figure()
        #plt.imshow(rgb,vmin=0, vmax=1) # 预测图片
        #plt.show()
     
     
    print('Success!')

IV. Computing the mean IoU metric

#!/usr/bin/python
    from __future__ import division   # the metric ratios below must not truncate to zero under Python 2

    import numpy as np
    from skimage import io
    import cv2
     
    def pixel_accuracy(eval_segm, gt_segm):
        '''
        sum_i(n_ii) / sum_i(t_i)
        '''
     
        check_size(eval_segm, gt_segm)
     
        cl, n_cl = extract_classes(gt_segm)
        eval_mask, gt_mask = extract_both_masks(eval_segm, gt_segm, cl, n_cl)
     
        sum_n_ii = 0
        sum_t_i  = 0
     
        for i, c in enumerate(cl):
            curr_eval_mask = eval_mask[i, :, :]
            curr_gt_mask = gt_mask[i, :, :]
     
            sum_n_ii += np.sum(np.logical_and(curr_eval_mask, curr_gt_mask))
            sum_t_i  += np.sum(curr_gt_mask)
     
        if (sum_t_i == 0):
            pixel_accuracy_ = 0
        else:
            pixel_accuracy_ = sum_n_ii / sum_t_i
     
        return pixel_accuracy_
     
    def mean_accuracy(eval_segm, gt_segm):
        '''
        (1/n_cl) sum_i(n_ii/t_i)
        '''
     
        check_size(eval_segm, gt_segm)
     
        cl, n_cl = extract_classes(gt_segm)
        eval_mask, gt_mask = extract_both_masks(eval_segm, gt_segm, cl, n_cl)
     
        accuracy = list([0]) * n_cl
     
        for i, c in enumerate(cl):
            curr_eval_mask = eval_mask[i, :, :]
            curr_gt_mask = gt_mask[i, :, :]
     
            n_ii = np.sum(np.logical_and(curr_eval_mask, curr_gt_mask))
            t_i  = np.sum(curr_gt_mask)
     
            if (t_i != 0):
                accuracy[i] = n_ii / t_i
     
        mean_accuracy_ = np.mean(accuracy)
        return mean_accuracy_
     
    def mean_IU(eval_segm, gt_segm):
        '''
        (1/n_cl) * sum_i(n_ii / (t_i + sum_j(n_ji) - n_ii))
        '''
     
        check_size(eval_segm, gt_segm)
     
        cl, n_cl   = union_classes(eval_segm, gt_segm)
        _, n_cl_gt = extract_classes(gt_segm)
        eval_mask, gt_mask = extract_both_masks(eval_segm, gt_segm, cl, n_cl)
     
        IU = list([0]) * n_cl
     
        for i, c in enumerate(cl):
            curr_eval_mask = eval_mask[i, :, :]
            curr_gt_mask = gt_mask[i, :, :]
     
            if (np.sum(curr_eval_mask) == 0) or (np.sum(curr_gt_mask) == 0):
                continue
     
            n_ii = np.sum(np.logical_and(curr_eval_mask, curr_gt_mask))
            t_i  = np.sum(curr_gt_mask)
            n_ij = np.sum(curr_eval_mask)
     
            IU[i] = n_ii / (t_i + n_ij - n_ii)
     
        mean_IU_ = np.sum(IU) / n_cl_gt
        return mean_IU_
     
    def frequency_weighted_IU(eval_segm, gt_segm):
        '''
        sum_k(t_k)^(-1) * sum_i((t_i*n_ii)/(t_i + sum_j(n_ji) - n_ii))
        '''
     
        check_size(eval_segm, gt_segm)
     
        cl, n_cl = union_classes(eval_segm, gt_segm)
        eval_mask, gt_mask = extract_both_masks(eval_segm, gt_segm, cl, n_cl)
     
        frequency_weighted_IU_ = list([0]) * n_cl
     
        for i, c in enumerate(cl):
            curr_eval_mask = eval_mask[i, :, :]
            curr_gt_mask = gt_mask[i, :, :]
     
            if (np.sum(curr_eval_mask) == 0) or (np.sum(curr_gt_mask) == 0):
                continue
     
            n_ii = np.sum(np.logical_and(curr_eval_mask, curr_gt_mask))
            t_i  = np.sum(curr_gt_mask)
            n_ij = np.sum(curr_eval_mask)
     
            frequency_weighted_IU_[i] = (t_i * n_ii) / (t_i + n_ij - n_ii)
     
        sum_k_t_k = get_pixel_area(eval_segm)
        
        frequency_weighted_IU_ = np.sum(frequency_weighted_IU_) / sum_k_t_k
        return frequency_weighted_IU_
     
    '''
    Auxiliary functions used during evaluation.
    '''
    def get_pixel_area(segm):
        return segm.shape[0] * segm.shape[1]
     
    def extract_both_masks(eval_segm, gt_segm, cl, n_cl):
        eval_mask = extract_masks(eval_segm, cl, n_cl)
        gt_mask   = extract_masks(gt_segm, cl, n_cl)
     
        return eval_mask, gt_mask
     
    def extract_classes(segm):
        cl = np.unique(segm)
        n_cl = len(cl)
     
        return cl, n_cl
     
    def union_classes(eval_segm, gt_segm):
        eval_cl, _ = extract_classes(eval_segm)
        gt_cl, _   = extract_classes(gt_segm)
     
        cl = np.union1d(eval_cl, gt_cl)
        n_cl = len(cl)
     
        return cl, n_cl
     
    def extract_masks(segm, cl, n_cl):
        h, w  = segm_size(segm)
        masks = np.zeros((n_cl, h, w))
     
        for i, c in enumerate(cl):
            masks[i, :, :] = segm == c
     
        return masks
     
    def segm_size(segm):
        try:
            height = segm.shape[0]
            width  = segm.shape[1]
        except IndexError:
            raise
     
        return height, width
     
    def check_size(eval_segm, gt_segm):
        h_e, w_e = segm_size(eval_segm)
        h_g, w_g = segm_size(gt_segm)
     
        if (h_e != h_g) or (w_e != w_g):
            raise EvalSegErr("DiffDim: Different dimensions of matrices!")
     
    '''
    Exceptions
    '''
    class EvalSegErr(Exception):
        def __init__(self, value):
            self.value = value
     
        def __str__(self):
            return repr(self.value)
    ###############now we do some eval.
    # test image only 1 image
    def eval_segm(preddir, gtdir):
        # binarize the prediction and the ground truth: any non-zero pixel becomes label 1
        pred = io.imread(preddir, 1).astype(np.uint8)
        _, pred_th = cv2.threshold(pred, 0, 1, cv2.THRESH_BINARY)
        gt = io.imread(gtdir, 1).astype(np.uint8)
        _, gt_th = cv2.threshold(gt, 0, 1, cv2.THRESH_BINARY)
        
        pixel_accu = pixel_accuracy(pred_th, gt_th)
        mean_accu = mean_accuracy(pred_th, gt_th)
        mean_iou = mean_IU(pred_th, gt_th)
        fw_iou = frequency_weighted_IU(pred_th, gt_th)
        print("pixel_accu is: ", pixel_accu)
        print("mean_accu is: ", mean_accu)
        print("mean_iou is: ",mean_iou)
        print("fw_iou is: ", fw_iou)
        return pixel_accu, mean_accu, mean_iou, fw_iou
    # test batch image
    def eval_batch(rootdir):
        res_sum = []
        pixel_accu = 0.0
        mean_accu = 0.0
        mean_iou = 0.0
        fw_iou = 0.0
        
        for i in range(16):
            preddir = rootdir + str(i)+"predicted.png"
            gtdir = rootdir + str(i) + "rgb_gt.png"
            print("===============%d==================", i)
            resperimage = eval_segm(preddir, gtdir)
            res_sum.append(resperimage)
        # compute avg eval metrics    
        print("==================avg eval seg=========================")
        len_res_sum = len(res_sum)
        for i in range(len_res_sum):
            pixel_accu += res_sum[i][0]
            mean_accu += res_sum[i][1]
            mean_iou += res_sum[i][2]
            fw_iou += res_sum[i][3]
        print("avg pixel_accu : ", pixel_accu / len_res_sum, "avg mean_accu : ", mean_accu / len_res_sum,\
        "avg mean_iou : ", mean_iou / len_res_sum, "avg fw_iou : ", fw_iou/len_res_sum)
        
    # draw the contours of the target (huizibiao) on the original image
    def get_contour(imagedir, preddir):
        pred = io.imread(preddir, 1)
        pred = pred.astype(np.uint8)
        image = io.imread(imagedir, 1)            # comes back as float64
        image = (image * 255).astype(np.uint8)
        _, pred_th = cv2.threshold(pred, 0, 1, cv2.THRESH_BINARY)
        # note: under OpenCV 3.x findContours returns (im, contours, hierarchy); unpack accordingly
        contours, _ = cv2.findContours(pred_th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

        pred_contours = image
        
        for i in range(len(contours)):
            cv2.drawContours(pred_contours, contours[i], -1, (0, 255, 0), 1)
        
        return pred_contours
    # batch test contours of huizibiao
    def get_contour_batch(rootdir):
        for i in range(16):
            preddir = rootdir + str(i)+"predicted.png"
            imagedir = rootdir + str(i) + "image.png"
            print("=================================", i)
            cv2.imwrite(str(i)+"image_countours.png", get_contour(imagedir, preddir))
            
    if __name__ == "__main__":
        '''
        # test only one image.
        preddir = "/data/xxxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/test_result/3_iter7700/2/predicted.png"
        gtdir = "/data/xxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/test_result/3_iter7700/2/rgb_gt.png"
        eval_segm(preddir, gtdir)
        '''
        
        # test batch image
        #rootdir = "/data/xxxxxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/test_result/116/iter17w/"
        rootdir = "/data/xxxxxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/"
        #eval_batch(rootdir)
        
        #draw contours on the one  image
        #preddir = "/data/xxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/0predicted.png"
        #imagedir = "/data/xxxxx/caffe-segnet-cudnn/caffe-segnet-cudnn5/0image.png"
        #get_contour(imagedir, preddir)
        
        #test batch
        get_contour_batch(rootdir)
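
As a quick sanity check of the metric functions above, here is a tiny hand-checkable example (hypothetical 3x3 masks, not the blogger's data; run it with the functions above in scope):

    import numpy as np

    pred = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [0, 1, 1]], dtype=np.uint8)
    gt = np.array([[0, 0, 0],
                   [0, 1, 1],
                   [0, 1, 1]], dtype=np.uint8)
    # each class has intersection 4 and union 5, so the mean IoU is 4/5
    print(mean_IU(pred, gt))   # 0.8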

The blogger's final mean IoU was 95.19%.

And that's it.
---------------------
Author: 努力努力再努力tq
Source: CSDN
Original: https://blog.csdn.net/u012426298/article/details/81386817
Copyright notice: this is the blogger's original article; please include a link to it when reposting.
