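pix2pixHD Summary

The deeplab part follows "DeepLab source code analysis: deeplab_demo.ipynb", with some modifications to fit this use case.

Paper: pix2pixHD  Code: GitHub

1. Test on the sample data with the downloaded generator (G) network, following the scripts in ./scripts

The datasets folder contains a few example Cityscapes test images.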

#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none

The test results are saved to an HTML file: ./results/label2city_1024p/test_latest/index.html
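The location of index.html appears to follow from the test options listed in the appendix (--results_dir, --name, --phase, --which_epoch). The following is a minimal sketch, assuming the page directory is named <phase>_<which_epoch>. The helper results_index_path is hypothetical and not part of the repository.

```python
import posixpath


def results_index_path(name, results_dir="./results/", phase="test", which_epoch="latest"):
    """Build the expected path of the HTML results page (assumed layout)."""
    # Assumption: results are grouped as <results_dir>/<name>/<phase>_<which_epoch>/
    web_dir = posixpath.join(results_dir, name, "%s_%s" % (phase, which_epoch))
    return posixpath.join(web_dir, "index.html")


print(results_index_path("label2city_1024p"))
```

Passing a different --which_epoch value would change only the last directory component under this assumption.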

2. Train on the sample data, following the scripts in ./scripts

Train a model at 1024x512 resolution:

#!./scripts/train_512p.sh
python train.py --name label2city_512p

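To view the training results, open the intermediate results at ./checkpoints/label2city_512p/web/index.html.

If TensorFlow is installed, adding --tf_log to the training script writes TensorBoard logs to ./checkpoints/label2city_512p/logs.

Change to ./checkpoints/label2city_512p and run tensorboard --logdir=logs to open TensorBoard.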

Training with multiple GPUs

#!./scripts/train_512p_multigpu.sh
python train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7

Set --batchSize equal to the number of GPUs listed in --gpu_ids.
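The rule above can be checked before launching a run. This is a small sketch only: parse_gpu_ids and check_batch_size are hypothetical helpers, and the assumption that negative ids mean CPU comes from the --gpu_ids help text in the appendix.

```python
def parse_gpu_ids(gpu_ids):
    """Turn a --gpu_ids string such as '0,1,2' into a list of ints.

    Negative ids are dropped (the help text says -1 selects the CPU).
    """
    ids = [int(tok) for tok in gpu_ids.split(",") if tok.strip()]
    return [i for i in ids if i >= 0]


def check_batch_size(batch_size, gpu_ids):
    """Return True when the batch size equals the number of listed GPUs."""
    n_gpus = len(parse_gpu_ids(gpu_ids))
    return n_gpus > 0 and batch_size == n_gpus


print(check_batch_size(8, "0,1,2,3,4,5,6,7"))
print(check_batch_size(8, "0,1"))
```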

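3. Building a custom dataset (including semantic segmentation with deeplab)

Code: GitHub

models: Checkpoints and frozen inference graphs.

Download the source code and models from the links above, then run: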

# deeplab_demo_test.py
import os
from io import BytesIO
import tarfile
import tempfile
from six.moves import urllib

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import datetime

import tensorflow as tf

# DeepLabModel and label_to_color_image are defined in deeplab_demo.py
from deeplab_demo import *

LABEL_NAMES = np.asarray([
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
])

FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)

pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_dm05_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_dm05_pascal_trainaug/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_ade20k_train/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_xception_ade20k_train/frozen_inference_graph.pb'

MODEL = DeepLabModel(pb_path)
print('model loaded successfully!')

# global starttime
for num in range(0, 2):
    starttime = datetime.datetime.now()
    # IMAGE_PATH = 'E:/data/img/20190522/img/image%d.jpg' % num
    # OUT_PATH = 'E:/data/img/20190522/seg_img/seg_image%d.png' % num
    IMAGE_PATH = 'E:/data/img/test/img/image%d.jpg' % num
    OUT_PATH = 'E:/data/img/test/seg_map/seg_image%d.png' % num
    # print(IMAGE_PATH)
    path = IMAGE_PATH
    try:
        oringnal_im = Image.open(path)
        print('running deeplab on image %s...' % path)
        # starttime = datetime.datetime.now()
        resized_im, seg_map = MODEL.run(oringnal_im)
    except IOError:
        print('Cannot retrieve image. Please check path: ' + path)
        # Skip this image: seg_map is undefined when loading fails
        continue
    # endtime = datetime.datetime.now()
    # print (endtime - starttime)
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    # im = Image.fromarray(seg_image)
    # Save the raw label ids (not the colorized map) as the output image
    im = Image.fromarray(seg_map.astype(np.uint8))
    im.save(OUT_PATH)

and deeplab_demo.py, the module it imports from:

# -*- coding: utf-8 -*-
"""
DeepLab Demo.ipynb
https://blog.csdn.net/lifengcai_/article/details/80270409
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/github/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb
# DeepLab Demo
This demo will demonstrate the steps to run deeplab semantic segmentation model on sample input images.
"""

#@title Imports
import os
from io import BytesIO
import tarfile
import tempfile
from six.moves import urllib

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import datetime

import tensorflow as tf

#@title Helper methods


class DeepLabModel(object):
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = 'ImageTensor:0'
    OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
    INPUT_SIZE = 512
    FROZEN_GRAPH_NAME = 'frozen_inference_graph'

    def __init__(self, pb_path):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None
        # change1: read frozen_inference_graph.pb directly instead of a tar archive
        with open(pb_path, 'rb') as f:
            graph_def = tf.GraphDef.FromString(f.read())
        if graph_def is None:
            raise RuntimeError('Cannot find inference graph in tar archive.')
        with self.graph.as_default():
            tf.import_graph_def(graph_def, name='')
        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            resized_image: RGB image resized from original input image.
            seg_map: Segmentation map of `resized_image`.
        """
        # Aspect-ratio resize disabled (2019-05-21 09:43:40); a fixed target size is used
        width, height = image.size
        # print(width, height)
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        # print(resize_ratio)
        # target_size = (int(resize_ratio * width), int(resize_ratio * height))
        target_size = (512, 256)
        # print(target_size)
        # Note: Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
        resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
        # resized_image = image.convert('RGB')
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
        seg_map = batch_seg_map[0]
        return resized_image, seg_map


def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
        A Colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)

    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3

    return colormap


def label_to_color_image(label):
    """Adds color defined by the dataset colormap to the label.

    Args:
        label: A 2D array with integer type, storing the segmentation label.

    Returns:
        result: A 2D array with floating type. The element of the array
            is the color indexed by the corresponding element in the input label
            to the PASCAL color map.

    Raises:
        ValueError: If label is not of rank 2 or its value is larger than
            colormap maximum entry.
    """
    if label.ndim != 2:
        raise ValueError('Expect 2-D input label')

    colormap = create_pascal_label_colormap()

    if np.max(label) >= len(colormap):
        raise ValueError('label value too large.')

    return colormap[label]


def vis_segmentation(image, seg_map):
    """Visualizes input image, segmentation map and overlay view."""
    # FULL_COLOR_MAP and LABEL_NAMES must be defined by the importing script
    plt.figure(figsize=(15, 5))
    grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])

    plt.subplot(grid_spec[0])
    plt.imshow(image)
    plt.axis('off')
    plt.title('input image')

    plt.subplot(grid_spec[1])
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    plt.imshow(seg_image)
    plt.axis('off')
    plt.title('segmentation map')

    plt.subplot(grid_spec[2])
    plt.imshow(image)
    plt.imshow(seg_image, alpha=0.7)
    plt.axis('off')
    plt.title('segmentation overlay')

    unique_labels = np.unique(seg_map)
    ax = plt.subplot(grid_spec[3])
    plt.imshow(FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation='nearest')
    ax.yaxis.tick_right()
    plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
    plt.xticks([], [])
    ax.tick_params(width=0.0)
    plt.grid('off')
    # Turn off axis
    plt.show()
    # image.save('C:/image1.png')
    im = Image.fromarray(seg_image)
    im.save('E:/data/img/seg_img/seg_image1.png')


# LABEL_NAMES = np.asarray([
#     'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
#     'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
#     'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
# ])

# FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
# FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)

# # pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# MODEL = DeepLabModel(pb_path)
# print('model loaded successfully!')

"""
Run on sample images
Select one of sample images (leave `IMAGE_URL` empty) or feed any internet image url for inference.
Note that we are using single scale inference in the demo for fast computation, so the results may slightly differ from the visualizations in
[README](https://github.com/tensorflow/models/blob/master/research/deeplab/README.md),
which uses multi-scale and left-right flipped inputs.
"""


def run_visualization(path):
    """Inferences DeepLab model and visualizes result."""
    global starttime
    try:
        oringnal_im = Image.open(path)
        print('running deeplab on image %s...' % path)
        # starttime = datetime.datetime.now()
        resized_im, seg_map = MODEL.run(oringnal_im)
    except IOError:
        print('Cannot retrieve image. Please check path: ' + path)
        return

    vis_segmentation(resized_im, seg_map)


# IMAGE_PATH = 'E:/data/img/img/image126.jpg'
# run_visualization(IMAGE_PATH)
# endtime = datetime.datetime.now()
# print (endtime - starttime)
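The bit manipulation in create_pascal_label_colormap can be hard to follow in vectorized form. The sketch below applies the same loop to a single label id in plain Python, so the spreading of label bits across the R, G and B channels can be traced by hand. The function name pascal_color is introduced here for illustration only.

```python
def pascal_color(label):
    """Return the (R, G, B) color of one label id in the PASCAL VOC colormap."""
    ind = label
    color = [0, 0, 0]
    # Each pass takes the three lowest bits of the label and places them in
    # the current highest free bit of the R, G and B channels respectively.
    for shift in reversed(range(8)):
        for channel in range(3):
            color[channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
    return tuple(color)


for label, name in [(0, 'background'), (1, 'aeroplane'), (15, 'person')]:
    print(label, name, pascal_color(label))
```

Label 15 (person) is binary 1111: the low three bits set bit 7 of every channel, and the remaining bit sets bit 6 of the red channel, giving (192, 128, 128).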

PS: You can also create an instance map to distinguish individual objects of the same class.
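One possible way to derive an instance map from a semantic label map is connected-component labeling, sketched below in plain Python. This is an assumption for illustration, not the method used by the author or by pix2pixHD: make_instance_map is a hypothetical helper, the id encoding class_id * 1000 + index is an arbitrary choice, and touching objects of the same class end up merged into one instance.

```python
from collections import deque


def make_instance_map(label_map, instance_classes):
    """Give each 4-connected region of the chosen classes its own id.

    label_map is a list of rows of integer class ids. Pixels of classes
    outside instance_classes keep their class id unchanged.
    """
    height, width = len(label_map), len(label_map[0])
    inst = [row[:] for row in label_map]
    seen = [[False] * width for _ in range(height)]
    counters = {}  # next instance index per class
    for y in range(height):
        for x in range(width):
            cls = label_map[y][x]
            if seen[y][x] or cls not in instance_classes:
                continue
            index = counters.get(cls, 0)
            counters[cls] = index + 1
            inst_id = cls * 1000 + index  # assumed encoding
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                inst[cy][cx] = inst_id
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and label_map[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return inst


labels = [[15, 0, 15],
          [15, 0, 15]]
print(make_instance_map(labels, {15}))
```

Ids such as 15000 do not fit in an 8-bit image, so an instance map built this way would need a 16-bit or 32-bit image format when saved.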

4. Encoding features: encode_features

Precompute the feature maps and cluster them, producing a .npy file that is read in later steps.
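To make the clustering step concrete, here is a minimal plain-Python k-means sketch over short feature vectors. The option list in the appendix gives a feature length of 3 (--feat_num) and 10 clusters (--n_clusters). That the clustering is k-means style is an assumption, and this sketch neither reproduces encode_features.py nor writes a .npy file.

```python
def kmeans(points, k, iters=10):
    """Plain k-means on a list of equal-length tuples; returns k centroids."""
    # Deterministic initialization for the sketch: the first k points
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the smallest squared distance
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Recompute each centroid as the mean of its group (keep it if the group is empty)
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


features = [(0, 0, 0), (10, 10, 10), (0, 1, 0), (10, 11, 10)]
print(kmeans(features, k=2))
```

Each resulting centroid stands for one typical appearance of a class, which is the kind of value a later step could read back and sample from.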

python encode_features.py --name butel_data20190516_feat_20190523 --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190516_feat_20190523

5. Precomputing feature maps: precompute_feature_maps

Precompute the feature maps and save them.

python precompute_feature_maps.py --name butel_20190522_feat --dataroot ./datasets/train_20190520

6. Training and testing on the custom dataset

Training

python train.py --name butel_data20190522_feat_20190524 --instance_feat --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190522_feat_20190524 --gpu_ids 0,1 --batchSize 2 --tf_log --load_pretrain /home/yangd/work/python/pix2pixHD_yangd/checkpoints/butel_data20190516_feat_20190523 --niter 300 --niter_decay 300
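The --niter and --niter_decay values determine the learning-rate schedule. According to their help strings in the appendix, the learning rate stays at its initial value for niter epochs and then decays linearly to zero over niter_decay epochs. A rough sketch of that schedule follows; learning_rate is a hypothetical helper, and the exact epoch at which the repository starts decaying may differ by one.

```python
def learning_rate(epoch, lr=0.0002, niter=100, niter_decay=100):
    """Learning rate for a 1-based epoch: constant, then linear decay to zero."""
    if epoch <= niter:
        return lr
    # Number of decay steps already taken, capped at the decay length
    decay_steps = min(epoch - niter, niter_decay)
    return lr * (1.0 - decay_steps / float(niter_decay))


# Settings from the training command above: 300 constant + 300 decaying epochs
for epoch in (1, 300, 450, 600):
    print(epoch, learning_rate(epoch, niter=300, niter_decay=300))
```

With these settings the run lasts 600 epochs in total, and the rate has fallen to half its initial value by epoch 450.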

Testing

python test.py --name butel_data20190522_feat_20190524 --instance_feat --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190522_feat_20190524 --use_encoded_image

Commonly used parameters

--name
--gpu_ids
--checkpoints_dir
--batchSize
--label_nc
--dataroot
--tf_log
--no_instance
--instance_feat
--results_dir
--how_many
--use_encoded_image
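As an illustration of how these flags behave, the sketch below rebuilds a small subset of them with argparse, using the types and defaults given in the parameter summary in the appendix. It is a stand-alone approximation rather than the repository's option classes; build_parser and the experiment name demo are made up for the example.

```python
import argparse


def build_parser():
    """Illustrative subset of the options, with defaults taken from the appendix."""
    parser = argparse.ArgumentParser(description="illustrative subset of pix2pixHD options")
    parser.add_argument('--name', type=str, default='label2city')
    parser.add_argument('--gpu_ids', type=str, default='0')
    parser.add_argument('--batchSize', type=int, default=1)
    parser.add_argument('--label_nc', type=int, default=35)
    parser.add_argument('--dataroot', type=str, default='./datasets/cityscapes/')
    # store_true flags are False unless given on the command line
    parser.add_argument('--no_instance', action='store_true')
    parser.add_argument('--instance_feat', action='store_true')
    parser.add_argument('--how_many', type=int, default=50)
    return parser


opt = build_parser().parse_args(['--name', 'demo', '--instance_feat', '--batchSize', '2'])
print(opt.name, opt.batchSize, opt.instance_feat, opt.no_instance, opt.label_nc)
```

Flags such as --instance_feat and --no_instance take no value: their presence alone switches the behavior on.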

Appendix

Parameter summary:

base_options

# experiment specifics
'--name', type=str, default='label2city', help='name of the experiment. It decides where to store samples and models'
'--gpu_ids', type=str, default='0', help='gpu ids: e.g. 0  0,1,2, 0,2. use -1 for CPU'
'--checkpoints_dir', type=str, default='./checkpoints', help='models are saved here'
'--model', type=str, default='pix2pixHD', help='which model to use'
'--norm', type=str, default='instance', help='instance normalization or batch normalization'
'--use_dropout', action='store_true', help='use dropout for the generator'
'--data_type', default=32, type=int, choices=[8, 16, 32], help="Supported data type i.e. 8, 16, 32 bit"
'--verbose', action='store_true', default=False, help='toggles verbose'
'--fp16', action='store_true', default=False, help='train with AMP'
'--local_rank', type=int, default=0, help='local rank for distributed training'

# input/output sizes
'--batchSize', type=int, default=1, help='input batch size'
'--loadSize', type=int, default=1024, help='scale images to this size'
'--fineSize', type=int, default=512, help='then crop to this size'
'--label_nc', type=int, default=35, help='# of input label channels'
'--input_nc', type=int, default=3, help='# of input image channels'
'--output_nc', type=int, default=3, help='# of output image channels'

# for setting inputs
'--dataroot', type=str, default='./datasets/cityscapes/'
'--resize_or_crop', type=str, default='scale_width', help='scaling and cropping of images at load time [resize_and_crop|crop|scale_width|scale_width_and_crop]'
'--serial_batches', action='store_true', help='if true, takes images in order to make batches, otherwise takes them randomly'
'--no_flip', action='store_true', help='if specified, do not flip the images for data argumentation'
'--nThreads', default=2, type=int, help='# threads for loading data'
'--max_dataset_size', type=int, default=float("inf"), help='Maximum number of samples allowed per dataset. If the dataset directory contains more than max_dataset_size, only a subset is loaded.'

# for displays
'--display_winsize', type=int, default=512,  help='display window size'
'--tf_log', action='store_true', help='if specified, use tensorboard logging. Requires tensorflow installed'

# for generator
'--netG', type=str, default='global', help='selects model to use for netG'
'--ngf', type=int, default=64, help='# of gen filters in first conv layer'
'--n_downsample_global', type=int, default=4, help='number of downsampling layers in netG'
'--n_blocks_global', type=int, default=9, help='number of residual blocks in the global generator network'
'--n_blocks_local', type=int, default=3, help='number of residual blocks in the local enhancer network'
'--n_local_enhancers', type=int, default=1, help='number of local enhancers to use'
'--niter_fix_global', type=int, default=0, help='number of epochs that we only train the outmost local enhancer'

# for instance-wise features
'--no_instance', action='store_true', help='if specified, do *not* add instance map as input'
'--instance_feat', action='store_true', help='if specified, add encoded instance features as input'
'--label_feat', action='store_true', help='if specified, add encoded label features as input'
'--feat_num', type=int, default=3, help='vector length for encoded features'
'--load_features', action='store_true', help='if specified, load precomputed feature maps'
'--n_downsample_E', type=int, default=4, help='# of downsampling layers in encoder'
'--nef', type=int, default=16, help='# of encoder filters in the first conv layer'
'--n_clusters', type=int, default=10, help='number of clusters for features'

test_options

'--ntest', type=int, default=float("inf"), help='# of test examples.'
'--results_dir', type=str, default='./results/', help='saves results here.'
'--aspect_ratio', type=float, default=1.0, help='aspect ratio of result images'
'--phase', type=str, default='test', help='train, val, test, etc'
'--which_epoch', type=str, default='latest', help='which epoch to load? set to latest to use latest cached model'
'--how_many', type=int, default=50, help='how many test images to run'
'--cluster_path', type=str, default='features_clustered_010.npy', help='the path for clustered results of encoded features'
'--use_encoded_image', action='store_true', help='if specified, encode the real image to get the feature map'
"--export_onnx", type=str, help="export ONNX model to a given file"
"--engine", type=str, help="run serialized TRT engine"
"--onnx", type=str, help="run ONNX model via TRT"
isTrain = False

train_options

# for displays
'--display_freq', type=int, default=100, help='frequency of showing training results on screen'
'--print_freq', type=int, default=100, help='frequency of showing training results on console'
'--save_latest_freq', type=int, default=1000, help='frequency of saving the latest results'
'--save_epoch_freq', type=int, default=10, help='frequency of saving checkpoints at the end of epochs'
'--no_html', action='store_true', help='do not save intermediate training results to [opt.checkpoints_dir]/[opt.name]/web/'
'--debug', action='store_true', help='only do one epoch and displays at each iteration'

# for training
'--continue_train', action='store_true', help='continue training: load the latest model'
'--load_pretrain', type=str, default='', help='load the pretrained model from the specified location'
'--which_epoch', type=str, default='latest', help='which epoch to load? set to latest to use latest cached model'
'--phase', type=str, default='train', help='train, val, test, etc'
'--niter', type=int, default=100, help='# of iter at starting learning rate'
'--niter_decay', type=int, default=100, help='# of iter to linearly decay learning rate to zero'
'--beta1', type=float, default=0.5, help='momentum term of adam'
'--lr', type=float, default=0.0002, help='initial learning rate for adam'

# for discriminators
'--num_D', type=int, default=2, help='number of discriminators to use'
'--n_layers_D', type=int, default=3, help='only used if which_model_netD==n_layers'
'--ndf', type=int, default=64, help='# of discrim filters in first conv layer'
'--lambda_feat', type=float, default=10.0, help='weight for feature matching loss'
'--no_ganFeat_loss', action='store_true', help='if specified, do *not* use discriminator feature matching loss'
'--no_vgg_loss', action='store_true', help='if specified, do *not* use VGG feature matching loss'
'--no_lsgan', action='store_true', help='do *not* use least square GAN, if false, use vanilla GAN'
'--pool_size', type=int, default=0, help='the size of image buffer that stores previously generated images'
isTrain = True
