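pix2pixHD Summary

The DeepLab part follows "DeepLab source code analysis: deeplab_demo.ipynb", with some modifications to fit this use case.

Paper: pix2pixHD. Code: GitHub

1. Testing on the sample data with the downloaded generator (G) network; see the scripts in ./scripts

The datasets folder contains a few sample Cityscapes test images.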

#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none

The test results are saved to an HTML file: ./results/label2city_1024p/test_latest/index.html
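The result path above appears to be assembled from the experiment name plus the `--results_dir`, `--phase` and `--which_epoch` options listed in the appendix. A minimal sketch of that layout follows; the helper name `results_index` is hypothetical and the layout is inferred from the example path, not taken from the repository code.

```python
import os


def results_index(name, results_dir='./results/', phase='test', which_epoch='latest'):
    """Build the path of the HTML page written by a test run (assumed layout)."""
    run_dir = '%s_%s' % (phase, which_epoch)
    return os.path.join(results_dir, name, run_dir, 'index.html')


print(results_index('label2city_1024p'))
```

Changing `--which_epoch` or `--results_dir` on the command line would, under this assumption, move the page accordingly.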

2. Training on the sample data; see the scripts in ./scripts

Train a model at 1024x512 resolution:

#!./scripts/train_512p.sh
python train.py --name label2city_512p

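To view the training results, open the intermediate results at ./checkpoints/label2city_512p/web/index.html.

If TensorFlow is installed, adding --tf_log to the training script writes TensorBoard logs to ./checkpoints/label2city_512p/logs.

Change to the experiment directory (./checkpoints/label2city_512p) and run tensorboard --logdir=logs to open TensorBoard.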
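Instead of changing directory first, the log directory can be passed to TensorBoard as a full path. The sketch below only builds the command; `tensorboard_command` is a hypothetical helper, and the `logs` subfolder is the one named in the text above.

```python
import os


def tensorboard_command(name, checkpoints_dir='./checkpoints'):
    """Return an argument list that points TensorBoard at the experiment's logs."""
    logdir = os.path.join(checkpoints_dir, name, 'logs')
    return ['tensorboard', '--logdir=' + logdir]


cmd = tensorboard_command('label2city_512p')
print(' '.join(cmd))
# To launch it: subprocess.run(cmd)  (requires tensorboard on PATH)
```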

Training with multiple GPUs

#!./scripts/train_512p_multigpu.sh
python train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7

Here --batchSize equals the number of GPUs listed in --gpu_ids.
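To keep the two options in step, the command line can be generated from the GPU list. This is a small sketch under the one-sample-per-GPU convention used in the script above; `multigpu_train_args` is a hypothetical helper, not part of pix2pixHD.

```python
def multigpu_train_args(name, gpu_ids):
    """Build train.py arguments with one sample per GPU."""
    ids = [int(g) for g in gpu_ids]
    if not ids:
        raise ValueError('need at least one GPU id')
    return ['python', 'train.py', '--name', name,
            '--batchSize', str(len(ids)),
            '--gpu_ids', ','.join(str(g) for g in ids)]


args = multigpu_train_args('label2city_512p', range(8))
print(' '.join(args))
```

For eight GPUs this reproduces the command in train_512p_multigpu.sh.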

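3. Building a custom dataset (including semantic segmentation of images with DeepLab)

Code: GitHub

Models: Checkpoints and frozen inference graphs.

Download the source code and models from the links above, then run: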

# deeplab_demo_test.py
import os
from io import BytesIO
import tarfile
import tempfile
from six.moves import urllib

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import datetime

import tensorflow as tf

from deeplab_demo import *

LABEL_NAMES = np.asarray([
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
])

FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)

# Path to the frozen inference graph; uncomment another line to switch models.
pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_dm05_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_dm05_pascal_trainaug/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_mnv2_ade20k_train/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_xception_ade20k_train/frozen_inference_graph.pb'

MODEL = DeepLabModel(pb_path)
print('model loaded successfully!')

# global starttime
for num in range(0, 2):
    starttime = datetime.datetime.now()
    # IMAGE_PATH = 'E:/data/img/20190522/img/image%d.jpg' % num
    # OUT_PATH = 'E:/data/img/20190522/seg_img/seg_image%d.png' % num
    IMAGE_PATH = 'E:/data/img/test/img/image%d.jpg' % num
    OUT_PATH = 'E:/data/img/test/seg_map/seg_image%d.png' % num
    # print(IMAGE_PATH)
    path = IMAGE_PATH
    try:
        original_im = Image.open(path)
        print('running deeplab on image %s...' % path)
        # starttime = datetime.datetime.now()
        resized_im, seg_map = MODEL.run(original_im)
    except IOError:
        print('Cannot retrieve image. Please check path: ' + path)
        continue  # skip this image; seg_map would be undefined or stale
    # endtime = datetime.datetime.now()
    # print (endtime - starttime)
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    # im = Image.fromarray(seg_image)
    # Save the raw class-index map (not the colored visualization).
    im = Image.fromarray(seg_map.astype(np.uint8))
    im.save(OUT_PATH)
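Because the script saves the raw class indices rather than the colored visualization, each pixel value in the output PNG is a class ID. Before feeding such maps to pix2pixHD it may help to confirm that the IDs fit the `--label_nc` value you plan to use. The check below is a sketch; `check_label_map` is a hypothetical helper, and treating `--label_nc` as the number of classes is an assumption based on its help string ("# of input label channels").

```python
import numpy as np


def check_label_map(seg_map, label_nc):
    """Return the class IDs present in seg_map, or raise if any falls outside [0, label_nc)."""
    seg_map = np.asarray(seg_map)
    if seg_map.ndim != 2:
        raise ValueError('expected a 2-D label map, got shape %s' % (seg_map.shape,))
    ids = np.unique(seg_map)
    if ids.min() < 0 or ids.max() >= label_nc:
        raise ValueError('label IDs %s do not fit in label_nc=%d' % (ids.tolist(), label_nc))
    return ids.tolist()


# Hypothetical 256x512 label map: background (0) with a "person" (15) region.
seg_map = np.zeros((256, 512), dtype=np.uint8)
seg_map[100:200, 50:120] = 15
print(check_label_map(seg_map, label_nc=21))  # 21 PASCAL VOC classes
```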

and

# -*- coding: utf-8 -*-
"""
DeepLab Demo.ipynb
https://blog.csdn.net/lifengcai_/article/details/80270409
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/github/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb
# DeepLab Demo
This demo will demonstrate the steps to run deeplab semantic segmentation model on sample input images.
"""

#@title Imports
import os
from io import BytesIO
import tarfile
import tempfile
from six.moves import urllib

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import datetime

import tensorflow as tf

#@title Helper methods
global starttime


class DeepLabModel(object):
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = 'ImageTensor:0'
    OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
    INPUT_SIZE = 512
    FROZEN_GRAPH_NAME = 'frozen_inference_graph'

    def __init__(self, pb_path):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None
        # change1: read frozen_inference_graph.pb directly instead of a tar archive
        graph_def = tf.GraphDef.FromString(open(pb_path, 'rb').read())
        if graph_def is None:
            raise RuntimeError('Cannot find inference graph in tar archive.')
        with self.graph.as_default():
            tf.import_graph_def(graph_def, name='')
        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
          image: A PIL.Image object, raw input image.

        Returns:
          resized_image: RGB image resized from original input image.
          seg_map: Segmentation map of `resized_image`.
        """
        # Do not resize by ratio (2019-05-21 09:43:40); a fixed target size is used instead.
        width, height = image.size
        # print(width, height)
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        # print(resize_ratio)
        # target_size = (int(resize_ratio * width), int(resize_ratio * height))
        target_size = (512, 256)
        # print(target_size)
        resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
        # resized_image = image.convert('RGB')
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
        seg_map = batch_seg_map[0]
        return resized_image, seg_map


def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
      A Colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)

    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3

    return colormap


def label_to_color_image(label):
    """Adds color defined by the dataset colormap to the label.

    Args:
      label: A 2D array with integer type, storing the segmentation label.

    Returns:
      result: A 2D array with floating type. The element of the array
        is the color indexed by the corresponding element in the input label
        to the PASCAL color map.

    Raises:
      ValueError: If label is not of rank 2 or its value is larger than
        colormap maximum entry.
    """
    if label.ndim != 2:
        raise ValueError('Expect 2-D input label')

    colormap = create_pascal_label_colormap()

    if np.max(label) >= len(colormap):
        raise ValueError('label value too large.')

    return colormap[label]


def vis_segmentation(image, seg_map):
    """Visualizes input image, segmentation map and overlay view."""
    plt.figure(figsize=(15, 5))
    grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])

    plt.subplot(grid_spec[0])
    plt.imshow(image)
    plt.axis('off')
    plt.title('input image')

    plt.subplot(grid_spec[1])
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    plt.imshow(seg_image)
    plt.axis('off')
    plt.title('segmentation map')

    plt.subplot(grid_spec[2])
    plt.imshow(image)
    plt.imshow(seg_image, alpha=0.7)
    plt.axis('off')
    plt.title('segmentation overlay')

    unique_labels = np.unique(seg_map)
    ax = plt.subplot(grid_spec[3])
    plt.imshow(FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation='nearest')
    ax.yaxis.tick_right()
    plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
    plt.xticks([], [])
    ax.tick_params(width=0.0)
    plt.grid('off')  # Turn off axis
    plt.show()
    # image.save('C:/image1.png')
    im = Image.fromarray(seg_image)
    im.save('E:/data/img/seg_img/seg_image1.png')


# LABEL_NAMES = np.asarray([
#     'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
#     'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
#     'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
# ])

# FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
# FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)

# # pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# pb_path='D:/workspace/ygd/Documents/GitHub/models/research/model_zoo_download/deeplabv3_pascal_trainval/frozen_inference_graph.pb'
# MODEL = DeepLabModel(pb_path)
# print('model loaded successfully!')

"""
Run on sample images
Select one of sample images (leave `IMAGE_URL` empty) or feed any internet image url for inference.
Note that we are using single scale inference in the demo for fast computation, so the results may slightly differ from the visualizations in
[README](https://github.com/tensorflow/models/blob/master/research/deeplab/README.md),
which uses multi-scale and left-right flipped inputs.
"""


def run_visualization(path):
    """Inferences DeepLab model and visualizes result."""
    global starttime
    try:
        original_im = Image.open(path)
        print('running deeplab on image %s...' % path)
        # starttime = datetime.datetime.now()
        resized_im, seg_map = MODEL.run(original_im)
    except IOError:
        print('Cannot retrieve image. Please check path: ' + path)
        return
    vis_segmentation(resized_im, seg_map)


# IMAGE_PATH = 'E:/data/img/img/image126.jpg'
# run_visualization(IMAGE_PATH)
# endtime = datetime.datetime.now()
# print (endtime - starttime)
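To see what the PASCAL colormap helper produces, here is a standalone worked example of the same bit-interleaving scheme. It needs only NumPy; `pascal_colormap` is a local copy written for illustration, so it runs without TensorFlow or the model files.

```python
import numpy as np


def pascal_colormap(n=256):
    """Bit-interleaving colormap, same scheme as create_pascal_label_colormap."""
    colormap = np.zeros((n, 3), dtype=int)
    ind = np.arange(n, dtype=int)
    for shift in reversed(range(8)):
        for channel in range(3):
            # Bit `channel` of the label index becomes bit `shift` of that color channel.
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
    return colormap


cmap = pascal_colormap()
labels = np.array([[0, 1], [15, 20]])  # background, aeroplane / person, tv
colors = cmap[labels]                  # fancy indexing: one RGB triple per pixel
print(colors.shape)                    # (2, 2, 3)
print(cmap[1].tolist(), cmap[15].tolist(), cmap[20].tolist())
# [128, 0, 0] [192, 128, 128] [0, 64, 128]
```

Indexing the colormap with a 2-D label array is what turns a class-index map into the colored image shown by vis_segmentation.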

PS: you can also create an instance map to distinguish individual objects within the same class.
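The post does not say how the instance map is built. One crude stand-in, shown here purely as an assumption, is to give every connected region of a class its own ID. Touching objects of the same class will merge, so a real instance segmentation model or manual annotation would be more accurate. The function name and the choice of 0 for background are hypothetical.

```python
import numpy as np


def instance_map_from_labels(seg_map, background=0):
    """Give every 4-connected region of a non-background class its own ID."""
    seg_map = np.asarray(seg_map)
    height, width = seg_map.shape
    inst = np.zeros((height, width), dtype=np.int32)
    next_id = 1
    for y in range(height):
        for x in range(width):
            if seg_map[y, x] == background or inst[y, x] != 0:
                continue
            # Flood-fill the region that contains (y, x).
            cls = seg_map[y, x]
            inst[y, x] = next_id
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and inst[ny, nx] == 0 and seg_map[ny, nx] == cls):
                        inst[ny, nx] = next_id
                        stack.append((ny, nx))
            next_id += 1
    return inst


# Two separate "person" (15) blobs receive different instance IDs.
seg_map = np.zeros((4, 6), dtype=np.uint8)
seg_map[0:2, 0:2] = 15
seg_map[2:4, 4:6] = 15
inst = instance_map_from_labels(seg_map)
print(np.unique(inst).tolist())
```

The pure-Python loop is slow on full-size images; it is meant to show the idea, not to be used as-is on a large dataset.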

4. Encoding features: encode_features

Precompute the feature maps and cluster them, producing a .npy file that is read in later steps.
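The idea of "cluster, then store as .npy" can be illustrated without the repository. The sketch below is not encode_features.py: the tiny k-means, the random feature vectors and the label IDs 7 and 26 are all made up for illustration. Only the file name `features_clustered_010.npy`, the 10 clusters and the feature length 3 come from the option defaults listed in the appendix.

```python
import os
import tempfile

import numpy as np


def kmeans(points, k, iters=10, seed=0):
    """Tiny k-means: returns a (k, dim) array of cluster centers."""
    rng = np.random.RandomState(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = points[assign == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return centers


# Hypothetical per-label feature vectors: {label id: (n_samples, feat_num) array}.
rng = np.random.RandomState(0)
features = {7: rng.rand(50, 3), 26: rng.rand(40, 3)}

clustered = {label: kmeans(vecs, k=10) for label, vecs in features.items()}

path = os.path.join(tempfile.mkdtemp(), 'features_clustered_010.npy')
np.save(path, clustered)  # a dict is stored as a pickled object array

loaded = np.load(path, allow_pickle=True).item()
print({label: centers.shape for label, centers in loaded.items()})
```

Reading a dict back from a .npy file requires `allow_pickle=True` in recent NumPy versions, followed by `.item()` to unwrap the zero-dimensional object array.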

python encode_features.py --name butel_data20190516_feat_20190523 --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190516_feat_20190523

5. Precomputing feature maps: precompute_feature_maps

Precompute the feature maps and save them.

python precompute_feature_maps.py --name butel_20190522_feat --dataroot ./datasets/train_20190520

6. Training and testing on the custom dataset

Training

python train.py --name butel_data20190522_feat_20190524 --instance_feat --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190522_feat_20190524 --gpu_ids 0,1 --batchSize 2 --tf_log --load_pretrain /home/yangd/work/python/pix2pixHD_yangd/checkpoints/butel_data20190516_feat_20190523 --niter 300 --niter_decay 300
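The command above sets `--niter 300 --niter_decay 300`. Going by the help strings in the appendix, the learning rate stays at its initial value for `niter` steps and then decays linearly to zero over `niter_decay` steps. A minimal sketch of that schedule follows, assuming the counts are epochs and using the default `--lr 0.0002`; the function is illustrative and not the repository's update code.

```python
def learning_rate(epoch, lr=0.0002, niter=300, niter_decay=300):
    """Constant for the first `niter` epochs, then linear decay to zero."""
    if epoch <= niter:
        return lr
    remaining = 1.0 - float(epoch - niter) / niter_decay
    return lr * max(remaining, 0.0)


for epoch in (1, 300, 450, 600):
    print(epoch, learning_rate(epoch))
```

With these settings the rate is halved at epoch 450 and reaches zero at epoch 600.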

Testing

python test.py --name butel_data20190522_feat_20190524 --instance_feat --dataroot /home/yangd/work/python/pix2pixHD_yangd/datasets/butel_data20190522_feat_20190524 --use_encoded_image

Commonly used options

--name
--gpu_ids
--checkpoints_dir
--batchSize
--label_nc
--dataroot
--tf_log
--no_instance
--instance_feat
--results_dir
--how_many
--use_encoded_image

Appendix

Option summary:

  • base_options

# experiment specifics
'--name', type=str, default='label2city', help='name of the experiment. It decides where to store samples and models'
'--gpu_ids', type=str, default='0', help='gpu ids: e.g. 0  0,1,2, 0,2. use -1 for CPU'
'--checkpoints_dir', type=str, default='./checkpoints', help='models are saved here'
'--model', type=str, default='pix2pixHD', help='which model to use'
'--norm', type=str, default='instance', help='instance normalization or batch normalization'
'--use_dropout', action='store_true', help='use dropout for the generator'
'--data_type', default=32, type=int, choices=[8, 16, 32], help="Supported data type i.e. 8, 16, 32 bit"
'--verbose', action='store_true', default=False, help='toggles verbose'
'--fp16', action='store_true', default=False, help='train with AMP'
'--local_rank', type=int, default=0, help='local rank for distributed training'

# input/output sizes
'--batchSize', type=int, default=1, help='input batch size'
'--loadSize', type=int, default=1024, help='scale images to this size'
'--fineSize', type=int, default=512, help='then crop to this size'
'--label_nc', type=int, default=35, help='# of input label channels'
'--input_nc', type=int, default=3, help='# of input image channels'
'--output_nc', type=int, default=3, help='# of output image channels'

# for setting inputs
'--dataroot', type=str, default='./datasets/cityscapes/'
'--resize_or_crop', type=str, default='scale_width', help='scaling and cropping of images at load time [resize_and_crop|crop|scale_width|scale_width_and_crop]'
'--serial_batches', action='store_true', help='if true, takes images in order to make batches, otherwise takes them randomly'
'--no_flip', action='store_true', help='if specified, do not flip the images for data augmentation'
'--nThreads', default=2, type=int, help='# threads for loading data'
'--max_dataset_size', type=int, default=float("inf"), help='Maximum number of samples allowed per dataset. If the dataset directory contains more than max_dataset_size, only a subset is loaded.'

# for displays
'--display_winsize', type=int, default=512, help='display window size'
'--tf_log', action='store_true', help='if specified, use tensorboard logging. Requires tensorflow installed'

# for generator
'--netG', type=str, default='global', help='selects model to use for netG'
'--ngf', type=int, default=64, help='# of gen filters in first conv layer'
'--n_downsample_global', type=int, default=4, help='number of downsampling layers in netG'
'--n_blocks_global', type=int, default=9, help='number of residual blocks in the global generator network'
'--n_blocks_local', type=int, default=3, help='number of residual blocks in the local enhancer network'
'--n_local_enhancers', type=int, default=1, help='number of local enhancers to use'
'--niter_fix_global', type=int, default=0, help='number of epochs that we only train the outmost local enhancer'

# for instance-wise features
'--no_instance', action='store_true', help='if specified, do *not* add instance map as input'
'--instance_feat', action='store_true', help='if specified, add encoded instance features as input'
'--label_feat', action='store_true', help='if specified, add encoded label features as input'
'--feat_num', type=int, default=3, help='vector length for encoded features'
'--load_features', action='store_true', help='if specified, load precomputed feature maps'
'--n_downsample_E', type=int, default=4, help='# of downsampling layers in encoder'
'--nef', type=int, default=16, help='# of encoder filters in the first conv layer'
'--n_clusters', type=int, default=10, help='number of clusters for features'
  • test_options

'--ntest', type=int, default=float("inf"), help='# of test examples.'
'--results_dir', type=str, default='./results/', help='saves results here.'
'--aspect_ratio', type=float, default=1.0, help='aspect ratio of result images'
'--phase', type=str, default='test', help='train, val, test, etc'
'--which_epoch', type=str, default='latest', help='which epoch to load? set to latest to use latest cached model'
'--how_many', type=int, default=50, help='how many test images to run'
'--cluster_path', type=str, default='features_clustered_010.npy', help='the path for clustered results of encoded features'
'--use_encoded_image', action='store_true', help='if specified, encode the real image to get the feature map'
"--export_onnx", type=str, help="export ONNX model to a given file"
"--engine", type=str, help="run serialized TRT engine"
"--onnx", type=str, help="run ONNX model via TRT"
isTrain = False
  • train_options

# for displays
'--display_freq', type=int, default=100, help='frequency of showing training results on screen'
'--print_freq', type=int, default=100, help='frequency of showing training results on console'
'--save_latest_freq', type=int, default=1000, help='frequency of saving the latest results'
'--save_epoch_freq', type=int, default=10, help='frequency of saving checkpoints at the end of epochs'
'--no_html', action='store_true', help='do not save intermediate training results to [opt.checkpoints_dir]/[opt.name]/web/'
'--debug', action='store_true', help='only do one epoch and displays at each iteration'

# for training
'--continue_train', action='store_true', help='continue training: load the latest model'
'--load_pretrain', type=str, default='', help='load the pretrained model from the specified location'
'--which_epoch', type=str, default='latest', help='which epoch to load? set to latest to use latest cached model'
'--phase', type=str, default='train', help='train, val, test, etc'
'--niter', type=int, default=100, help='# of iter at starting learning rate'
'--niter_decay', type=int, default=100, help='# of iter to linearly decay learning rate to zero'
'--beta1', type=float, default=0.5, help='momentum term of adam'
'--lr', type=float, default=0.0002, help='initial learning rate for adam'

# for discriminators
'--num_D', type=int, default=2, help='number of discriminators to use'
'--n_layers_D', type=int, default=3, help='only used if which_model_netD==n_layers'
'--ndf', type=int, default=64, help='# of discrim filters in first conv layer'
'--lambda_feat', type=float, default=10.0, help='weight for feature matching loss'
'--no_ganFeat_loss', action='store_true', help='if specified, do *not* use discriminator feature matching loss'
'--no_vgg_loss', action='store_true', help='if specified, do *not* use VGG feature matching loss'
'--no_lsgan', action='store_true', help='do *not* use least square GAN, if false, use vanilla GAN'
'--pool_size', type=int, default=0, help='the size of image buffer that stores previously generated images'
isTrain = True
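The entries above are the arguments of `parser.add_argument(...)` calls. As a rough illustration of how they behave, the sketch below declares a handful of them with argparse and parses the training command from section 6. The helper names are hypothetical, and the GPU-id parsing (dropping negative ids such as -1 for CPU) is an assumption based on the `--gpu_ids` help string.

```python
import argparse


def build_parser():
    """A handful of the options listed above, declared with argparse."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--name', type=str, default='label2city')
    parser.add_argument('--gpu_ids', type=str, default='0')
    parser.add_argument('--batchSize', type=int, default=1)
    parser.add_argument('--label_nc', type=int, default=35)
    parser.add_argument('--no_instance', action='store_true')
    parser.add_argument('--instance_feat', action='store_true')
    return parser


def parse_gpu_ids(text):
    """Turn '0,1,2' into [0, 1, 2]; negative ids (e.g. '-1' for CPU) are dropped."""
    return [int(s) for s in text.split(',') if int(s) >= 0]


opt = build_parser().parse_args(
    ['--name', 'butel_data20190522_feat_20190524', '--instance_feat',
     '--gpu_ids', '0,1', '--batchSize', '2'])
print(opt.name, parse_gpu_ids(opt.gpu_ids), opt.batchSize, opt.instance_feat, opt.label_nc)
```

Flags declared with `action='store_true'` default to False and become True only when present, which is why `--instance_feat` takes no value on the command line.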
