Implementing Neural Style Image Style Transfer with TensorFlow

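I have only just started working with TensorFlow and tried it out on a small project, drawing on articles by other bloggers along the way. Feedback and suggestions are welcome.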

Code walkthrough:

The TensorFlow source consists of three main files: neural_style.py, stylize.py, and vgg.py.

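neural_style.py: the external interface. It defines the main parameters and the default values for some of them, reads and saves images, resizes the input images, assigns the weights, and passes the parameters and the resized images to stylize.py.

stylize.py: the core code, covering the training and optimization process.

vgg.py: defines the network model and the related operations.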

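The imagenet-vgg-verydeep-19 file must also be placed in the same folder as the three .py files above. It can be downloaded here: http://www.vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat
The vgg.py code below reads the VGG-19 network, which is then used to build the Neural Style model.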

import tensorflow as tf
import numpy as np
import scipy.io

# the network layers we need
VGG19_LAYERS = (
    'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1',

    'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',

    'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3',
    'relu3_3', 'conv3_4', 'relu3_4', 'pool3',

    'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'conv4_3',
    'relu4_3', 'conv4_4', 'relu4_4', 'pool4',

    'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'conv5_3',
    'relu5_3', 'conv5_4', 'relu5_4'
)

# the information we need is the kernels and bias of each layer
def load_net(data_path):
    data = scipy.io.loadmat(data_path)
    if not all(i in data for i in ('layers', 'classes', 'normalization')):
        raise ValueError("You're using the wrong VGG19 data. Please follow the instructions in the README to download the correct data.")
    mean = data['normalization'][0][0][0]
    mean_pixel = np.mean(mean, axis=(0, 1))
    weights = data['layers'][0]
    return weights, mean_pixel

def net_preloaded(weights, input_image, pooling):
    net = {}
    current = input_image
    for i, name in enumerate(VGG19_LAYERS):
        kind = name[:4]
        if kind == 'conv':
            kernels, bias = weights[i][0][0][0][0]
            # matconvnet: weights are [width, height, in_channels, out_channels]
            # tensorflow: weights are [height, width, in_channels, out_channels]
            kernels = np.transpose(kernels, (1, 0, 2, 3))
            bias = bias.reshape(-1)
            current = _conv_layer(current, kernels, bias)
        elif kind == 'relu':
            current = tf.nn.relu(current)
        elif kind == 'pool':
            current = _pool_layer(current, pooling)
        net[name] = current

    assert len(net) == len(VGG19_LAYERS)
    return net

def _conv_layer(input, weights, bias):
    conv = tf.nn.conv2d(input, tf.constant(weights), strides=(1, 1, 1, 1),
            padding='SAME')
    return tf.nn.bias_add(conv, bias)

def _pool_layer(input, pooling):
    if pooling == 'avg':
        return tf.nn.avg_pool(input, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1),
                padding='SAME')
    else:
        return tf.nn.max_pool(input, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1),
                padding='SAME')

def preprocess(image, mean_pixel):
    return image - mean_pixel

def unprocess(image, mean_pixel):
    return image + mean_pixel

neural_style.py defines a large number of parameters along with the external interface.

import os

import numpy as np
import scipy.misc

from stylize import stylize

import math
from argparse import ArgumentParser

from PIL import Image

# default arguments
CONTENT_WEIGHT = 5e0
CONTENT_WEIGHT_BLEND = 1
STYLE_WEIGHT = 5e2
TV_WEIGHT = 1e2
STYLE_LAYER_WEIGHT_EXP = 1
LEARNING_RATE = 1e1
BETA1 = 0.9
BETA2 = 0.999
EPSILON = 1e-08
STYLE_SCALE = 1.0
ITERATIONS = 10
VGG_PATH = 'imagenet-vgg-verydeep-19.mat'
POOLING = 'max'

def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--content',
            dest='content', help='content image',
            metavar='CONTENT', required=True)
    parser.add_argument('--styles',
            dest='styles',
            nargs='+', help='one or more style images',
            metavar='STYLE', required=True)
    parser.add_argument('--output',
            dest='output', help='output path',
            metavar='OUTPUT', required=True)
    parser.add_argument('--iterations', type=int,
            dest='iterations', help='iterations (default %(default)s)',
            metavar='ITERATIONS', default=ITERATIONS)
    parser.add_argument('--print-iterations', type=int,
            dest='print_iterations', help='statistics printing frequency',
            metavar='PRINT_ITERATIONS')
    parser.add_argument('--checkpoint-output',
            dest='checkpoint_output', help='checkpoint output format, e.g. output%%s.jpg',
            metavar='OUTPUT')
    parser.add_argument('--checkpoint-iterations', type=int,
            dest='checkpoint_iterations', help='checkpoint frequency',
            metavar='CHECKPOINT_ITERATIONS')
    parser.add_argument('--width', type=int,
            dest='width', help='output width',
            metavar='WIDTH')
    parser.add_argument('--style-scales', type=float,
            dest='style_scales',
            nargs='+', help='one or more style scales',
            metavar='STYLE_SCALE')
    parser.add_argument('--network',
            dest='network', help='path to network parameters (default %(default)s)',
            metavar='VGG_PATH', default=VGG_PATH)
    parser.add_argument('--content-weight-blend', type=float,
            dest='content_weight_blend', help='content weight blend, conv4_2 * blend + conv5_2 * (1-blend) (default %(default)s)',
            metavar='CONTENT_WEIGHT_BLEND', default=CONTENT_WEIGHT_BLEND)
    parser.add_argument('--content-weight', type=float,
            dest='content_weight', help='content weight (default %(default)s)',
            metavar='CONTENT_WEIGHT', default=CONTENT_WEIGHT)
    parser.add_argument('--style-weight', type=float,
            dest='style_weight', help='style weight (default %(default)s)',
            metavar='STYLE_WEIGHT', default=STYLE_WEIGHT)
    parser.add_argument('--style-layer-weight-exp', type=float,
            dest='style_layer_weight_exp', help='style layer weight exponential increase - weight(layer<n+1>) = weight_exp*weight(layer<n>) (default %(default)s)',
            metavar='STYLE_LAYER_WEIGHT_EXP', default=STYLE_LAYER_WEIGHT_EXP)
    parser.add_argument('--style-blend-weights', type=float,
            dest='style_blend_weights', help='style blending weights',
            nargs='+', metavar='STYLE_BLEND_WEIGHT')
    parser.add_argument('--tv-weight', type=float,
            dest='tv_weight', help='total variation regularization weight (default %(default)s)',
            metavar='TV_WEIGHT', default=TV_WEIGHT)
    parser.add_argument('--learning-rate', type=float,
            dest='learning_rate', help='learning rate (default %(default)s)',
            metavar='LEARNING_RATE', default=LEARNING_RATE)
    parser.add_argument('--beta1', type=float,
            dest='beta1', help='Adam: beta1 parameter (default %(default)s)',
            metavar='BETA1', default=BETA1)
    parser.add_argument('--beta2', type=float,
            dest='beta2', help='Adam: beta2 parameter (default %(default)s)',
            metavar='BETA2', default=BETA2)
    parser.add_argument('--eps', type=float,
            dest='epsilon', help='Adam: epsilon parameter (default %(default)s)',
            metavar='EPSILON', default=EPSILON)
    parser.add_argument('--initial',
            dest='initial', help='initial image',
            metavar='INITIAL')
    parser.add_argument('--initial-noiseblend', type=float,
            dest='initial_noiseblend', help='ratio of blending initial image with normalized noise (if no initial image specified, content image is used) (default %(default)s)',
            metavar='INITIAL_NOISEBLEND')
    parser.add_argument('--preserve-colors', action='store_true',
            dest='preserve_colors', help='style-only transfer (preserving colors) - if color transfer is not needed')
    parser.add_argument('--pooling',
            dest='pooling', help='pooling layer configuration: max or avg (default %(default)s)',
            metavar='POOLING', default=POOLING)
    return parser

def main():
    parser = build_parser()
    options = parser.parse_args()

    if not os.path.isfile(options.network):
        parser.error("Network %s does not exist. (Did you forget to download it?)" % options.network)

    content_image = imread(options.content)
    style_images = [imread(style) for style in options.styles]

    width = options.width
    if width is not None:
        new_shape = (int(math.floor(float(content_image.shape[0]) /
                content_image.shape[1] * width)), width)
        content_image = scipy.misc.imresize(content_image, new_shape)
    target_shape = content_image.shape
    for i in range(len(style_images)):
        style_scale = STYLE_SCALE
        if options.style_scales is not None:
            style_scale = options.style_scales[i]
        style_images[i] = scipy.misc.imresize(style_images[i], style_scale *
                target_shape[1] / style_images[i].shape[1])

    style_blend_weights = options.style_blend_weights
    if style_blend_weights is None:
        # default is equal weights
        style_blend_weights = [1.0/len(style_images) for _ in style_images]
    else:
        total_blend_weight = sum(style_blend_weights)
        style_blend_weights = [weight/total_blend_weight
                               for weight in style_blend_weights]

    initial = options.initial
    if initial is not None:
        initial = scipy.misc.imresize(imread(initial), content_image.shape[:2])
        # Initial guess is specified, but not noiseblend - no noise should be blended
        if options.initial_noiseblend is None:
            options.initial_noiseblend = 0.0
    else:
        # Neither initial, nor noiseblend is provided, falling back to random generated initial guess
        if options.initial_noiseblend is None:
            options.initial_noiseblend = 1.0
        if options.initial_noiseblend < 1.0:
            initial = content_image

    if options.checkpoint_output and "%s" not in options.checkpoint_output:
        parser.error("To save intermediate images, the checkpoint output "
                     "parameter must contain `%s` (e.g. `foo%s.jpg`)")

    for iteration, image in stylize(
        network=options.network,
        initial=initial,
        initial_noiseblend=options.initial_noiseblend,
        content=content_image,
        styles=style_images,
        preserve_colors=options.preserve_colors,
        iterations=options.iterations,
        content_weight=options.content_weight,
        content_weight_blend=options.content_weight_blend,
        style_weight=options.style_weight,
        style_layer_weight_exp=options.style_layer_weight_exp,
        style_blend_weights=style_blend_weights,
        tv_weight=options.tv_weight,
        learning_rate=options.learning_rate,
        beta1=options.beta1,
        beta2=options.beta2,
        epsilon=options.epsilon,
        pooling=options.pooling,
        print_iterations=options.print_iterations,
        checkpoint_iterations=options.checkpoint_iterations
    ):
        output_file = None
        combined_rgb = image
        if iteration is not None:
            if options.checkpoint_output:
                output_file = options.checkpoint_output % iteration
        else:
            output_file = options.output
        if output_file:
            imsave(output_file, combined_rgb)

def imread(path):
    img = scipy.misc.imread(path).astype(np.float)
    if len(img.shape) == 2:
        # grayscale
        img = np.dstack((img,img,img))
    elif img.shape[2] == 4:
        # PNG with alpha channel
        img = img[:,:,:3]
    return img

def imsave(path, img):
    img = np.clip(img, 0, 255).astype(np.uint8)
    Image.fromarray(img).save(path, quality=95)

if __name__ == '__main__':
    main()
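Two small pieces of arithmetic in main() are easy to miss in the full listing: the aspect-preserving resize for --width and the normalization of --style-blend-weights. The sketch below isolates both in plain Python. The helper names resized_shape and normalize_blend_weights are hypothetical and do not appear in neural_style.py; the formulas are copied from main().

```python
import math

def resized_shape(height, width, target_width):
    """Return (new_height, new_width), keeping the aspect ratio.

    Mirrors the new_shape expression in main().
    """
    return (int(math.floor(float(height) / width * target_width)), target_width)

def normalize_blend_weights(num_styles, weights=None):
    """Equal weights by default; otherwise rescale so the weights sum to 1."""
    if weights is None:
        return [1.0 / num_styles for _ in range(num_styles)]
    total = sum(weights)
    return [w / total for w in weights]

# A 600x800 (height x width) content image resized to width 400.
new_shape = resized_shape(600, 800, 400)

# Two style images: default weights, then user-supplied weights 3 and 1.
equal = normalize_blend_weights(2)
custom = normalize_blend_weights(2, [3.0, 1.0])

print(new_shape)  # (300, 400)
print(equal)      # [0.5, 0.5]
print(custom)     # [0.75, 0.25]
```

Because the weights are rescaled to sum to 1, passing --style-blend-weights 3 1 and 0.75 0.25 gives the same result.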

The core code, stylize.py, is annotated below:

# Copyright (c) 2015-2017 Anish Athalye. Released under GPLv3.

import vgg

import tensorflow as tf
import numpy as np

from sys import stderr

from PIL import Image

CONTENT_LAYERS = ('relu4_2', 'relu5_2')
STYLE_LAYERS = ('relu1_1', 'relu2_1', 'relu3_1', 'relu4_1', 'relu5_1')

try:
    reduce
except NameError:
    from functools import reduce


def stylize(network, initial, initial_noiseblend, content, styles, preserve_colors, iterations,
        content_weight, content_weight_blend, style_weight, style_layer_weight_exp, style_blend_weights, tv_weight,
        learning_rate, beta1, beta2, epsilon, pooling,
        print_iterations=None, checkpoint_iterations=None):
    """
    Stylize images.

    This function yields tuples (iteration, image); `iteration` is None
    if this is the final image (the last iteration).  Other tuples are yielded
    every `checkpoint_iterations` iterations.

    :rtype: iterator[tuple[int|None,image]]
    """
    # content.shape is 3-D (height, width, channel); prepend a batch dimension
    # to get (1, height, width, channel), consistent with the rest of the code.
    shape = (1,) + content.shape
    style_shapes = [(1,) + style.shape for style in styles]
    content_features = {}
    style_features = [{} for _ in styles]

    vgg_weights, vgg_mean_pixel = vgg.load_net(network)

    layer_weight = 1.0
    style_layers_weights = {}
    for style_layer in STYLE_LAYERS:
        style_layers_weights[style_layer] = layer_weight
        layer_weight *= style_layer_weight_exp

    # normalize style layer weights
    layer_weights_sum = 0
    for style_layer in STYLE_LAYERS:
        layer_weights_sum += style_layers_weights[style_layer]
    for style_layer in STYLE_LAYERS:
        style_layers_weights[style_layer] /= layer_weights_sum

    # First create a placeholder for image, then pass content_pre to it through
    # the feed_dict of eval(); this runs the network and yields the content
    # feature maps.
    # compute content features in feedforward mode
    g = tf.Graph()
    with g.as_default(), g.device('/cpu:0'), tf.Session() as sess:
        image = tf.placeholder('float', shape=shape)
        net = vgg.net_preloaded(vgg_weights, image, pooling)
        content_pre = np.array([vgg.preprocess(content, vgg_mean_pixel)])
        for layer in CONTENT_LAYERS:
            content_features[layer] = net[layer].eval(feed_dict={image: content_pre})

    # Several style images can be supplied, so a for loop is used. As with the
    # content image, style_pre is fed to the image placeholder and the network
    # is run to obtain the style feature maps. Style is defined as the inner
    # product of the responses of different filters, hence the extra step
    # gram = np.matmul(features.T, features) / features.size, which gives the
    # style feature.
    # compute style features in feedforward mode
    for i in range(len(styles)):
        g = tf.Graph()
        with g.as_default(), g.device('/cpu:0'), tf.Session() as sess:
            image = tf.placeholder('float', shape=style_shapes[i])
            net = vgg.net_preloaded(vgg_weights, image, pooling)
            style_pre = np.array([vgg.preprocess(styles[i], vgg_mean_pixel)])
            for layer in STYLE_LAYERS:
                features = net[layer].eval(feed_dict={image: style_pre})
                features = np.reshape(features, (-1, features.shape[3]))
                gram = np.matmul(features.T, features) / features.size
                style_features[i][layer] = gram

    initial_content_noise_coeff = 1.0 - initial_noiseblend

    # make stylized image using backpropagation
    with tf.Graph().as_default():
        if initial is None:
            noise = np.random.normal(size=shape, scale=np.std(content) * 0.1)
            initial = tf.random_normal(shape) * 0.256
        else:
            initial = np.array([vgg.preprocess(initial, vgg_mean_pixel)])
            initial = initial.astype('float32')
            noise = np.random.normal(size=shape, scale=np.std(content) * 0.1)
            initial = (initial) * initial_content_noise_coeff + (tf.random_normal(shape) * 0.256) * (1.0 - initial_content_noise_coeff)
        # image = tf.Variable(initial) creates the TensorFlow variable that we
        # train. Note that what is being trained here is an image, not weights
        # and biases.
        image = tf.Variable(initial)
        net = vgg.net_preloaded(vgg_weights, image, pooling)

        # content loss
        content_layers_weights = {}
        content_layers_weights['relu4_2'] = content_weight_blend
        content_layers_weights['relu5_2'] = 1.0 - content_weight_blend

        content_loss = 0
        content_losses = []
        for content_layer in CONTENT_LAYERS:
            content_losses.append(content_layers_weights[content_layer] * content_weight * (2 * tf.nn.l2_loss(
                    net[content_layer] - content_features[content_layer]) /
                    content_features[content_layer].size))
        content_loss += reduce(tf.add, content_losses)

        # style loss
        style_loss = 0
        for i in range(len(styles)):
            style_losses = []
            for style_layer in STYLE_LAYERS:
                layer = net[style_layer]
                _, height, width, number = map(lambda i: i.value, layer.get_shape())
                size = height * width * number
                feats = tf.reshape(layer, (-1, number))
                gram = tf.matmul(tf.transpose(feats), feats) / size
                style_gram = style_features[i][style_layer]
                style_losses.append(style_layers_weights[style_layer] * 2 * tf.nn.l2_loss(gram - style_gram) / style_gram.size)
            style_loss += style_weight * style_blend_weights[i] * reduce(tf.add, style_losses)

        # total variation denoising
        tv_y_size = _tensor_size(image[:,1:,:,:])
        tv_x_size = _tensor_size(image[:,:,1:,:])
        tv_loss = tv_weight * 2 * (
                (tf.nn.l2_loss(image[:,1:,:,:] - image[:,:shape[1]-1,:,:]) /
                    tv_y_size) +
                (tf.nn.l2_loss(image[:,:,1:,:] - image[:,:,:shape[2]-1,:]) /
                    tv_x_size))

        # overall loss
        # The content loss and style loss defined above are easy to follow
        # alongside the formulas in the paper. The code also adds total
        # variation denoising, so the total loss is
        # content_loss + style_loss + tv_loss.
        loss = content_loss + style_loss + tv_loss

        # optimizer setup
        # Create train_step with the Adam optimizer; the objective is the loss
        # above. The optimization iterates train_step to minimize the loss and
        # ends up with `best`, the result of the training.
        train_step = tf.train.AdamOptimizer(learning_rate, beta1, beta2, epsilon).minimize(loss)

        def print_progress():
            stderr.write('  content loss: %g\n' % content_loss.eval())
            stderr.write('    style loss: %g\n' % style_loss.eval())
            stderr.write('       tv loss: %g\n' % tv_loss.eval())
            stderr.write('    total loss: %g\n' % loss.eval())

        # optimization
        best_loss = float('inf')
        best = None
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            stderr.write('Optimization started...\n')
            if (print_iterations and print_iterations != 0):
                print_progress()
            for i in range(iterations):
                stderr.write('Iteration %4d/%4d\n' % (i + 1, iterations))
                train_step.run()

                last_step = (i == iterations - 1)
                if last_step or (print_iterations and i % print_iterations == 0):
                    print_progress()

                if (checkpoint_iterations and i % checkpoint_iterations == 0) or last_step:
                    this_loss = loss.eval()
                    if this_loss < best_loss:
                        best_loss = this_loss
                        best = image.eval()

                    img_out = vgg.unprocess(best.reshape(shape[1:]), vgg_mean_pixel)

                    if preserve_colors and preserve_colors == True:
                        original_image = np.clip(content, 0, 255)
                        styled_image = np.clip(img_out, 0, 255)

                        # Luminosity transfer steps:
                        # 1. Convert stylized RGB->grayscale according to Rec.601 luma (0.299, 0.587, 0.114)
                        # 2. Convert stylized grayscale into YUV (YCbCr)
                        # 3. Convert original image into YUV (YCbCr)
                        # 4. Recombine (stylizedYUV.Y, originalYUV.U, originalYUV.V)
                        # 5. Convert recombined image from YUV back to RGB

                        # 1
                        styled_grayscale = rgb2gray(styled_image)
                        styled_grayscale_rgb = gray2rgb(styled_grayscale)

                        # 2
                        styled_grayscale_yuv = np.array(Image.fromarray(styled_grayscale_rgb.astype(np.uint8)).convert('YCbCr'))

                        # 3
                        original_yuv = np.array(Image.fromarray(original_image.astype(np.uint8)).convert('YCbCr'))

                        # 4
                        w, h, _ = original_image.shape
                        combined_yuv = np.empty((w, h, 3), dtype=np.uint8)
                        combined_yuv[..., 0] = styled_grayscale_yuv[..., 0]
                        combined_yuv[..., 1] = original_yuv[..., 1]
                        combined_yuv[..., 2] = original_yuv[..., 2]

                        # 5
                        img_out = np.array(Image.fromarray(combined_yuv, 'YCbCr').convert('RGB'))

                    yield (
                        (None if last_step else i),
                        img_out
                    )


def _tensor_size(tensor):
    from operator import mul
    return reduce(mul, (d.value for d in tensor.get_shape()), 1)

def rgb2gray(rgb):
    return np.dot(rgb[...,:3], [0.299, 0.587, 0.114])

def gray2rgb(gray):
    w, h = gray.shape
    rgb = np.empty((w, h, 3), dtype=np.float32)
    rgb[:, :, 2] = rgb[:, :, 1] = rgb[:, :, 0] = gray
    return rgb
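The style and total variation terms can be checked outside TensorFlow. The following is a rough NumPy sketch of the same arithmetic, assuming feature maps of shape (1, H, W, C). The helpers gram_matrix, l2_loss, style_layer_loss and tv_loss are names introduced here for illustration only; they are not part of stylize.py, and the random features stand in for real VGG activations.

```python
import numpy as np

def gram_matrix(features):
    """(1, H, W, C) activations -> (C, C) Gram matrix.

    Divided by the number of elements, as in stylize.py.
    """
    flat = features.reshape(-1, features.shape[3])
    return flat.T @ flat / features.size

def l2_loss(x):
    """Same convention as tf.nn.l2_loss: sum(x ** 2) / 2."""
    return np.sum(x ** 2) / 2.0

def style_layer_loss(generated, style):
    """Unweighted style loss of one layer: squared Gram difference, averaged."""
    g, s = gram_matrix(generated), gram_matrix(style)
    return 2 * l2_loss(g - s) / s.size

def tv_loss(image):
    """Unweighted total variation term for a (1, H, W, C) image."""
    dy = image[:, 1:, :, :] - image[:, :-1, :, :]  # vertical neighbours
    dx = image[:, :, 1:, :] - image[:, :, :-1, :]  # horizontal neighbours
    return 2 * (l2_loss(dy) / dy.size + l2_loss(dx) / dx.size)

# Random stand-ins for VGG activations (an assumption for illustration).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(1, 4, 4, 3))
feats_b = rng.normal(size=(1, 4, 4, 3))

gram = gram_matrix(feats_a)
same_style = style_layer_loss(feats_a, feats_a)   # identical features
other_style = style_layer_loss(feats_a, feats_b)  # different features
flat_tv = tv_loss(np.ones((1, 4, 4, 3)))          # constant image
noisy_tv = tv_loss(feats_a)                       # noisy image

print(gram.shape)
print(same_style, flat_tv)
```

Identical features give a style loss of zero and a constant image gives a total variation of zero, which is what the optimizer pushes towards: Gram matrices that match the style image, and a smooth output.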

1. Environment

Python 3.6
TensorFlow 1.2
VGG19

2. Run the following command at the cmd prompt (my images are all under examples/):
(Open the Anaconda Prompt and run activate tensorflow to activate the TensorFlow environment first.)
python neural_style.py --content examples/cat.jpg --styles examples/2-style1.jpg --output examples/y-output.jpg
The style transfer then starts running.

3. Alternatively, see: https://blog.csdn.net/qq_30611601/article/details/79007202
