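Preface

My previous post covered a Keras implementation of Neural Style Transfer. One advantage of Keras is its simple API, which makes it quick and convenient to get a model running. As a learning exercise, I have now implemented the same thing again in Caffe. The overall approach is much the same as before, so I won't repeat it here. For details, see the paper "A Neural Algorithm of Artistic Style" (I have also posted a translation of it).

Code

Without further ado, here is the code.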

log.py

# *_*coding:utf-8 *_*
# author: 许鸿斌
# email: 2775751197@qq.com

import logging
import sys

# Get a logger instance; with no argument this would return the root logger
logger = logging.getLogger('Test')
# Log output format
LOG_FORMAT = "%(filename)s:%(funcName)s:%(asctime)s.%(msecs)03d -- %(message)s"
# formatter = logging.Formatter('%(asctime)s %(levelname)-8s: %(message)s')
formatter = logging.Formatter(LOG_FORMAT)
# File logging
# file_handler = logging.FileHandler("test.log")
# file_handler.setFormatter(formatter)  # setFormatter sets the output format
# Console logging
console_handler = logging.StreamHandler(sys.stdout)
console_handler.formatter = formatter  # the formatter can also be assigned directly
# Attach the handlers to the logger
# logger.addHandler(file_handler)
logger.addHandler(console_handler)
# Minimum level that gets output; the default is WARN
logger.setLevel(logging.INFO)
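
To see what a line produced by this format string looks like, here is a minimal sketch that formats one record into an in-memory buffer. The logger name `demo` and the helper `render_one` are made up for illustration and are not part of log.py. One thing worth knowing: without a `datefmt`, `%(asctime)s` already ends in milliseconds, so `%(asctime)s.%(msecs)03d` prints them twice; the sketch passes a `datefmt` to avoid that.

```python
import io
import logging

LOG_FORMAT = "%(filename)s:%(funcName)s:%(asctime)s.%(msecs)03d -- %(message)s"


def render_one(message, datefmt=None):
    """Format a single INFO record with LOG_FORMAT and return it as text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=datefmt))

    demo_logger = logging.getLogger("demo")  # hypothetical logger name
    demo_logger.propagate = False            # keep the record out of the root logger
    demo_logger.setLevel(logging.INFO)
    demo_logger.addHandler(handler)
    try:
        demo_logger.info(message)
    finally:
        demo_logger.removeHandler(handler)
    return stream.getvalue().strip()


# With an explicit datefmt the timestamp has no built-in milliseconds,
# so the ".%(msecs)03d" part is the only place they appear.
line = render_one("Starting style transfer.", datefmt="%H:%M:%S")
print(line)
```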

style_transfer.py

# *_*coding:utf-8 *_*
# author: 许鸿斌
# email: 2775751197@qq.com

# logging module
from log import logger

# standard library
import argparse
import os
import sys
import timeit
import logging

# import caffe
caffe_root = '/home/xhb/caffe/caffe'
pycaffe_root = os.path.join(caffe_root, 'python')
sys.path.append(pycaffe_root)
import caffe

import numpy as np
import progressbar as pb
from scipy.fftpack import ifftn
from scipy.linalg.blas import sgemm
# scipy.misc.imsave is gone from SciPy >= 1.2; skimage.io.imsave takes the
# same (path, array) arguments
from skimage.io import imsave
from scipy.optimize import minimize
from skimage import img_as_ubyte
from skimage.transform import rescale

# numeric constants
INF = np.float32(np.inf)
STYLE_SCALE = 1.2

# Supported CNN architectures: VGG19, VGG16, GOOGLENET, CAFFENET.
# Each dict names the layers whose feature maps are used as the content
# representation or the style representation, with a weight per layer.
# VGG16 is used by default.
VGG19_WEIGHTS = {"content": {"conv4_2": 1},
                 "style": {"conv1_1": 0.2,
                           "conv2_1": 0.2,
                           "conv3_1": 0.2,
                           "conv4_1": 0.2,
                           "conv5_1": 0.2}}
VGG16_WEIGHTS = {"content": {"conv4_2": 1},
                 "style": {"conv1_1": 0.2,
                           "conv2_1": 0.2,
                           "conv3_1": 0.2,
                           "conv4_1": 0.2,
                           "conv5_1": 0.2}}
GOOGLENET_WEIGHTS = {"content": {"conv2/3x3": 2e-4,
                                 "inception_3a/output": 1-2e-4},
                     "style": {"conv1/7x7_s2": 0.2,
                               "conv2/3x3": 0.2,
                               "inception_3a/output": 0.2,
                               "inception_4a/output": 0.2,
                               "inception_5a/output": 0.2}}
CAFFENET_WEIGHTS = {"content": {"conv4": 1},
                    "style": {"conv1": 0.2,
                              "conv2": 0.2,
                              "conv3": 0.2,
                              "conv4": 0.2,
                              "conv5": 0.2}}

# argparse
parser = argparse.ArgumentParser(description='Neural Style Transfer', usage='xxx.py -s <style.image> -c <content_image>')
parser.add_argument('-s', '--style_img', type=str, required=True, help='Style (art) image')
parser.add_argument('-c', '--content_img', type=str, required=True, help='Content image')
parser.add_argument('-g', '--gpu_id', default=-1, type=int, required=False, help='GPU device number')
parser.add_argument('-m', '--model', default='vgg16', type=str, required=False, help='Which model to use')
parser.add_argument('-i', '--init', default='content', type=str, required=False, help='initialization strategy')
parser.add_argument("-r", "--ratio", default="1e4", type=str, required=False, help="style-to-content ratio")
parser.add_argument("-n", "--num-iters", default=512, type=int, required=False, help="L-BFGS iterations")
parser.add_argument("-l", "--length", default=512, type=float, required=False, help="maximum image length")
parser.add_argument("-v", "--verbose", action="store_true", required=False, help="print minimization outputs")
parser.add_argument("-o", "--output", default=None, required=False, help="output path")


def _compute_style_grad(F, G, G_style, layer):
    """Computes style gradient and loss from activation features."""

    # compute loss and gradient
    (Fl, Gl) = (F[layer], G[layer])
    c = Fl.shape[0]**-2 * Fl.shape[1]**-2
    El = Gl - G_style[layer]
    loss = c/4 * (El**2).sum()
    grad = c * sgemm(1.0, El, Fl) * (Fl>0)

    return loss, grad


def _compute_content_grad(F, F_content, layer):
    """Computes content gradient and loss from activation features."""

    # compute loss and gradient
    Fl = F[layer]
    El = Fl - F_content[layer]
    loss = (El**2).sum() / 2
    grad = El * (Fl>0)

    return loss, grad


def _compute_reprs(net_in, net, layers_style, layers_content, gram_scale=1):
    """Computes representation matrices for an image."""

    # input data and forward pass
    (repr_s, repr_c) = ({}, {})
    net.blobs["data"].data[0] = net_in
    net.forward()

    # loop through combined set of layers
    for layer in set(layers_style)|set(layers_content):
        F = net.blobs[layer].data[0].copy()
        F.shape = (F.shape[0], -1)
        repr_c[layer] = F
        if layer in layers_style:
            repr_s[layer] = sgemm(gram_scale, F, F.T)

    return repr_s, repr_c


def style_optfn(x, net, weights, layers, reprs, ratio):
    """
        Style transfer optimization callback for scipy.optimize.minimize().

        :param numpy.ndarray x:
            Flattened data array.
        :param caffe.Net net:
            Network to use to generate gradients.
        :param dict weights:
            Weights to use in the network.
        :param list layers:
            Layers to use in the network.
        :param tuple reprs:
            Representation matrices packed in a tuple.
        :param float ratio:
            Style-to-content ratio.
    """

    # unpack parameters
    layers_style = weights["style"].keys()      # layers used for style
    layers_content = weights["content"].keys()  # layers used for content
    net_in = x.reshape(net.blobs["data"].data.shape[1:])

    # compute style and content representations
    (G_style, F_content) = reprs
    (G, F) = _compute_reprs(net_in, net, layers_style, layers_content)

    # backpropagate, from the deepest selected layer down to the input
    loss = 0
    net.blobs[layers[-1]].diff[:] = 0
    for i, layer in enumerate(reversed(layers)):
        next_layer = None if i == len(layers)-1 else layers[-i-2]
        grad = net.blobs[layer].diff[0]

        # style contribution
        if layer in layers_style:
            wl = weights["style"][layer]
            (l, g) = _compute_style_grad(F, G, G_style, layer)
            loss += wl * l * ratio
            grad += wl * g.reshape(grad.shape) * ratio

        # content contribution
        if layer in layers_content:
            wl = weights["content"][layer]
            (l, g) = _compute_content_grad(F, F_content, layer)
            loss += wl * l
            grad += wl * g.reshape(grad.shape)

        # compute gradient
        net.backward(start=layer, end=next_layer)
        if next_layer is None:
            grad = net.blobs["data"].diff[0]
        else:
            grad = net.blobs[next_layer].diff[0]

    # format gradient for minimize() function
    grad = grad.flatten().astype(np.float64)

    return loss, grad


class StyleTransfer(object):
    """Style transfer class."""

    def __init__(self, model_name, use_pbar=True):
        """
            Initialize the model used for style transfer.

            :param str model_name:
                Model to use.
            :param bool use_pbar:
                Use progressbar flag.
        """

        style_path = os.path.abspath(os.path.split(__file__)[0])
        base_path = os.path.join(style_path, "models", model_name)

        # Locate each model's structure file and pretrained weights. The mean
        # file holds the ImageNet image mean, which is subtracted from inputs
        # just as it was during training.
        # vgg19
        if model_name == 'vgg19':
            model_file = os.path.join(base_path, 'VGG_ILSVRC_19_layers_deploy.prototxt')
            pretrained_file = os.path.join(base_path, 'VGG_ILSVRC_19_layers.caffemodel')
            mean_file = os.path.join(base_path, 'ilsvrc_2012_mean.npy')
            weights = VGG19_WEIGHTS
        # vgg16
        elif model_name == 'vgg16':
            model_file = os.path.join(base_path, 'VGG_ILSVRC_16_layers_deploy.prototxt')
            pretrained_file = os.path.join(base_path, 'VGG_ILSVRC_16_layers.caffemodel')
            mean_file = os.path.join(base_path, 'ilsvrc_2012_mean.npy')
            weights = VGG16_WEIGHTS
        # googlenet
        elif model_name == 'googlenet':
            model_file = os.path.join(base_path, 'deploy.prototxt')
            pretrained_file = os.path.join(base_path, 'bvlc_googlenet.caffemodel')
            mean_file = os.path.join(base_path, 'ilsvrc_2012_mean.npy')
            weights = GOOGLENET_WEIGHTS
        # caffenet
        elif model_name == 'caffenet':
            model_file = os.path.join(base_path, 'deploy.prototxt')
            pretrained_file = os.path.join(base_path, 'bvlc_reference_caffenet.caffemodel')
            mean_file = os.path.join(base_path, 'ilsvrc_2012_mean.npy')
            weights = CAFFENET_WEIGHTS
        else:
            assert False, 'Model not available'

        # load the model and its layer weights
        self.load_model(model_file, pretrained_file, mean_file)
        self.weights = weights

        # collect the layers used for 'style' or 'content', in network order
        self.layers = []
        for layer in self.net.blobs:
            if layer in self.weights['style'] or layer in self.weights['content']:
                self.layers.append(layer)
        self.use_pbar = use_pbar

        # set up the per-iteration callback
        if self.use_pbar:
            def callback(xk):
                self.grad_iter += 1
                try:
                    self.pbar.update(self.grad_iter)
                except Exception:
                    self.pbar.finished = True
                if self._callback is not None:
                    net_in = xk.reshape(self.net.blobs['data'].data.shape[1:])
                    self._callback(self.transformer.deprocess('data', net_in))
        else:
            def callback(xk):
                if self._callback is not None:
                    net_in = xk.reshape(self.net.blobs['data'].data.shape[1:])
                    self._callback(self.transformer.deprocess('data', net_in))
        self.callback = callback

    def load_model(self, model_file, pretrained_file, mean_file):
        """
            Loads specified model from caffe install (see caffe docs).

            :param str model_file:
                Path to model protobuf.
            :param str pretrained_file:
                Path to pretrained caffe model.
            :param str mean_file:
                Path to mean file.
        """

        # Load the network in caffe. stderr is redirected to /dev/null while
        # loading, which hides the wall of messages caffe prints by default.
        null_fds = os.open(os.devnull, os.O_RDWR)
        out_orig = os.dup(2)
        os.dup2(null_fds, 2)
        net = caffe.Net(str(model_file), str(pretrained_file), caffe.TEST)  # load the model
        os.dup2(out_orig, 2)
        os.close(null_fds)

        # configure the input data format
        transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
        transformer.set_mean('data', np.load(mean_file).mean(1).mean(1))  # per-channel mean
        transformer.set_channel_swap('data', (2, 1, 0))
        transformer.set_transpose('data', (2, 0, 1))
        transformer.set_raw_scale('data', 255)

        self.net = net
        self.transformer = transformer

    def get_generated(self):
        """Returns the generated image (net input, after optimization)."""

        data = self.net.blobs["data"].data
        img_out = self.transformer.deprocess('data', data)
        return img_out

    def _rescale_net(self, img):
        """Rescales the network to fit a particular image."""

        # get new dimensions and rescale net + transformer
        new_dims = (1, img.shape[2]) + img.shape[:2]
        self.net.blobs["data"].reshape(*new_dims)
        self.transformer.inputs["data"] = new_dims

    def _make_noise_input(self, init):
        """Creates an initial input (generated) image."""

        # specify dimensions and create grid in Fourier domain
        dims = tuple(self.net.blobs["data"].data.shape[2:]) + \
               (self.net.blobs["data"].data.shape[1], )  # (height, width, channels)
        grid = np.mgrid[0:dims[0], 0:dims[1]]

        # create frequency representation for pink noise
        Sf = (grid[0] - (dims[0]-1)/2.0) ** 2 + \
             (grid[1] - (dims[1]-1)/2.0) ** 2
        Sf[np.where(Sf == 0)] = 1
        Sf = np.sqrt(Sf)
        Sf = np.dstack((Sf**int(init),)*dims[2])

        # apply ifft to create pink noise and normalize
        ifft_kernel = np.cos(2*np.pi*np.random.randn(*dims)) + \
                      1j*np.sin(2*np.pi*np.random.randn(*dims))
        img_noise = np.abs(ifftn(Sf * ifft_kernel))
        img_noise -= img_noise.min()
        img_noise /= img_noise.max()

        # preprocess the pink noise image
        x0 = self.transformer.preprocess("data", img_noise)

        return x0

    def _create_pbar(self, max_iter):
        """Creates a progress bar."""

        self.grad_iter = 0
        self.pbar = pb.ProgressBar()
        self.pbar.widgets = ["Optimizing: ", pb.Percentage(),
                             " ", pb.Bar(marker=pb.AnimatedMarker()),
                             " ", pb.ETA()]
        self.pbar.maxval = max_iter

    def transfer_style(self, img_style, img_content, length=512, ratio=1e5,
                       n_iter=512, init="-1", verbose=False, callback=None):
        """
            Transfers the style of the artwork to the input image.

            :param numpy.ndarray img_style:
                A style image with the desired target style.
            :param numpy.ndarray img_content:
                A content image in floating point, RGB format.
            :param function callback:
                A callback function, which takes images at iterations.
        """

        # the smaller of the 'data' layer's height and width
        orig_dim = min(self.net.blobs["data"].shape[2:])

        # rescale the images
        scale = max(length / float(max(img_style.shape[:2])),
                    orig_dim / float(min(img_style.shape[:2])))
        img_style = rescale(img_style, STYLE_SCALE*scale)
        scale = max(length / float(max(img_content.shape[:2])),
                    orig_dim / float(min(img_content.shape[:2])))
        img_content = rescale(img_content, scale)

        # compute the style representation
        self._rescale_net(img_style)            # resize the net input to the style image
        layers = self.weights["style"].keys()   # names of the layers used for style
        net_in = self.transformer.preprocess("data", img_style)  # convert to the 'data' layer's format
        # ratio of content to style image size (computed here but not used:
        # gram_scale=1 is passed below)
        gram_scale = float(img_content.size)/img_style.size
        G_style = _compute_reprs(net_in, self.net, layers, [],
                                 gram_scale=1)[0]

        # compute the content representation
        self._rescale_net(img_content)           # resize the net input to the content image
        layers = self.weights["content"].keys()  # names of the layers used for content
        net_in = self.transformer.preprocess("data", img_content)  # convert to the 'data' layer's format
        F_content = _compute_reprs(net_in, self.net, [], layers)[1]

        # Initialize the network input:
        #   numpy array -> treated as an image and used directly;
        #   "content"   -> start from the content image;
        #   "mixed"     -> weighted blend of the content and style images;
        #   otherwise   -> random (pink) noise.
        if isinstance(init, np.ndarray):
            img0 = self.transformer.preprocess("data", init)
        elif init == "content":
            img0 = self.transformer.preprocess("data", img_content)
        elif init == "mixed":
            img0 = 0.95*self.transformer.preprocess("data", img_content) + \
                   0.05*self.transformer.preprocess("data", img_style)
        else:
            img0 = self._make_noise_input(init)

        # compute data bounds
        data_min = -self.transformer.mean["data"][:,0,0]
        data_max = data_min + self.transformer.raw_scale["data"]
        data_bounds = [(data_min[0], data_max[0])] * int(img0.size / 3) + \
                      [(data_min[1], data_max[1])] * int(img0.size / 3) + \
                      [(data_min[2], data_max[2])] * int(img0.size / 3)

        # optimization parameters
        grad_method = "L-BFGS-B"
        reprs = (G_style, F_content)
        minfn_args = {
            "args": (self.net, self.weights, self.layers, reprs, ratio),
            "method": grad_method, "jac": True, "bounds": data_bounds,
            "options": {"maxcor": 8, "maxiter": n_iter, "disp": verbose}
        }

        # run the optimization
        self._callback = callback
        minfn_args["callback"] = self.callback
        if self.use_pbar and not verbose:
            self._create_pbar(n_iter)
            self.pbar.start()
            res = minimize(style_optfn, img0.flatten(), **minfn_args).nit
            self.pbar.finish()
        else:
            res = minimize(style_optfn, img0.flatten(), **minfn_args).nit

        return res


def main(args):
    # set level of logger
    level = logging.INFO if args.verbose else logging.DEBUG
    logger.setLevel(level)
    logger.info('Starting style transfer.')

    # select CPU or GPU mode; CPU is the default
    if args.gpu_id == -1:
        caffe.set_mode_cpu()
        logger.info('Caffe set to CPU mode.')
    else:
        caffe.set_device(args.gpu_id)
        caffe.set_mode_gpu()
        logger.info('Caffe set to GPU {}.'.format(args.gpu_id))

    # load the images
    style_img = caffe.io.load_image(args.style_img)
    content_img = caffe.io.load_image(args.content_img)
    logger.info('Successfully loaded images.')

    # artistic style class
    use_pbar = not args.verbose
    st = StyleTransfer(args.model.lower(), use_pbar=use_pbar)
    # use the configured logger; the root logger (logging.info) has no handler here
    logger.info("Successfully loaded model {0}.".format(args.model))

    # run the style transfer
    start = timeit.default_timer()
    n_iters = st.transfer_style(style_img, content_img, length=args.length,
                                init=args.init, ratio=float(args.ratio),
                                n_iter=args.num_iters, verbose=args.verbose)
    end = timeit.default_timer()
    logger.info("Ran {0} iterations in {1:.0f}s.".format(n_iters, end-start))
    img_out = st.get_generated()

    # build the output path
    if args.output is not None:
        out_path = args.output
    else:
        out_path_fmt = (os.path.splitext(os.path.split(args.content_img)[1])[0],
                        os.path.splitext(os.path.split(args.style_img)[1])[0],
                        args.model, args.init, args.ratio, args.num_iters)
        out_path = "outputs/{0}-{1}-{2}-{3}-{4}-{5}.jpg".format(*out_path_fmt)

    # save the generated stylized image
    imsave(out_path, img_as_ubyte(img_out))
    logger.info("Output saved to {0}.".format(out_path))


if __name__ == '__main__':
    args = parser.parse_args()
    main(args)
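
The per-layer style term in `_compute_style_grad` can be tried out without Caffe. The following is a minimal NumPy sketch, not the script's own code: it uses float64 matrix products instead of `sgemm`, and the names `gram`, `style_loss_and_grad` and `numerical_grad` are introduced here only for illustration. It assumes a feature matrix `F` of shape (channels, positions) and checks the analytic gradient against finite differences.

```python
import numpy as np


def gram(F):
    """Gram matrix of a (channels, positions) feature matrix."""
    return F @ F.T


def style_loss_and_grad(F, G_style):
    """Style loss and its gradient with respect to F for one layer.

    Same form as _compute_style_grad: c = 1 / (N^2 * M^2),
    loss = c/4 * sum((G - G_style)^2), grad = c * (G - G_style) @ F,
    with the gradient zeroed wherever the activation is not positive.
    """
    n, m = F.shape
    c = 1.0 / (n ** 2 * m ** 2)
    E = gram(F) - G_style
    loss = c / 4.0 * np.sum(E ** 2)
    grad = c * (E @ F) * (F > 0)
    return loss, grad


def numerical_grad(F, G_style, eps=1e-6):
    """Central-difference gradient of the loss, used only as a check."""
    num = np.zeros_like(F)
    for idx in np.ndindex(*F.shape):
        Fp, Fm = F.copy(), F.copy()
        Fp[idx] += eps
        Fm[idx] -= eps
        num[idx] = (style_loss_and_grad(Fp, G_style)[0]
                    - style_loss_and_grad(Fm, G_style)[0]) / (2 * eps)
    return num


rng = np.random.default_rng(0)
F = rng.random((3, 5)) + 0.1        # strictly positive, so the mask is all ones
G_style = gram(rng.random((3, 5)))  # stand-in for the style image's Gram matrix
loss, grad = style_loss_and_grad(F, G_style)
max_err = np.abs(grad - numerical_grad(F, G_style)).max()
print("loss = %.6f, max gradient error = %.2e" % (loss, max_err))
```

The `(F > 0)` mask only matters when some activations are zero, as they are after a ReLU; the toy input is kept strictly positive so the finite-difference check is meaningful.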

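Additional notes

A few more points are worth mentioning:

Caffe path

pycaffe must be compiled first, and caffe_root must point to the root directory of your Caffe checkout.

# import caffe
caffe_root = '/home/xhb/caffe/caffe'  # change this to your own Caffe root directory
pycaffe_root = os.path.join(caffe_root, 'python')
sys.path.append(pycaffe_root)
import caffe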
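
If `import caffe` fails, it can help to check the path before importing. This is a small sketch under the assumption that a built pycaffe lives in `<caffe_root>/python/caffe`; the helper name `add_pycaffe_to_path` is hypothetical and not part of the script.

```python
import os
import sys


def add_pycaffe_to_path(caffe_root):
    """Append <caffe_root>/python to sys.path if it holds a caffe package.

    Returns True when the directory <caffe_root>/python/caffe exists,
    False otherwise (sys.path is left untouched in that case).
    """
    pycaffe_root = os.path.join(caffe_root, "python")
    package_dir = os.path.join(pycaffe_root, "caffe")
    if not os.path.isdir(package_dir):
        return False
    if pycaffe_root not in sys.path:
        sys.path.append(pycaffe_root)
    return True


# The path below is the one used in the post; replace it with your own.
if add_pycaffe_to_path("/home/xhb/caffe/caffe"):
    print("pycaffe directory found; 'import caffe' should now be possible")
else:
    print("no python/caffe directory under caffe_root; check the path and the pycaffe build")
```

This only checks that the directory exists; it does not verify that the compiled extension inside it was built successfully.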

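Model files

The script relies on model files pretrained on ImageNet. They are all kept under the models directory, in one subdirectory per model.

The Baidu Cloud link is at the end of this post; download from there as needed.

Images

Any test images will do, as long as they are at a path the script can find.
Here are a few images I found online:
Content image

Style image

The generated stylized image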
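
Before running, it may be worth confirming that every file the script looks for is in place. The sketch below lists the file names that `StyleTransfer.__init__` builds for each model under `models/<model_name>/` and reports the ones that are missing. The names `EXPECTED_FILES` and `missing_model_files` are made up for this example.

```python
import os

# File names taken from StyleTransfer.__init__ in style_transfer.py
EXPECTED_FILES = {
    "vgg19": ["VGG_ILSVRC_19_layers_deploy.prototxt",
              "VGG_ILSVRC_19_layers.caffemodel",
              "ilsvrc_2012_mean.npy"],
    "vgg16": ["VGG_ILSVRC_16_layers_deploy.prototxt",
              "VGG_ILSVRC_16_layers.caffemodel",
              "ilsvrc_2012_mean.npy"],
    "googlenet": ["deploy.prototxt",
                  "bvlc_googlenet.caffemodel",
                  "ilsvrc_2012_mean.npy"],
    "caffenet": ["deploy.prototxt",
                 "bvlc_reference_caffenet.caffemodel",
                 "ilsvrc_2012_mean.npy"],
}


def missing_model_files(models_dir, model_name):
    """Return the expected files that are absent from models_dir/model_name."""
    base = os.path.join(models_dir, model_name)
    return [name for name in EXPECTED_FILES[model_name]
            if not os.path.isfile(os.path.join(base, name))]


# Assumes the script is run from the project root, next to the models directory.
for model in sorted(EXPECTED_FILES):
    missing = missing_model_files("models", model)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print("{0}: {1}".format(model, status))
```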

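Running the script

python style_transfer.py -s <path to style image> -c <path to content image>

Other parameters can also be configured, but the default values are usually sufficient.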
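
When `-o` is not given, the script derives the output file name from the other arguments. The sketch below reproduces that naming rule so you can predict where the result will land; the function name `default_output_path` and the example image paths are hypothetical, while the defaults are the ones declared by the argument parser. Note that the script as shown writes into `outputs/` without creating it, so that directory has to exist beforehand.

```python
import os


def default_output_path(content_img, style_img, model="vgg16",
                        init="content", ratio="1e4", num_iters=512):
    """Build the output path the way main() does when --output is omitted."""
    content_name = os.path.splitext(os.path.basename(content_img))[0]
    style_name = os.path.splitext(os.path.basename(style_img))[0]
    return "outputs/{0}-{1}-{2}-{3}-{4}-{5}.jpg".format(
        content_name, style_name, model, init, ratio, num_iters)


# Hypothetical image paths, default values for every other option
print(default_output_path("images/cat.jpg", "images/starry_night.jpg"))
# prints: outputs/cat-starry_night-vgg16-content-1e4-512.jpg
```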

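Afterword

This is for learning and discussion only. If you need anything, please send me a private message. If comments are closed on a post, please don't leave your comment somewhere unrelated; message me directly instead. Important things are worth saying twice! (o´ω`o)

Full project:
Link: https://pan.baidu.com/s/1O11yEuAn4vRdBUMXW8djkQ  Password: sto6
The caffemodel files are large, so they are not included in the archive and need to be downloaded separately.
Pretrained weights:
googlenet: http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel
caffenet (an AlexNet variant): http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel
vgg16: http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
vgg19: http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel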
