

  • Implement the neural style transfer algorithm
  • Generate novel artistic images using the algorithm


In [2]:

cd /home/kesci/input/deeplearning122839

In [3]:

import os
import sys
import scipy.io
import scipy.misc
import imageio
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow._api.v2.compat.v1 as tf
%matplotlib inline

1 Problem Statement




2 Transfer Learning




In [4]:

model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
{'input': <tf.Variable 'Variable:0' shape=(1, 300, 400, 3) dtype=float32_ref>, 'conv1_1': <tf.Tensor 'Relu:0' shape=(1, 300, 400, 64) dtype=float32>, 'conv1_2': <tf.Tensor 'Relu_1:0' shape=(1, 300, 400, 64) dtype=float32>, 'avgpool1': <tf.Tensor 'AvgPool:0' shape=(1, 150, 200, 64) dtype=float32>, 'conv2_1': <tf.Tensor 'Relu_2:0' shape=(1, 150, 200, 128) dtype=float32>, 'conv2_2': <tf.Tensor 'Relu_3:0' shape=(1, 150, 200, 128) dtype=float32>, 'avgpool2': <tf.Tensor 'AvgPool_1:0' shape=(1, 75, 100, 128) dtype=float32>, 'conv3_1': <tf.Tensor 'Relu_4:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_2': <tf.Tensor 'Relu_5:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_3': <tf.Tensor 'Relu_6:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_4': <tf.Tensor 'Relu_7:0' shape=(1, 75, 100, 256) dtype=float32>, 'avgpool3': <tf.Tensor 'AvgPool_2:0' shape=(1, 38, 50, 256) dtype=float32>, 'conv4_1': <tf.Tensor 'Relu_8:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_2': <tf.Tensor 'Relu_9:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_3': <tf.Tensor 'Relu_10:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_4': <tf.Tensor 'Relu_11:0' shape=(1, 38, 50, 512) dtype=float32>, 'avgpool4': <tf.Tensor 'AvgPool_3:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_1': <tf.Tensor 'Relu_12:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_2': <tf.Tensor 'Relu_13:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_3': <tf.Tensor 'Relu_14:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_4': <tf.Tensor 'Relu_15:0' shape=(1, 19, 25, 512) dtype=float32>, 'avgpool5': <tf.Tensor 'AvgPool_4:0' shape=(1, 10, 13, 512) dtype=float32>}

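The model is stored in a Python dictionary, where each variable name is a key and the corresponding value is a tensor holding that variable's value. To run an image through this network, you just feed the image to the model. In TensorFlow, you can do this with tf.assign. In particular, you will call assign on the model's input, as in model["input"].assign(image).

This assigns the image as the input to the model. After that, if you want to access the activations of a particular layer, say layer 4_2, when the network is run on this image, you run a TensorFlow session on the corresponding tensor conv4_2, as in sess.run(model["conv4_2"]).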


3 Neural Style Transfer


  • Build the content cost function $J_{content}(C,G)$;
  • Build the style cost function $J_{style}(S,G)$;
  • Put them together to get $J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$.

3.1 Computing the Content Cost


In [5]:

content_image = imageio.imread("./week4/images/louvre.jpg")


3.1.1 How do you ensure that the generated image G matches the content of image C?



So, suppose you have picked one particular hidden layer to use. Now set image C as the input to the pretrained VGG network and run forward propagation. Let $a^{(C)}$ be the hidden layer activations in the layer you chose. (In lecture, we wrote this as $a^{[l](C)}$, but here we drop the superscript $[l]$ to simplify the notation.) This will be an $n_H \times n_W \times n_C$ tensor. Repeat this process with image G: set G as the input and run forward propagation. Let $a^{(G)}$ be the corresponding hidden layer activations. We define the content cost function as:

$$J_{content}(C,G) = \frac{1}{4 \times n_H \times n_W \times n_C}\sum_{\text{all entries}} (a^{(C)} - a^{(G)})^2\tag{1}$$

Here, $n_H$, $n_W$ and $n_C$ are the height, width and number of channels of the hidden layer you chose, and they appear in the normalization term of the cost. Note that $a^{(C)}$ and $a^{(G)}$ are the volumes corresponding to the hidden layer's activations. To compute the cost $J_{content}(C,G)$, it is often more convenient to unroll these 3D volumes into 2D matrices. (Technically, this unrolling step is not needed to compute $J_{content}$, but it is good practice for when you need to carry out a similar operation later to compute the style cost $J_{style}$.)
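A minimal NumPy sketch of equation (1), assuming random arrays stand in for the activations $a^{(C)}$ and $a^{(G)}$. The function name `content_cost_np` and the 4 x 4 x 3 shape are illustrative and not part of the notebook. The sketch also checks that unrolling the 3D volume into a 2D matrix leaves the cost unchanged:

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost of equation (1) for activations of shape (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a_G.shape
    # Unroll each 3D volume into a 2D matrix of shape (n_H * n_W, n_C).
    a_C_unrolled = a_C.reshape(n_H * n_W, n_C)
    a_G_unrolled = a_G.reshape(n_H * n_W, n_C)
    return np.sum((a_C_unrolled - a_G_unrolled) ** 2) / (4 * n_H * n_W * n_C)

rng = np.random.default_rng(0)        # arbitrary seed, for repeatability only
a_C = rng.normal(size=(4, 4, 3))      # stand-in for the content activations
a_G = rng.normal(size=(4, 4, 3))      # stand-in for the generated-image activations

J_unrolled = content_cost_np(a_C, a_G)
J_direct = np.sum((a_C - a_G) ** 2) / (4 * a_G.size)   # same sum, no unrolling

print(J_unrolled, J_direct)           # the two values agree
print(content_cost_np(a_C, a_C))      # identical activations give 0.0
```

Because the sum runs over all entries, the reshape only changes the layout of the data, not the value of the cost.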



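  1. Retrieve dimensions from a_G:
    - To retrieve dimensions from a tensor X, use: X.get_shape().as_list()
  2. Unroll a_C and a_G into 2D matrices
    - If you get stuck, take a look at Hint1 and Hint2.
  3. Compute the content cost:
    - If you get stuck, take a look at Hint3, Hint4 and Hint5.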

In [6]:

# GRADED FUNCTION: compute_content_cost
def compute_content_cost(a_C, a_G):
    """
    Computes the content cost

    Arguments:
    a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image C
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image G

    Returns:
    J_content -- scalar that you compute using equation 1 above.
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape a_C and a_G (≈2 lines)
    a_C_unrolled = tf.reshape(a_C, shape=(n_H * n_W, n_C))
    a_G_unrolled = tf.reshape(a_G, shape=(n_H * n_W, n_C))

    # compute the cost with tensorflow (≈1 line)
    J_content = tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled))) / (4 * n_H * n_W * n_C)
    ### END CODE HERE ###

    return J_content

In [7]:

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_content = compute_content_cost(a_C, a_G)
    print("J_content = " + str(J_content.eval()))
J_content = 6.7655935


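  • The content cost takes a hidden layer activation of the neural network and measures how different $a^{(C)}$ and $a^{(G)}$ are.
  • When we minimize the content cost later, this will help make sure that $G$ has content similar to $C$.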

3.2 Computing the Style Cost


In [8]:

style_image = scipy.misc.imread("images/monet_800600.jpg")


Let's see how to define the "style" cost function $J_{style}(S,G)$.

3.2.1 Style Matrix

The style matrix is also called a "Gram matrix". In linear algebra, the Gram matrix G of a set of vectors $(v_{1},\dots ,v_{n})$ is the matrix of dot products, whose entries are $G_{ij} = v_{i}^T v_{j} = np.dot(v_{i}, v_{j})$. In other words, $G_{ij}$ compares how similar $v_i$ is to $v_j$: if they are highly similar, you would expect them to have a large dot product, and thus $G_{ij}$ to be large.
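As a small illustration of this definition, the sketch below builds the Gram matrix of three hand-picked vectors. The vectors are made up for this example: `v1` and `v2` point in nearly the same direction, while `v3` is orthogonal to `v1`.

```python
import numpy as np

# Three illustrative vectors, chosen by hand for this example.
v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([1.1, 1.9, 0.0])    # nearly parallel to v1
v3 = np.array([-2.0, 1.0, 0.0])   # orthogonal to v1

V = np.stack([v1, v2, v3])        # one vector per row, shape (3, 3)
G = V @ V.T                       # G[i, j] = np.dot(V[i], V[j])

print(G)
```

The entry for the similar pair, `G[0, 1]`, is large, while the entry for the orthogonal pair, `G[0, 2]`, is zero. Note that a dot product also grows with vector length, so it measures similarity only loosely.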



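The result is a matrix of dimension $(n_C,n_C)$, where $n_C$ is the number of filters. The value $G_{ij}$ measures how similar the activations of filter i are to the activations of filter j.

One important part of the Gram matrix is that the diagonal elements such as $G_{ii}$ also measure how active filter i is. For example, suppose filter i is detecting vertical textures in the image. Then $G_{ii}$ measures how common vertical textures are in the image as a whole: if $G_{ii}$ is large, the image has a lot of vertical texture.

By capturing the prevalence of different types of features ($G_{ii}$), as well as how much different features occur together ($G_{ij}$), the style matrix $G$ measures the style of an image.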
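The same idea applied to a layer's activations can be sketched in NumPy. The random 4 x 4 x 3 array below is only a stand-in for real activations, and the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_H, n_W, n_C = 4, 4, 3
a = rng.normal(size=(n_H, n_W, n_C))   # stand-in for one layer's activations

# Unroll so that each row holds every activation of one filter: (n_C, n_H * n_W).
A = a.reshape(n_H * n_W, n_C).T
G = A @ A.T                            # style (Gram) matrix, shape (n_C, n_C)

# Diagonal entry G[i, i] is the summed squared activation of filter i.
filter_energy = np.sum(a ** 2, axis=(0, 1))

print(G.shape)
print(np.allclose(np.diag(G), filter_energy))
```

The diagonal matches the per-filter sum of squared activations, which is the sense in which $G_{ii}$ measures how active filter i is.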

Using TensorFlow, implement a function that computes the Gram matrix of a matrix A. The formula is: the Gram matrix of A is $G_A = AA^T$. If you get stuck, take a look at Hint 1 and Hint 2.

In [9]:

# GRADED FUNCTION: gram_matrix
def gram_matrix(A):
    """
    Argument:
    A -- matrix of shape (n_C, n_H*n_W)

    Returns:
    GA -- Gram matrix of A, of shape (n_C, n_C)
    """
    ### START CODE HERE ### (≈1 line)
    GA = tf.matmul(A, tf.transpose(A))
    ### END CODE HERE ###

    return GA

In [10]:

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    A = tf.random_normal([3, 2*1], mean=1, stddev=4)
    GA = gram_matrix(A)
    print("GA = " + str(GA.eval()))
GA = [[ 6.422305 -4.429122 -2.096682]
 [-4.429122 19.465837 19.563871]
 [-2.096682 19.563871 20.686462]]

3.2.2 Style Cost

After generating the style matrix (Gram matrix), your goal is to minimize the distance between the Gram matrix of the "style" image S and that of the "generated" image G. For now, we use only a single hidden layer $a^{[l]}$, and the corresponding style cost for this layer is defined as:

$$J_{style}^{[l]}(S,G) = \frac{1}{4 \times {n_C}^2 \times (n_H \times n_W)^2} \sum_{i=1}^{n_C}\sum_{j=1}^{n_C}(G^{(S)}_{ij} - G^{(G)}_{ij})^2\tag{2}$$

where $G^{(S)}$ and $G^{(G)}$ are respectively the Gram matrices of the "style" image and the "generated" image, computed using the activations of a particular hidden layer in the network.
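A NumPy sketch of equation (2), again with random arrays standing in for real activations. The helper names `gram_np` and `layer_style_cost_np` are hypothetical. The last lines illustrate one property of this cost: shuffling the spatial positions of the activations leaves the Gram matrix, and hence the style cost, unchanged.

```python
import numpy as np

def gram_np(a):
    """Gram matrix of activations a with shape (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C).T    # shape (n_C, n_H * n_W)
    return A @ A.T

def layer_style_cost_np(a_S, a_G):
    """Style cost of equation (2) for a single layer."""
    n_H, n_W, n_C = a_G.shape
    GS, GG = gram_np(a_S), gram_np(a_G)
    return np.sum((GS - GG) ** 2) / (4 * n_C ** 2 * (n_H * n_W) ** 2)

rng = np.random.default_rng(1)
a_S = rng.normal(size=(4, 4, 3))
a_G = rng.normal(size=(4, 4, 3))

# Shuffle the 16 spatial positions while keeping each channel vector intact.
a_S_shuffled = rng.permutation(a_S.reshape(-1, 3)).reshape(4, 4, 3)

print(layer_style_cost_np(a_S, a_G))           # positive for unrelated activations
print(layer_style_cost_np(a_S, a_S_shuffled))  # numerically zero
```

This is one way to see why the Gram matrix captures texture-like statistics rather than the layout of the image.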



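  1. Retrieve dimensions from the hidden layer activations a_G:
    - To retrieve dimensions from a tensor X, use: X.get_shape().as_list()
  2. Unroll the hidden layer activations a_S and a_G into 2D matrices.
    - You may find Hint1 and Hint2 useful.
  3. Compute the style matrices of the images S and G. (Use the function you wrote previously.)
  4. Compute the style cost:
    - You may find Hint3, Hint4 and Hint5 useful.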

In [11]:

# GRADED FUNCTION: compute_layer_style_cost
def compute_layer_style_cost(a_S, a_G):
    """
    Arguments:
    a_S -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image S
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image G

    Returns:
    J_style_layer -- tensor representing a scalar value, style cost defined above by equation (2)
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape the images to shape (n_H*n_W, n_C); they are transposed to (n_C, n_H*n_W) below (≈2 lines)
    a_S = tf.reshape(a_S, shape=(n_H * n_W, n_C))
    a_G = tf.reshape(a_G, shape=(n_H * n_W, n_C))

    # Computing gram_matrices for both images S and G (≈2 lines)
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))

    # Computing the loss (≈1 line)
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * (n_C * n_C) * (n_W * n_H) * (n_W * n_H))
    ### END CODE HERE ###

    return J_style_layer

In [12]:

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_S = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_style_layer = compute_layer_style_cost(a_S, a_G)
    print("J_style_layer = " + str(J_style_layer.eval()))
J_style_layer = 9.190278

3.2.3 Style Weights


In [13]:

STYLE_LAYERS = [('conv1_1', 0.2),('conv2_1', 0.2),('conv3_1', 0.2),('conv4_1', 0.2),('conv5_1', 0.2)]

$$J_{style}(S,G) = \sum_{l} \lambda^{[l]} J^{[l]}_{style}(S,G)$$

The values of $\lambda^{[l]}$ are given in STYLE_LAYERS.
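A small sketch of this weighted sum in plain Python. The per-layer costs in `layer_costs` are made-up numbers for illustration; in the notebook they would come from the per-layer style cost of equation (2).

```python
# Layer weights as given in STYLE_LAYERS.
STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2),
]

# Hypothetical per-layer style costs, invented for this illustration.
layer_costs = {
    'conv1_1': 5.0,
    'conv2_1': 4.0,
    'conv3_1': 3.0,
    'conv4_1': 2.0,
    'conv5_1': 1.0,
}

J_style = 0.0
for layer_name, coeff in STYLE_LAYERS:
    # Each layer contributes its cost scaled by its weight lambda^[l].
    J_style += coeff * layer_costs[layer_name]

print(J_style)
```

With five equal weights of 0.2, the overall style cost is simply the average of the five layer costs. Unequal weights would let some layers count for more than others.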


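  2. Loop over (layer_name, coeff) from STYLE_LAYERS:
    a. Select the output tensor of the current layer. For example, to get the tensor from layer "conv1_1", you would do: out = model["conv1_1"]
    b. Get the style of the style image from the current layer by running the session on the tensor "out"
    c. Get a tensor representing the style of the generated image from the current layer. It is just "out".
    d. Now that you have both styles, use the function you implemented above to compute the style_cost for the current layer
    e. Add (style_cost x coeff) of the current layer to the overall style cost (J_style)
  3. Return J_style, which should now be the sum of (style_cost x coeff) over all layers.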

In [14]:

def compute_style_cost(model, STYLE_LAYERS):
    """
    Computes the overall style cost from several chosen layers

    Arguments:
    model -- our tensorflow model
    STYLE_LAYERS -- A python list containing:
                        - the names of the layers we would like to extract style from
                        - a coefficient for each of them

    Returns:
    J_style -- tensor representing a scalar value, style cost defined above by equation (2)
    """

    # initialize the overall style cost
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:

        # Select the output tensor of the currently selected layer
        out = model[layer_name]

        # Set a_S to be the hidden layer activation from the layer we have selected, by running the session on out
        a_S = sess.run(out)

        # Set a_G to be the hidden layer activation from same layer. Here, a_G references model[layer_name]
        # and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
        # when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
        a_G = out

        # Compute style_cost for the current layer
        J_style_layer = compute_layer_style_cost(a_S, a_G)

        # Add coeff * J_style_layer of this layer to overall style cost
        J_style += coeff * J_style_layer

    return J_style



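  • The style of an image can be represented using the Gram matrix of a hidden layer's activations. However, we get even better results by combining this representation from multiple different layers. This is in contrast to the content representation, where using just a single hidden layer is usually sufficient.
  • Minimizing the style cost will cause the image G to follow the style of the image S.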

3.3 Defining the Total Cost to Optimize

$$J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$$


In [15]:

# GRADED FUNCTION: total_cost
def total_cost(J_content, J_style, alpha = 10, beta = 40):
    """
    Computes the total cost function

    Arguments:
    J_content -- content cost coded above
    J_style -- style cost coded above
    alpha -- hyperparameter weighting the importance of the content cost
    beta -- hyperparameter weighting the importance of the style cost

    Returns:
    J -- total cost as defined by the formula above.
    """
    ### START CODE HERE ### (≈1 line)
    J = alpha * J_content + beta * J_style
    ### END CODE HERE ###

    return J

In [16]:

tf.reset_default_graph()

with tf.Session() as test:
    np.random.seed(3)
    J_content = np.random.randn()
    J_style = np.random.randn()
    J = total_cost(J_content, J_style)
    print("J = " + str(J))
J = 35.34667875478276


  • The total cost is a linear combination of the content cost $J_{content}(C,G)$ and the style cost $J_{style}(S,G)$
  • $\alpha$ and $\beta$ are hyperparameters that control the relative weighting between content and style
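To make the role of the two hyperparameters concrete, here is a short worked example in plain Python. The cost values 2.0 and 0.5 are made up to keep the arithmetic easy, and `total_cost_sketch` is an illustrative stand-in for the graded function, not a replacement for it.

```python
def total_cost_sketch(J_content, J_style, alpha=10, beta=40):
    """Linear combination J = alpha * J_content + beta * J_style."""
    return alpha * J_content + beta * J_style

# Made-up cost values, chosen only to make the arithmetic easy to follow.
J_content, J_style = 2.0, 0.5

J_default = total_cost_sketch(J_content, J_style)                     # 10*2.0 + 40*0.5
J_more_style = total_cost_sketch(J_content, J_style, alpha=10, beta=80)

print(J_default)       # 40.0
print(J_more_style)    # 60.0
```

Doubling beta doubles the style term's share of the total, so the optimizer has more to gain from reducing the style cost than the content cost.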

4 Solving the Optimization Problem



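  1. Create an interactive session
  2. Load the content image
  3. Load the style image
  4. Randomly initialize the image to be generated
  5. Load the VGG-19 model
  6. Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content cost
    • Run the style image through the VGG-19 model and compute the style cost
    • Compute the total cost
    • Define the optimizer and the learning rate
  7. Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step.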
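The key idea in the last step is that the optimizer updates the pixels of the generated image rather than any network weights. The toy NumPy sketch below shows this under strong simplifying assumptions: a mean squared pixel difference replaces the VGG-based costs, and plain gradient descent replaces Adam. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.uniform(0.0, 1.0, size=(8, 8, 3))   # stand-in "content" target
G = rng.uniform(0.0, 1.0, size=(8, 8, 3))   # randomly initialized generated image

def cost(G):
    """Toy cost: mean squared pixel difference (no VGG features involved)."""
    return np.mean((G - C) ** 2)

def grad(G):
    """Gradient of the toy cost with respect to the pixels of G."""
    return 2.0 * (G - C) / G.size

learning_rate = 50.0                        # illustrative value for this toy problem
initial_cost = cost(G)
for step in range(100):
    G = G - learning_rate * grad(G)         # update the image, not any weights

final_cost = cost(G)
print(initial_cost, final_cost)
```

In the notebook the same loop structure appears, except that the cost is $J(G)$ built from VGG activations and TensorFlow computes the gradients.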


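You have previously implemented the total cost $J(G)$. We will now set up TensorFlow to optimize it with respect to G. To do so, your program has to reset the graph and use an "Interactive Session". Unlike a regular session, an interactive session installs itself as the default session to build a graph. This lets you run variables without constantly needing to refer to the session object, which simplifies the code.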


In [17]:

# Reset the graph
tf.reset_default_graph()

# Start interactive session
sess = tf.InteractiveSession()


In [18]:

content_image = scipy.misc.imread("images/louvre_small.jpg")
content_image = reshape_and_normalize_image(content_image)


In [19]:

style_image = scipy.misc.imread("images/monet.jpg")
style_image = reshape_and_normalize_image(style_image)


In [20]:

generated_image = generate_noise_image(content_image)


In [21]:

model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")



In [22]:

# Assign the content image to be the input of the VGG model.
sess.run(model['input'].assign(content_image))

# Select the output tensor of layer conv4_2
out = model['conv4_2']

# Set a_C to be the hidden layer activation from the layer we have selected
a_C = sess.run(out)

# Set a_G to be the hidden layer activation from same layer. Here, a_G references model['conv4_2']
# and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
# when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
a_G = out

# Compute the content cost
J_content = compute_content_cost(a_C, a_G)


In [23]:

# Assign the input of the model to be the "style" image
sess.run(model['input'].assign(style_image))

# Compute the style cost
J_style = compute_style_cost(model, STYLE_LAYERS)

Exercise: Now that you have J_content and J_style, compute the total cost J by calling total_cost(). Use alpha = 10 and beta = 40.

In [24]:

### START CODE HERE ### (1 line)
J = total_cost(J_content, J_style, alpha = 10, beta = 40)

You previously learned how to set up the Adam optimizer in TensorFlow. Here we use a learning rate of 2.0.
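For intuition about why such a large learning rate can work, here is a NumPy sketch of a single Adam update. It uses the commonly cited default values beta1 = 0.9, beta2 = 0.999 and eps = 1e-8, which are assumed here. This is a hand-written illustration of the update rule, not the code inside tf.train.AdamOptimizer.

```python
import numpy as np

def adam_step(x, g, m, v, t, lr=2.0, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters x with gradient g at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * g          # running mean of gradients
    v = beta2 * v + (1 - beta2) * g ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

# Pixel-like values with gradients of very different magnitudes.
x = np.array([100.0, -50.0, 3.0])
g = np.array([1e6, -1e-3, 5.0])

x_new, m, v = adam_step(x, g, m=np.zeros(3), v=np.zeros(3), t=1)
print(x_new - x)   # each entry moves by about lr, opposite to its gradient's sign
```

Because Adam normalizes each step by the gradient's running magnitude, the first update changes every entry by roughly the learning rate regardless of how large the raw gradient is. With pixel values spanning a range of a couple hundred, a step of about 2.0 per iteration is a plausible scale.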

In [25]:

# define optimizer (1 line)
optimizer = tf.train.AdamOptimizer(2.0)

# define train_step (1 line)
train_step = optimizer.minimize(J)


In [26]:

def model_nn(sess, input_image, num_iterations = 200):

    # Initialize global variables (you need to run the session on the initializer)
    ### START CODE HERE ### (1 line)
    sess.run(tf.global_variables_initializer())
    ### END CODE HERE ###

    # Run the noisy input image (initial generated image) through the model. Use assign().
    ### START CODE HERE ### (1 line)
    generated_image = sess.run(model['input'].assign(input_image))
    ### END CODE HERE ###

    for i in range(num_iterations):

        # Run the session on the train_step to minimize the total cost
        ### START CODE HERE ### (1 line)
        sess.run(train_step)
        ### END CODE HERE ###

        # Compute the generated image by running the session on the current model['input']
        ### START CODE HERE ### (1 line)
        generated_image = sess.run(model['input'])
        ### END CODE HERE ###

        # Print every 20 iteration.
        if i%20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            # save current generated image in the "/output" directory
            save_image("output/" + str(i) + ".png", generated_image)

    # save last generated image
    save_image('output/generated_image.jpg', generated_image)

    return generated_image


In [27]:

model_nn(sess, generated_image)
Iteration 0 :
total cost = 5050363000.0
content cost = 7877.685
style cost = 126257096.0
Iteration 20 :
total cost = 943329150.0
content cost = 15185.644
style cost = 23579432.0
Iteration 40 :
total cost = 484951500.0
content cost = 16785.02
style cost = 12119591.0
Iteration 60 :
total cost = 312597280.0
content cost = 17465.904
style cost = 7810565.5
Iteration 80 :
total cost = 228104380.0
content cost = 17716.314
style cost = 5698180.5
Iteration 100 :
total cost = 180687200.0
content cost = 17897.129
style cost = 4512705.5
Iteration 120 :
total cost = 150027140.0
content cost = 18027.883
style cost = 3746171.5
Iteration 140 :
total cost = 127800170.0
content cost = 18183.773
style cost = 3190458.2
Iteration 160 :
total cost = 110766376.0
content cost = 18345.64
style cost = 2764573.0
Iteration 180 :
total cost = 97408580.0
content cost = 18485.322
style cost = 2430593.0


array([[[[-4.7703003e+01, -6.1550846e+01,  4.8832047e+01],[-2.6217243e+01, -4.0605331e+01,  2.7117752e+01],[-4.1859707e+01, -2.9068708e+01,  1.1424817e+01],...,[-2.6815302e+01, -9.4887362e+00,  1.4312043e+01],[-3.0232523e+01, -2.8368409e+00,  2.4125320e+01],[-4.2427952e+01, -4.0595369e+00,  4.9285820e+01]],[[-6.1166412e+01, -5.1801685e+01,  2.5211609e+01],[-3.3077774e+01, -3.1084211e+01, -1.4618548e+00],[-2.7152342e+01, -3.0673979e+01,  1.5284797e+01],...,[-2.6783518e+01, -5.2329774e+00,  2.5989519e+01],[-2.1495461e+01, -1.6955070e+01,  1.3927784e+01],[-4.0914680e+01, -6.1364961e+00,  9.5178242e+00]],[[-5.2238056e+01, -5.1577118e+01,  1.3660212e+01],[-3.7201691e+01, -4.1442902e+01, -6.2615275e+00],[-3.4184612e+01, -2.5266361e+01,  7.4517899e+00],...,[-1.0611511e+01, -3.7342052e+01,  1.2602094e+01],[-1.2320594e+01, -2.1005487e+01,  1.7151941e+01],[-2.2562677e+01, -1.8515341e+01,  1.4285409e+01]],...,[[-4.9188065e+01, -5.5147198e+01, -3.7267464e+01],[-9.9028038e+01, -7.8261818e+01, -2.6933994e+02],[-7.6436264e+01, -7.2988701e+01, -1.4288957e+02],...,[-6.9915497e+01, -7.0040161e+01, -2.9220118e+01],[-7.9575409e+01, -8.7896751e+01, -2.2540152e+01],[ 1.4976151e+00, -3.9532539e+01,  2.4007713e+01]],[[ 1.6301501e-01, -7.5239571e+01,  1.4685207e+01],[-1.7506450e+02, -1.0348013e+02, -3.0651636e+01],[ 6.6081967e+00, -7.1792625e+01, -1.9930256e+01],...,[-9.5817398e+01, -8.4152191e+01, -4.7578541e+01],[-1.0253343e+02, -1.0277041e+02, -5.9465641e+01],[-6.5707291e+01, -9.5707497e+01,  1.8698233e+00]],[[ 5.0412319e+01, -2.1662691e+01,  5.3101387e+01],[ 3.1803602e+01, -8.5167076e+01,  2.6759926e+01],[ 3.0552992e+01, -4.0452545e+01,  1.7949617e+01],...,[-9.9841835e+01, -1.0824145e+02, -1.7334482e+01],[-1.1805286e+02, -1.4513156e+02, -2.8111979e+01],[-2.5438595e+01, -1.0570927e+02,  2.0523670e+01]]]],dtype=float32)






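  • The beautiful ruins of the ancient city of Persepolis (Iran) in the style of Van Gogh (The Starry Night)

  • The tomb of Cyrus the Great in the style of Ispahan ceramics

  • A scientific study of a turbulent fluid in the style of an abstract blue fluid painting.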

5 Test with Your Own Image (Optional Exercise)



  1. Click "File -> Open" in the upper tab of the notebook

  2. Go to "/images" and upload your images (requirement: (WIDTH = 300, HEIGHT = 225)), renaming them, for example, to "my_content.jpg" and "my_style.jpg"

  3. Change the code in part (3.4) from:

    content_image = scipy.misc.imread("images/louvre.jpg")
    style_image = scipy.misc.imread("images/claude-monet.jpg")

  to:

    content_image = scipy.misc.imread("images/my_content.jpg")
    style_image = scipy.misc.imread("images/my_style.jpg")
  4. Rerun the cells (you may need to restart the Kernel from the upper tab of the notebook).


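  • Which layers are responsible for representing the style? STYLE_LAYERS
  • How many iterations do you want to run the algorithm for? num_iterations
  • What is the relative weighting between content and style? alpha / beta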

6 Conclusion



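  • Neural style transfer is an algorithm that, given a content image C and a style image S, can generate an artistic image
  • It uses features (hidden layer activations) from a pretrained ConvNet.
  • The content cost function is computed using the activations of one hidden layer.
  • The style cost function for one layer is computed using the Gram matrix of that layer's activations. The overall style cost function is obtained by using several hidden layers.
  • Optimizing the total cost function results in synthesizing new images.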

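This was the final programming exercise of this course. Congratulations, you have finished all the programming exercises of this course on convolutional networks! We hope to see you again in Course 5, on sequence models.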


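The neural style transfer algorithm is due to Gatys et al. (2015). Harish Narayanan and GitHub user "log0" have also written very readable write-ups from which we drew inspiration. The pretrained network used in this implementation is a VGG network, which is due to Simonyan and Zisserman (2015). The pretrained weights are from the work of the MatConvNet team.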

  • Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, (2015). A Neural Algorithm of Artistic Style (https://arxiv.org/abs/1508.06576)
  • Harish Narayanan, Convolutional neural networks for artistic style transfer. https://harishnarayanan.org/writing/artistic-style-transfer/
  • Log0, TensorFlow Implementation of “A Neural Algorithm of Artistic Style”. http://www.chioka.in/tensorflow-implementation-neural-algorithm-of-artistic-style
  • Karen Simonyan and Andrew Zisserman (2015). Very deep convolutional networks for large-scale image recognition (https://arxiv.org/pdf/1409.1556.pdf)
  • MatConvNet. http://www.vlfeat.org/matconvnet/pretrained/

