What technology is behind AI-generated images?

by Thomas Simonini

How AI can learn to generate pictures of cats

In 2014, the research paper Generative Adversarial Nets (GAN) by Goodfellow et al. was a breakthrough in the field of generative models.

Leading researcher Yann LeCun himself called adversarial nets “the coolest idea in machine learning in the last twenty years.”

Today, thanks to this architecture, we’re going to build an AI that generates realistic pictures of cats. How awesome is that?!

To view the full working code, see my GitHub repository. It will help if you already have some experience in Python, Deep Learning, TensorFlow, and CNNs (Convolutional Neural Nets).

If you are new to Deep Learning, please check this excellent series of articles:

Machine Learning is Fun! The world’s easiest introduction to Machine Learning (medium.com)

What is DCGAN?

Deep Convolutional Generative Adversarial Networks (or DCGANs) are a deep learning architecture that generates outputs similar to the data in the training set.

This model replaces the fully connected layers of the generative adversarial network model with convolution layers.

To explain how DCGAN works, let’s use the metaphor of the art expert and the counterfeiter.

The counterfeiter (a.k.a. “the generator”) tries to produce fake Van Gogh paintings and pass them off as real.

On the other hand, the art expert (a.k.a., “the discriminator”) tries to catch the counterfeiter by using their knowledge of real Van Gogh paintings.

Over time, the art expert gets better at detecting counterfeit paintings, and the counterfeiter gets better at faking them.

As we see, DCGANs are composed of two separate deep neural networks competing against each other.

The generator is a counterfeiter trying to produce seemingly real data. It has no idea what the real data looks like, but it learns to adjust from the feedback of the other model.
The discriminator is an inspector trying to spot the counterfeit data (by comparing it with real data), while trying not to raise false positives on the real data. The output of this model is what gets backpropagated to train the generator.

The generator takes a random noise vector and generates a picture.
This picture is fed into the discriminator, which compares the training set against the generated image.
The discriminator returns a number between 0 (fake image) and 1 (real image).
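
To make these three steps concrete, here is a tiny toy sketch in plain Python. It is not the DCGAN itself: the two "networks" are single linear layers with random weights, and every name and size in it (`generator`, `discriminator`, `z_dim`, `n_pixels`) is an assumption chosen only to show the data flow from noise to picture to score.

```python
import math
import random

def generator(z, weights):
    """Toy generator: maps a noise vector to a flat 'image' with values in [-1, 1]."""
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]

def discriminator(image, weights, bias=0.0):
    """Toy discriminator: returns a score in (0, 1); closer to 1 means 'looks real'."""
    logit = sum(w * px for w, px in zip(weights, image)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
z_dim, n_pixels = 4, 6  # hypothetical sizes, far smaller than a real image
g_weights = [[rng.uniform(-1, 1) for _ in range(z_dim)] for _ in range(n_pixels)]
d_weights = [rng.uniform(-1, 1) for _ in range(n_pixels)]

z = [rng.gauss(0, 1) for _ in range(z_dim)]   # step 1: draw a random noise vector
fake_image = generator(z, g_weights)          # step 1: turn it into a "picture"
score = discriminator(fake_image, d_weights)  # steps 2-3: score between 0 and 1
print(round(score, 3))
```

In the real model both functions are deep convolutional networks, and the weights are learned rather than random.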

Let’s create a DCGAN!

Now, we’re ready to create our AI.

In this part, we will focus on the main elements of our model. If you want to check out the whole code, use the notebook here.

Inputs

Here, we create the input placeholders: inputs_real for the discriminator and inputs_z for the generator.

Note that we use two learning rates, one for the generator and one for the discriminator.

DCGANs are very sensitive to hyperparameters, so it’s very important to tune them precisely.

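
One way to keep these knobs in one place is a small settings object. The sketch below is plain Python, not the notebook’s TensorFlow code, and every value in it (image shape, noise length, both learning rates, batch size) is an illustrative assumption rather than the tuned setting. `Hyperparams`, `lr_g`, `lr_d` and `placeholder_shapes` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyperparams:
    # All values are illustrative assumptions, not the article's tuned settings.
    real_shape: tuple = (128, 128, 3)  # height, width, channels fed to inputs_real
    z_dim: int = 100                   # length of the noise vector fed to inputs_z
    lr_g: float = 2e-4                 # generator learning rate
    lr_d: float = 5e-5                 # discriminator learning rate
    batch_size: int = 64

def placeholder_shapes(hp):
    """Shapes of the two inputs, with None standing for the batch dimension."""
    return {
        "inputs_real": (None, *hp.real_shape),
        "inputs_z": (None, hp.z_dim),
    }

hp = Hyperparams()
shapes = placeholder_shapes(hp)
print(shapes)
```

Keeping the two learning rates as separate fields makes it easy to tune one network without touching the other.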

The discriminator and the generator

We use tf.variable_scope for two reasons.

First, we want to make sure that all variable names start with generator / discriminator. This will help out later when we train the two networks.

Second, we want to reuse these networks with different inputs:

For the generator: we’re going to train it, but also sample fake images from it after training.
For the discriminator: we need to share variables between the fake and real input images.

Now let’s create the discriminator. Remember, it takes as an input a real or fake image and outputs a score.

Some technical remarks:

The principle is to double the number of filters at each convolution layer.
It’s not recommended to downsample with pooling layers. Instead, we use only strided convolutional layers.
We use batch normalization at each layer (except for the input layer), because it reduces the covariate shift. For more information, check this great article.
We utilize Leaky ReLU as an activation function, because it helps to avoid the vanishing gradient effect.
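
The remarks above can be sketched with a bit of shape arithmetic. This is plain Python, not the notebook’s code: it assumes stride-2 convolutions with "same" padding (so each layer halves the height and width), a 128-pixel input, 64 filters in the first layer and a Leaky ReLU slope of 0.2. All of these numbers and function names are assumptions for illustration.

```python
import math

def conv_out_size(size, stride=2):
    """Spatial output size of a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: passes positives through, scales negatives by alpha."""
    return x if x > 0 else alpha * x

def discriminator_shapes(image_size=128, first_filters=64, n_layers=4):
    """List (spatial_size, n_filters) after each strided convolution layer."""
    shapes, size, filters = [], image_size, first_filters
    for _ in range(n_layers):
        size = conv_out_size(size)      # the stride does the downsampling, no pooling
        shapes.append((size, filters))
        filters *= 2                    # next layer uses twice as many filters
    return shapes

for size, filters in discriminator_shapes():
    print(f"{size}x{size}x{filters}")
```

Because negative inputs keep a small slope instead of being zeroed, some gradient still flows back through every unit.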

Then, we create the generator. Remember, it takes as an input a random noise vector (z) and outputs a fake image, thanks to transposed convolution layers.

The idea is that at each layer we halve the number of filters, and double the size of the picture.

The generator has been found to perform best using tanh as the output activation function.

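
Here is the mirror-image shape arithmetic for the generator, again as a plain Python sketch under assumptions: stride-2 transposed convolutions with "same" padding, a 4x4 starting feature map with 512 filters, and four upsampling layers. In a real model the last layer would output 3 channels (RGB) instead of continuing to halve. The helper `to_pixel` is hypothetical and only shows why a tanh output in [-1, 1] maps cleanly back to pixel values.

```python
import math

def generator_shapes(start_size=4, start_filters=512, n_layers=4):
    """List (spatial_size, n_filters) from the first feature map onwards."""
    size, filters = start_size, start_filters
    shapes = [(size, filters)]
    for _ in range(n_layers):
        size *= 2        # stride-2 transposed convolution doubles height and width
        filters //= 2    # and we halve the number of filters
        shapes.append((size, filters))
    return shapes

def to_pixel(t):
    """Map a tanh output in [-1, 1] to an 8-bit pixel value in [0, 255]."""
    return round((t + 1.0) * 127.5)

print(generator_shapes())
print(to_pixel(math.tanh(-3.0)), to_pixel(math.tanh(3.0)))
```

Since tanh is bounded, the generated images live in the same [-1, 1] range as training images that have been rescaled the same way.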

Discriminator and generator losses

Because we train the generator and discriminator at the same time, we need to calculate losses for both networks.

We want the discriminator to output 1 when it “thinks” an image is real, and 0 for fake images. Therefore, we need to set up the losses to reflect that.

The discriminator loss is the sum of loss for real and fake images:

d_loss = d_loss_real + d_loss_fake

d_loss_real is the loss when the discriminator predicts an image is fake, when in fact it was a real image. It is calculated as follows:

Use d_logits_real, with labels all set to 1 (since all real data is real).
labels = tf.ones_like(tensor) * (1 - smooth)
We use label smoothing: the labels are reduced slightly, from 1.0 to 0.9, to help the discriminator generalize better.

d_loss_fake is the loss when the discriminator predicts an image is real, when in fact it was a fake image.

Use d_logits_fake, with labels all set to 0.
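
Putting the two halves together, a rough plain-Python sketch of the discriminator loss could look like this. It assumes the loss is sigmoid cross-entropy computed on logits (which the names d_logits_real and d_logits_fake suggest) and uses the standard numerically stable formula for it. The helper names and the sample logits are assumptions, not the notebook’s code.

```python
import math

def sigmoid_cross_entropy_with_logits(logit, label):
    """Stable form of the loss: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mean(values):
    return sum(values) / len(values)

def discriminator_loss(d_logits_real, d_logits_fake, smooth=0.1):
    """d_loss = d_loss_real + d_loss_fake, with smoothed labels for real images."""
    real_label = 1.0 * (1.0 - smooth)  # label smoothing: 1.0 becomes 0.9
    d_loss_real = mean([sigmoid_cross_entropy_with_logits(x, real_label)
                        for x in d_logits_real])
    d_loss_fake = mean([sigmoid_cross_entropy_with_logits(x, 0.0)
                        for x in d_logits_fake])
    return d_loss_real + d_loss_fake

# A discriminator that scores real images high and fakes low gets a small loss.
good = discriminator_loss(d_logits_real=[5.0, 4.0], d_logits_fake=[-5.0, -4.0])
bad = discriminator_loss(d_logits_real=[-5.0, -4.0], d_logits_fake=[5.0, 4.0])
print(good < bad)
```

Note that with smoothing the loss on real images never quite reaches zero, which discourages the discriminator from becoming overconfident.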

The generator loss again uses the d_logits_fake from the discriminator. This time the labels are all 1, because the generator wants to fool the discriminator.

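
The generator side can be sketched the same way. As before, this is plain Python under the assumption that the loss is sigmoid cross-entropy on logits; the function names and the sample logits are illustrative, not taken from the notebook.

```python
import math

def sigmoid_cross_entropy_with_logits(logit, label):
    """Stable form of the loss: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def generator_loss(d_logits_fake):
    """Labels are all 1: the generator is rewarded when fakes are scored as real."""
    losses = [sigmoid_cross_entropy_with_logits(x, 1.0) for x in d_logits_fake]
    return sum(losses) / len(losses)

fooled = generator_loss([4.0, 3.0])     # discriminator thinks the fakes are real
caught = generator_loss([-4.0, -3.0])   # discriminator spots the fakes
print(fooled < caught)
```

The same d_logits_fake values feed both losses; only the labels differ, which is what makes the two networks pull in opposite directions.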

Optimizers

After calculating the losses, we need to update the generator and discriminator separately.

To do this, we need to get the variables for each part by using tf.trainable_variables(). This returns a list of all the trainable variables we’ve defined in our graph, which we can then split by the generator / discriminator name prefix.
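
The splitting step is just a filter on variable names. The sketch below uses a stand-in `Variable` tuple instead of real TensorFlow variables so it runs anywhere; the variable names in `all_vars` are made up for illustration and only the generator / discriminator prefixes come from the scopes described earlier.

```python
from collections import namedtuple

# Stand-in for a TensorFlow variable: only the .name attribute matters here.
Variable = namedtuple("Variable", ["name"])

def split_variables(trainable_vars):
    """Separate variables by the scope prefix at the start of their name."""
    g_vars = [v for v in trainable_vars if v.name.startswith("generator")]
    d_vars = [v for v in trainable_vars if v.name.startswith("discriminator")]
    return g_vars, d_vars

all_vars = [  # hypothetical names, as a stand-in for tf.trainable_variables()
    Variable("generator/dense/kernel:0"),
    Variable("discriminator/conv1/kernel:0"),
    Variable("generator/deconv1/kernel:0"),
]
g_vars, d_vars = split_variables(all_vars)
print([v.name for v in g_vars])
print([v.name for v in d_vars])
```

Each list would then be handed to its own optimizer (for example through the `var_list` argument of `minimize` in TensorFlow 1.x), so a generator step never changes discriminator weights and vice versa.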

Training

Here, we’re implementing the training function.

The idea is relatively simple:

We’re saving the model every five epochs.
We’re saving a picture in the images folder every ten training batches.
We’re displaying the g_loss, d_loss and the generated image every 15 epochs. The reason is simple: a Jupyter notebook can misbehave if too many pictures are displayed.
Alternatively, we can generate images directly by loading the saved model (this will save you 20 hours of training).
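
The bookkeeping in that loop can be sketched without any TensorFlow at all. The function below only records when each action would fire. The actual network updates are left as a comment, and the function name, event names and the small epoch and batch counts in the usage lines are assumptions for illustration.

```python
def training_schedule(n_epochs, batches_per_epoch,
                      save_every=5, image_every=10, show_every=15):
    """Return the (event, epoch, batch_count) tuples the training loop would trigger."""
    events = []
    batch_count = 0
    for epoch in range(1, n_epochs + 1):
        for _ in range(batches_per_epoch):
            batch_count += 1
            # (one discriminator update and one generator update would run here)
            if batch_count % image_every == 0:
                events.append(("save_image", epoch, batch_count))
        if epoch % save_every == 0:
            events.append(("save_model", epoch, batch_count))
        if epoch % show_every == 0:
            events.append(("show_losses", epoch, batch_count))
    return events

events = training_schedule(n_epochs=15, batches_per_epoch=4)
print(sum(1 for e in events if e[0] == "save_model"))
```

Displaying far less often than saving keeps the notebook output small while the images folder still captures the progress.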

How to run it

You can’t run this on your personal computer — unless you have your own GPUs or are ready to wait maybe 10 years!

Instead, you must use cloud GPU services, such as AWS or FloydHub.

Personally, I trained this DCGAN for 20 hours with Microsoft Azure and their Deep Learning Virtual Machine.

Disclaimer: I don’t have any business relations with Azure. I just loved their excellent customer service!

If you have trouble running it on a virtual machine, follow this excellent article here.

That’s all! I hope that this tutorial has been helpful.

If you’ve improved the model, don’t hesitate to make a pull request.

If you have any thoughts, comments, or want to show me your results, feel free to comment below or send me an email: hello@simoninithomas.com, or tweet me @ThomasSimonini.

And if you liked my article, please click the clap button below so other people will see this here on Medium. And don’t forget to follow me!

Cheers!

Translated from: https://www.freecodecamp.org/news/how-ai-can-learn-to-generate-pictures-of-cats-ba692cb6eae4/