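I already have two years of ML experience. I am taking this course series mainly to fill in gaps, so these notes record the details I did not know before.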

This is the 11th note in my series on Prof. Hung-yi Lee's GAN lectures. The GAN series so far:

1: Basic Idea
2: Conditional GAN
3: Unsupervised Conditional Generation
4: Theory behind GAN
5: fGAN: General Framework of GAN
6: Tips for improving GAN
7: Feature Extraction
8: Intelligent Photo Editing
9: Improving Sequence Generation by GAN
10: Evaluation

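Overview of this lecture

This lecture is given by the teaching assistants 吴宗翰 and 陈延昊, dated May 28, 2018.
It first reviews GAN and then covers six GANs that have stood out in recent years: SAGAN, BigGAN, SinGAN, GauGAN, GANILLA, and NICE-GAN.
The recap walks through the family tree of GAN variants.
First comes Self-attention GAN (SAGAN).
Next is BigGAN, a deluxe version of SAGAN.
Then SinGAN, which trains on a single image.
Next is GauGAN. Demo site: http://nvidia-research-mingyuliu.com/gaugan/ (quite fun).
Next is GANILLA, which is good at style transfer.
Last is NICE-GAN, which uses the first half of the Discriminator as the encoder.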

Table of contents

- Overview of this lecture
- Details
- Recap
- Self-attention GAN, SAGAN
  - Self-attention
  - Spectral normalization (SN) for both G and D
  - Imbalanced learning rate for G and D (TTUR)
  - Results
    - Ablation Study
    - Visualization of attention maps
- BigGAN
  - Truncation Trick
  - Some insights about training stability
- SinGAN
  - One training data is enough!
    - Trade-off
  - Progressively train
  - Many application fields
    - Super Resolution
    - Paint image
    - Image editing
    - Image harmonization
- GauGAN
  - SPADE
    - Unconditional Normalization
  - Conditional Normalization (SPADE)
    - SPADE ResBlk
  - Use encoder (Style)
- GANILLA
  - CycleGAN, DualGAN, and CartoonGAN
  - Comparison of G
  - Propose an extensive children's books illustration dataset
- NICE-GAN
  - No independent encoder
  - Result

Details

Recap

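As shown above:

The blue branch on the left collects attempts to improve GAN performance from the mathematical side;
the right side covers applications in the image domain.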

Self-attention GAN, SAGAN

Self-attention

As shown above, each entry of the $L \times L$ matrix measures how important one element is to another.
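To make the $L \times L$ matrix concrete, here is a minimal pure-Python sketch. It assumes plain dot-product scores followed by a row-wise softmax; the learned query/key projections used in SAGAN are left out, and the function names and toy input are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(features):
    """Return an L x L matrix; entry [i][j] is how strongly position i attends to position j."""
    scores = [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]
    return [softmax(row) for row in scores]

# Toy input (hypothetical): L = 3 positions, each with a 2-dimensional feature vector.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attn = attention_map(features)
```

Each row of `attn` sums to 1, and positions with similar features (here the first two) receive more weight from each other than from the dissimilar third position.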

Spectral normalization (SN) for both G and D

In SNGAN, SN is only applied to D. Here SN is for both G and D.

SN divides each weight matrix by its spectral norm (its largest singular value):

$$W_{SN} = \frac{W}{\sigma(W)}, \qquad \sigma(W)=\max_{h:h \neq 0}\frac{\|Wh\|_2}{\|h\|_2}$$

Nowadays SN can be applied by directly calling a function in PyTorch:

>>> import torch.nn as nn
>>> from torch.nn.utils import spectral_norm
>>> m = spectral_norm(nn.Linear(20, 40))
>>> m
Linear(in_features=20, out_features=40, bias=True)
>>> m.weight_u.size()
torch.Size([40])

In addition, the original paper proposes a fast approximate algorithm for computing SN.
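The fast approximation is commonly implemented as power iteration, so the following is a minimal pure-Python sketch under that assumption. The helper names, the toy matrix, and the iteration count are hypothetical; in actual training, implementations typically run only one iteration per update and reuse the vector `u` between steps.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def spectral_norm_estimate(W, n_iters=50, seed=0):
    """Estimate sigma(W), the largest singular value of W, by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])  # one entry per row of W
    Wt = transpose(W)
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))
        u = l2_normalize(matvec(W, v))
    # sigma is approximated by u^T W v
    return sum(ui * wi for ui, wi in zip(u, matvec(W, v)))

# Toy weight matrix (hypothetical) whose largest singular value is 3.
W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm_estimate(W)
W_sn = [[w / sigma for w in row] for row in W]  # W_SN = W / sigma(W)
```

After dividing by the estimate, the largest singular value of `W_sn` is approximately 1, which is the constraint SN imposes on each layer.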

Imbalanced learning rate for G and D (TTUR)

Two-Timescale Update Rule (TTUR)

Separate learning rates for G and D:

lr for D: 0.0004
lr for G: 0.0001
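As a rough illustration of what separate learning rates mean inside an update loop, here is a minimal sketch. It assumes plain gradient steps on a hypothetical toy objective f(g, d) = g*d - d^2/2 that G minimizes and D maximizes; the objective, the function name, and the starting values are illustrative only, and a real setup would normally use two separate optimizers instead.

```python
LR_D = 0.0004  # learning rate for D, as listed above
LR_G = 0.0001  # learning rate for G, as listed above

def ttur_step(g, d, lr_g=LR_G, lr_d=LR_D):
    """One simultaneous update on the toy objective f(g, d) = g*d - 0.5*d**2."""
    grad_d = g - d  # df/dd
    grad_g = d      # df/dg
    d_new = d + lr_d * grad_d  # D takes a gradient ascent step with its own rate
    g_new = g - lr_g * grad_g  # G takes a gradient descent step with its own rate
    return g_new, d_new

# Hypothetical starting point where both gradients have the same magnitude (0.5).
g0, d0 = 1.0, 0.5
g1, d1 = ttur_step(g0, d0)
```

With equal gradient magnitudes, D moves four times as far as G in a single step, which is the whole effect of the 0.0004 versus 0.0001 setting.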

Results

In the figure above, the first row shows FID and the second row shows the Inception Score.

Ablation Study

Visualization of attention maps

As shown above, self-attention does make a difference.

BigGAN

A deluxe version of SAGAN:

Based on SAGAN and SN;
Two to four times as many parameters;
Batch size * 8;
Truncation Trick;
Some insights about training stability.

Truncation Trick

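In GANs we often face a trade-off between fidelity and diversity. As shown above, the Truncation Trick lowers the value that controls diversity (the truncation threshold) to some level, for example the batch labeled 0.04, before the latent vector is fed into the network. The hope is that the model then concentrates on a narrower set of images and renders them with finer detail. However, this method can also produce bad images like the ones on the right, so the authors propose some further remedies; see the original paper for details.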
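The trick is usually described as sampling the latent vector from a standard normal distribution and resampling every entry whose magnitude exceeds the threshold. Below is a minimal pure-Python sketch under that assumption; the function name, the latent dimension of 128, and the two thresholds are hypothetical choices for illustration.

```python
import random

def truncated_normal(dim, threshold, seed=0):
    """Sample dim values from N(0, 1), resampling any value with |value| > threshold."""
    rng = random.Random(seed)
    z = []
    for _ in range(dim):
        value = rng.gauss(0.0, 1.0)
        while abs(value) > threshold:
            value = rng.gauss(0.0, 1.0)  # reject and draw again
        z.append(value)
    return z

# A loose threshold keeps more variety; a tight one keeps z close to 0.
z_diverse = truncated_normal(128, threshold=2.0)
z_safe = truncated_normal(128, threshold=0.5)
```

A smaller threshold such as 0.04 works the same way but rejects far more draws, which matches the intuition that the model is being restricted to a very narrow region of the latent space.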

Some insights about training stability

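As shown above, the larger the model, the faster it learns, but also the sooner it collapses.

As shown above, the model may suddenly collapse on its own.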

SinGAN

One training data is enough!
Progressively train;
Many application fields.

One training data is enough!

As shown above, it is enough to cut a single image into thousands or tens of thousands of patches.

Trade-off

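For a 200*200 image:
150*150: ~2500. That is, cropping 150*150 patches out of a 200*200 image yields roughly 2500 patches.
But this is still not enough: even a dataset of 28*28 images such as MNIST contains 55000 samples, so by comparison a single training image may still provide too little data.

Therefore the image has to be cut at several different patch sizes. Many other questions also need to be considered, for example how local information reflects global information.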
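The figure of roughly 2500 can be checked with a short calculation. The sketch below assumes a square image, a sliding window with stride 1, and no padding; the function name and the extra patch sizes are hypothetical.

```python
def count_patches(image_size, patch_size, stride=1):
    """Number of patch_size x patch_size crops in a square image (no padding)."""
    per_axis = (image_size - patch_size) // stride + 1
    return per_axis * per_axis

# 200*200 image, 150*150 window: (200 - 150 + 1)**2 positions.
n_150 = count_patches(200, 150)

# Smaller windows give many more patches from the same single image.
counts = {size: count_patches(200, size) for size in (150, 100, 50)}
```

Under these assumptions the 150*150 window gives 51 positions per axis, so the count lands close to the ~2500 quoted above, and smaller windows push it into the tens of thousands.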

As shown above, the images generated by SinGAN look quite good.

Progressively train

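As shown above, training proceeds progressively: the model first generates a global, coarse image, and finer scales are handled afterwards.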
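One way to picture the coarse-to-fine schedule is as a list of image sizes, trained from the smallest to the largest. The sketch below is only an illustration: the function name, the scale factor 0.75, and the coarsest size 25 are hypothetical values, not settings taken from the lecture.

```python
def pyramid_sizes(full_size, coarsest_size, scale=0.75):
    """Return image sizes ordered from coarsest to finest."""
    sizes = [full_size]
    # Keep shrinking while the next size is still at least coarsest_size.
    while int(sizes[-1] * scale) >= coarsest_size:
        sizes.append(int(sizes[-1] * scale))
    return sizes[::-1]  # coarse first, full resolution last

sizes = pyramid_sizes(200, 25)

# Training order: one stage per scale, starting with the global, coarse layout.
for stage, size in enumerate(sizes):
    pass  # a real loop would train the generator for this scale here
```

The first entry of `sizes` is the coarse stage that fixes the global layout, and the last entry is the full 200*200 resolution.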

The design of G is shown above.

Many application fields

Super Resolution

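High-resolution texture.

Paint image

Rendering a realistic-style image from an abstract painting.

Image editing

Image restoration.

Image harmonization

Blending an object into an image.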

GauGAN

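In GauGAN, we first define what each color stands for. As shown above, gray is cloud, brown is tree, and so on. We can then paint an abstract picture with these colors, combine it with the landscape photo on the left (which determines the style of the generated image), and generate the final image.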

SPADE

Unconditional Normalization

As shown above, with conventional normalization the normalization parameters are fixed and no longer change, no matter what the input is.

Conditional Normalization (SPADE)

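As shown above, for each input two quantities, $\gamma$ and $\beta$, are derived from it and then combined with the Batch Norm output. This guarantees that the normalization of each image takes its own distinct form (that is, normalization is no longer subtracting and dividing by the same fixed numbers).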
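The idea can be sketched in a few lines of pure Python. This is a simplification under stated assumptions: the feature map is a flat list of values for one channel, and $\gamma$ and $\beta$ come from a lookup table per label instead of the learned convolutions over the label map that SPADE uses; all names and numbers are hypothetical.

```python
import math

def normalize(values, eps=1e-5):
    """Parameter-free normalization: subtract the mean, divide by the std."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def spade_like(values, labels, gamma_by_label, beta_by_label):
    """Scale and shift each normalized value with the gamma/beta of its own label."""
    normed = normalize(values)
    return [
        gamma_by_label[label] * x + beta_by_label[label]
        for x, label in zip(normed, labels)
    ]

# Toy data (hypothetical): four pixels of one channel and their semantic labels.
features = [0.2, 0.4, 0.6, 0.8]
labels = ["cloud", "cloud", "tree", "tree"]
gamma = {"cloud": 1.0, "tree": 2.0}
beta = {"cloud": 0.0, "tree": 0.5}
out = spade_like(features, labels, gamma, beta)
```

Pixels labeled `tree` are modulated differently from pixels labeled `cloud`, so the layout of the label map survives the normalization instead of being washed out.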

SPADE ResBlk

Its ResBlk is shown above.

Use encoder (Style)

As shown above, its style transfer is somewhat similar to VAE-GAN.

Its Generator is shown above.

Its Discriminator is shown above.

GANILLA

Unpaired data
Either Style or Content (e.g. CycleGAN, DualGAN, CartoonGAN)
Propose an extensive children’s books illustration dataset

CycleGAN, DualGAN, and CartoonGAN

As shown above, when the style changes, part of the content may be lost; when the content is preserved, the style hardly changes.

Comparison of G

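The comparison of the GANILLA architecture with the others is shown above.

As shown above, the authors argue that the first few layers extract shape information, so they should be connected to the later layers.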

Propose an extensive children’s books illustration dataset

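As shown above, children's picture books in a variety of styles (with fairly vivid colors) are used as the dataset.

It also has the fewest parameters among the compared models.

As shown above, GANILLA strikes a fairly good balance between style and content.

Images in the style of Hayao Miyazaki are shown above.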

NICE-GAN

No independent encoder
Model structure

No independent encoder

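As shown above, the first half of the Discriminator serves as the Generator's Encoder.

The authors' rationale is:

The Discriminator already performs encoding, because it extracts image features before it classifies.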

Advantages of sharing encoder and D

No need to train encoder independently
The encoder is trained more efficiently

Details of training D

Encoder: minimize loss
Discriminator: maximize loss

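The two objectives above conflict with each other during training, so the authors propose:

The Encoder is updated only when the Discriminator is being trained.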
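The rule can be sketched as choosing which parameter groups receive updates in each step. The sketch below is schematic only: it treats each group as a single number, the group names `classifier` and `decoder` are assumptions about how the rest of D and G are split, and `grads` stands for the gradient of whatever loss the current step minimizes.

```python
def apply_step(params, grads, step, lr=0.1):
    """Update only the parameter groups that are trainable in the given step."""
    trainable = {
        "discriminator": {"encoder", "classifier"},  # the encoder learns here
        "generator": {"decoder"},                    # the encoder stays frozen here
    }[step]
    return {
        name: (value - lr * grads[name]) if name in trainable else value
        for name, value in params.items()
    }

# Toy values (hypothetical): one scalar per parameter group.
params = {"encoder": 1.0, "classifier": 1.0, "decoder": 1.0}
grads = {"encoder": 0.5, "classifier": 0.5, "decoder": 0.5}

after_d = apply_step(params, grads, "discriminator")
after_g = apply_step(params, grads, "generator")
```

In the generator step the shared encoder is left untouched, so it is never pulled toward minimizing the loss that the discriminator step is trying to maximize.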

Result

As shown above, these are translations between cats and dogs and between summer and winter images.


[Hung-yi Lee 2020 ML/DL] P84 SAGAN, BigGAN, SinGAN, GauGAN, GANILLA, NICE | More About GAN 2020