Diffusion-GAN: Training GANs with Diffusion

paper:https://arxiv.org/abs/2206.02262

code:GitHub - Zhendong-Wang/Diffusion-GAN: Official PyTorch implementation for paper: Diffusion-GAN: Training GANs with Diffusion


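Read from left to right, the first row shows the forward diffusion process, in which noise is progressively added to a real image. The third row shows the same forward diffusion applied to a generated (fake) image; it is not a reverse denoising pass from noise back to an image. The second row is the discriminator D, which judges the diffused real and fake images at every timestep.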

Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.

In Diffusion-GAN, the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly.

Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares the noisy versions of them, which are obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. This distribution has the property that its components have different noise-to-data ratios, which means that some components add more noise than others. By sampling from this distribution, we can achieve two benefits: first, we can stabilize the training by easing the problem of vanishing gradient, which occurs when the data and generator distributions are too different; second, we can augment the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator.

We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. (Note: G can always receive a useful gradient from D, which in turn improves G's performance.)
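The passage above can be illustrated with a small, dependency-free sketch of the forward diffusion step. It assumes the usual closed form y = sqrt(ᾱ_t)·x + sqrt(1 − ᾱ_t)·σ·ε with ε ~ N(0, I). The linear β schedule, its endpoints, the value of σ, the uniform choice of t, and all function names are illustrative assumptions, not values taken from the paper or its repository. Because y is an affine function of x, gradients with respect to x pass through it unchanged up to the factor sqrt(ᾱ_t), which is what lets the discriminator's gradient reach the generator.

```python
import math
import random

def alpha_bar_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / max(T - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def diffuse(x, t, alpha_bars, sigma=1.0, rng=random):
    """Sample y ~ N(sqrt(a_bar_t) * x, (1 - a_bar_t) * sigma^2 * I)."""
    a_bar = alpha_bars[t]
    scale = math.sqrt(a_bar)
    noise_std = math.sqrt(1.0 - a_bar) * sigma
    return [scale * xi + noise_std * rng.gauss(0.0, 1.0) for xi in x]

def sample_diffused(x, alpha_bars, rng=random):
    """Pick a timestep uniformly (assumed), then diffuse x at that step.

    Each timestep is one component of the Gaussian mixture over steps.
    """
    t = rng.randrange(len(alpha_bars))
    return diffuse(x, t, alpha_bars, rng=rng), t

def noise_to_data_ratio(t, alpha_bars, sigma=1.0):
    """Noise variance divided by squared signal scale at step t."""
    a_bar = alpha_bars[t]
    return (1.0 - a_bar) * sigma ** 2 / a_bar
```

In a training loop, the same `sample_diffused` call would be applied to both real samples and generator outputs, and the returned `t` would be passed to the timestep-dependent discriminator alongside the noisy sample.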

Main contributions:

1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. (In short: diffusion stabilizes GAN training.)
2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. (In short: adding diffusion improves frameworks that were originally built from GANs alone, such as StyleGAN2 and Projected GAN.)

Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of data with diffusion noise injected for t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection.
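The effect described in the Figure 2 caption can be reproduced in a simplified one-dimensional setting. The sketch below is not the paper's exact setup: it assumes the data is a point mass at 0 and the generator is a point mass at θ, so that after Gaussian noise of standard deviation s the two distributions become N(0, s²) and N(θ, s²). Without noise the JS divergence jumps between 0 and log 2, which gives the generator no gradient; with noise it varies smoothly with θ. The function names, grid bounds, and grid size are illustrative choices.

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def js_divergence_noisy(theta, noise_std, lo=-20.0, hi=20.0, n=8001):
    """JS divergence between N(0, s^2) and N(theta, s^2), trapezoid rule on a grid."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        p = gaussian_pdf(x, 0.0, noise_std)
        q = gaussian_pdf(x, theta, noise_std)
        m = 0.5 * (p + q)
        if m <= 0.0:
            continue  # both densities underflowed to zero here
        integrand = 0.0
        if p > 0.0:
            integrand += 0.5 * p * math.log(p / m)
        if q > 0.0:
            integrand += 0.5 * q * math.log(q / m)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * integrand * step
    return total

def js_divergence_no_noise(theta):
    """Two point masses: JS is 0 if they coincide, log 2 otherwise."""
    return 0.0 if theta == 0 else math.log(2.0)
```

Evaluating `js_divergence_noisy` over a range of θ for a few values of `noise_std` gives curves that flatten toward log 2 more slowly as the noise grows, which is the qualitative behavior the caption refers to.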

Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs.

To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and discriminator outputs.
We see that T is adaptively adjusted: The T for Diffusion StyleGAN2 increases as the training goes while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that the T is adjusted according to the overfitting status of the discriminator. The second panel shows that trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.

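As the left panel of Figure 4 shows, the maximum diffusion timestep T changes adaptively during training (T is adjusted according to how strongly the discriminator D is overfitting);
As the right panel of Figure 4 shows, the discriminator trained with the diffusion-based mixture distribution stays well-behaved and provides useful learning signals to the generator G.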
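The text only states that T follows the discriminator's overfitting status, so the following is one plausible way to write such a rule, not a confirmed implementation. It assumes an overfitting statistic r_d, computed as the mean sign of D's output minus 0.5 on diffused real samples, and moves T up when r_d exceeds a target and down otherwise. The target value, step size, bounds on T, and function names are all assumptions chosen for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def discriminator_overfit_stat(d_outputs_real):
    """r_d: mean of sign(D(y, t) - 0.5) over diffused real samples.

    Values near 1 mean D is confidently correct on real data,
    which is treated here as a symptom of overfitting.
    """
    return sum(sign(d - 0.5) for d in d_outputs_real) / len(d_outputs_real)

def update_max_timestep(T, r_d, d_target=0.6, step=1, t_min=5, t_max=1000):
    """Raise T when D looks overconfident, lower it otherwise (all defaults assumed)."""
    T_new = T + sign(r_d - d_target) * step
    return max(t_min, min(t_max, T_new))
```

A larger T makes the diffused samples noisier on average, so the discrimination task gets harder exactly when D starts to win too easily.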

Effectiveness of Diffusion-GAN for domain-agnostic augmentation

25-Gaussians Example.

We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset, generated by a mixture of 25 two-dimensional Gaussian distributions. Each data point is a 2-dimensional feature vector. We train a small GAN model, whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
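A dataset of this kind can be generated in a few lines. The sketch below assumes the 25 component means sit on a 5×5 grid and that every component has the same small isotropic standard deviation; the grid spacing, the standard deviation, and the helper names are illustrative assumptions, since the text does not give them. The `modes_covered` helper is a simple way to quantify mode collapse in generated samples.

```python
import random

def make_25_gaussians(n_samples, spacing=2.0, std=0.05, seed=0):
    """Draw 2-D points from an equal-weight mixture of 25 Gaussians on a 5x5 grid."""
    rng = random.Random(seed)
    centers = [(spacing * i, spacing * j)
               for i in range(-2, 3) for j in range(-2, 3)]
    data = []
    for _ in range(n_samples):
        cx, cy = rng.choice(centers)
        data.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return data, centers

def modes_covered(samples, centers, radius=0.5):
    """Count how many mixture centers have at least one sample within `radius`."""
    hit = set()
    for x, y in samples:
        for k, (cx, cy) in enumerate(centers):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                hit.add(k)
                break
    return len(hit)
```

Applied to generator outputs instead of real data, a `modes_covered` value well below 25 would indicate the kind of mode collapse discussed for the vanilla GAN.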

Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN.

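(1) The ground-truth data distribution: samples are spread evenly over the 25 Gaussian modes;
(2) The vanilla GAN's output shows mode collapse: it generates data at only a few modes;
(3) The vanilla GAN's discriminator outputs for real and fake samples quickly separate from each other. This indicates strong discriminator overfitting, so the discriminator stops providing useful learning signals to the generator;
(4) Diffusion-GAN's samples are spread evenly over all 25 modes, which means it has learned to sample from every mode;
(5) Diffusion-GAN's discriminator outputs: D keeps providing useful learning signals to G.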

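We interpret this improvement from two perspectives:
First, non-leaking augmentation helps provide more information about the data space; second, with the adaptively adjusted diffusion-based noise injection, the discriminator is well-behaved.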

Differentiable augmentation.

As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to differentiable augmentation proposed for data-efficient GAN training. Karras et al. [2020a] introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.

While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and have the risk of leaking augmentation into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which could happen especially if the training set size is already sufficiently large. (When the dataset is already large enough, the downsides of data augmentation may outweigh its benefits.)

By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered as both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and can be easily plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformation with differentiable and invertible mapping functions. Bora et al. [2018] show similar theorems to us for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 includes more satisfying possibilities as discussed in Appendix B.
