I'm working through this English paper, translating one section per day (my self-discipline studying at home is still poor). This is purely my own translation and I'm a beginner, so if you can offer suggestions or corrections to the translation, I would be very grateful. Thank you!
2 Generative Adversarial Networks
As its name implies, GAN is a generative model that learns to make real-like data adversarially [36]. It consists of two components: the generator G and the discriminator D. G takes the role of producing real-like fake samples from the latent variable z, whereas D determines whether its input comes from G or from the real data space. D outputs a high value when it judges that its input is more likely to be real.
Table 1: An overview of GANs discussed in Sections 2 and 3.

| Subject | Topic | Reference |
| --- | --- | --- |
| Object functions | f-divergence | GAN [36], f-GAN [89], LSGAN [76] |
| Object functions | IPM | WGAN [5], WGAN-GP [42], Fisher GAN [84], McGAN [85], MMDGAN [68] |
| Architecture | DCGAN | DCGAN [100] |
| Architecture | Hierarchy | StackedGAN [49], GoGAN [54], Progressive GAN [56] |
| Architecture | Auto-encoder | BEGAN [10], EBGAN [143], MAGAN [128] |
| Issues | Theoretical analysis | Towards principled methods for training GANs [4], Generalization and equilibrium in GAN [6] |
| Issues | Mode collapse | MRGAN [13], DRAGAN [61], MAD-GAN [33], Unrolled GAN [79] |
| Latent space | Decomposition | CGAN [80], ACGAN [90], InfoGAN [15], ss-InfoGAN [116] |
| Latent space | Encoder | ALI [26], BiGAN [24], Adversarial Generator-Encoder Networks [123] |
| Latent space | VAE | VAEGAN [64], α-GAN [102] |

Table 2: Categorization of GANs applied to various topics.

| Domain | Topic | Reference |
| --- | --- | --- |
| Image | Image translation | Pix2pix [52], PAN [127], CycleGAN [145], DiscoGAN [57] |
| Image | Super resolution | SRGAN [65] |
| Image | Object detection | SeGAN [28], Perceptual GAN for small object detection [69] |
| Image | Object transfiguration | GeneGAN [144], GP-GAN [132] |
| Image | Joint image generation | Coupled GAN [74] |
| Image | Video generation | VGAN [125], Pose-GAN [126], MoCoGAN [122] |
| Image | Text to image | StackGAN [49], TAC-GAN [18] |
| Image | Changing facial attributes | SD-GAN [23], SL-GAN [138], DR-GAN [121], AGEGAN [3] |
| Sequential data | Music generation | C-RNN-GAN [83], SeqGAN [141], ORGAN [41] |
| Sequential data | Text generation | RankGAN [73] |
| Sequential data | Speech conversion | VAW-GAN [48] |
| Sequential data | Semi-supervised learning | SSL-GAN [104], CatGAN [115], Triple-GAN [67] |
| Others | Domain adaptation | DANN [2], CyCADA [47], Unsupervised pixel-level domain adaptation [12] |
| Others | Continual learning | Deep generative replay [110] |
| Others | Medical image segmentation | DI2IN [136], SCAN [16], SegAN [134] |
| Others | Steganography | Steganography GAN [124], Secure steganography GAN [109] |
G and D compete with each other to achieve their individual goals, hence the term adversarial. This adversarial learning situation can be formulated as Equation 1 with the parametrized networks G and D, where pdata(x) and pz(z) denote the real data probability distribution defined on the data space X and the probability distribution of z defined on the latent space Z, respectively:

$$\min_G \max_D V(G,D) = \min_G \max_D \; \mathbb{E}_{x\sim p_{data}}[\log D(x)] + \mathbb{E}_{z\sim p_z}[\log(1-D(G(z)))]. \tag{1}$$

V(G,D) is a binary cross entropy function that is commonly used in binary classification problems [76]. Note that G maps z from Z into an element of X, whereas D takes an input x and distinguishes whether x is a real sample or a fake sample generated by G. Since D's task is to classify samples as real or fake, V(G,D) is a natural choice of objective function from the viewpoint of the classification problem. From D's perspective, if a sample comes from the real data, D will maximize its output, while if a sample comes from G, D will minimize its output; this is why the log(1−D(G(z))) term appears in Equation 1. Simultaneously, G wants to deceive D, so it tries to maximize D's output when a fake sample is presented to D. Consequently, D tries to maximize V(G,D) while G tries to minimize it, forming the minimax relationship in Equation 1. Figure 2 shows an illustration of the GAN.

Theoretically, assuming that the two models G and D both have sufficient capacity, the equilibrium between G and D occurs when pdata(x) = pg(x) and D always outputs 1/2, where pg(x) denotes the probability distribution of the data produced by the generator [36]. Formally, for a fixed G the optimal discriminator $D^{\star}$ is

$$D^{\star}(x) = \frac{p_{data}(x)}{p_{data}(x)+p_g(x)},$$

which can be shown by differentiating Equation 1: for fixed G, the integrand $p_{data}(x)\log D(x) + p_g(x)\log(1-D(x))$ is maximized pointwise at exactly this value of D(x). If we plug the optimal $D^{\star}$ into Equation 1, the objective becomes, up to the constant $2\log 2$, the Jensen-Shannon divergence (JSD) between pdata(x) and pg(x). Thus, the optimal generator minimizing JSD(pdata||pg) reproduces the data distribution pdata(x), and D becomes 1/2 by substituting the optimal generator into the optimal $D^{\star}$ term.

Figure 2: Generative Adversarial Network
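In practice, the minimax objective in Equation 1 is optimized by alternating gradient steps: an ascent step on V(G,D) with respect to D's parameters, then a step on G's. Below is a minimal sketch of one such alternating update in PyTorch; the network sizes, optimizer settings, and the `real_batch` source are illustrative assumptions rather than details from the paper, and the G step uses the common non-saturating variant (maximize log D(G(z))) rather than literally minimizing log(1−D(G(z))):

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # illustrative sizes, e.g., flattened 28x28 images

# G: Z -> X produces fake samples; D: X -> (0, 1) scores "realness".
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()  # binary cross entropy, matching the form of V(G, D)

def train_step(real_batch):
    b = real_batch.size(0)
    z = torch.randn(b, latent_dim)  # z ~ p_z

    # D step: maximize E[log D(x)] + E[log(1 - D(G(z)))]
    # (minimizing BCE against labels 1 for real, 0 for fake is equivalent).
    fake = G(z).detach()  # detach so this step updates only D
    loss_d = bce(D(real_batch), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # G step: non-saturating variant, maximize E[log D(G(z))].
    loss_g = bce(D(G(z)), torch.ones(b, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```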
Beyond the theoretical support of GAN, the above paragraph leads us to infer two points. First, from the optimal discriminator above, GAN can be connected to the density ratio trick [102]. That is, the density ratio between the data distribution and the generated data distribution is

$$D_r(x) = \frac{p_{data}(x)}{p_g(x)} = \frac{p(x\mid y=1)}{p(x\mid y=0)} = \frac{p(y=1\mid x)}{p(y=0\mid x)} = \frac{D^{\star}(x)}{1-D^{\star}(x)}, \tag{2}$$

where y = 0 and y = 1 indicate the generated data and the real data, respectively, and p(y = 1) = p(y = 0) is assumed. This means that GAN addresses the intractability of the likelihood by using only the relative behavior of the two distributions [102], transferring this information to the generator so it produces real-like samples. Second, GAN can be interpreted as measuring the discrepancy between the generated data distribution and the real data distribution and then learning to reduce it; the discriminator is used to measure the discrepancy implicitly.

Despite these advantages and the theoretical support of GAN, many shortcomings have been found, owing to practical issues and the inability to realize the theoretical assumptions, including the infinite capacity of the discriminator. There have been many attempts to solve these issues by changing the objective function, the architecture, and so on. Holding to the fundamental framework of GAN, we assess variants of the objective function and the architectures proposed for the development of GAN. We then focus on the crucial failures of GAN and how to address those issues.
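As a quick sanity check of Equation 2, an (assumed optimal) discriminator output can be converted directly into a density ratio estimate. The values below are purely illustrative:

```python
# Density ratio trick (Equation 2): r(x) = D*(x) / (1 - D*(x)),
# assuming equal priors p(y=1) = p(y=0) and an optimal discriminator.
def density_ratio(d_out: float) -> float:
    return d_out / (1.0 - d_out)

for d in (0.5, 0.75, 0.9):  # hypothetical discriminator outputs
    print(f"D*(x) = {d:.2f} -> p_data(x)/p_g(x) = {density_ratio(d):.2f}")
# D*(x) = 0.5 gives a ratio of 1.0: the generator matches the data density at x.
```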
2.1 Object Functions

The goal of generative models is to match the real data distribution pdata(x) with pg(x). Thus, minimizing the difference between the two distributions is a crucial point for training generative models. As mentioned above, the standard GAN [36] minimizes JSD(pdata||pg), estimated by using the discriminator. Recently, researchers have found that various other distances or divergence measures can be adopted instead of JSD and can improve the performance of the GAN. In this section, we discuss how to measure the discrepancy between pdata(x) and pg(x) using various distances, and the object functions derived from these distances.
2.1.1 f-divergence

The f-divergence Df(pdata||pg) is one of the ways to measure the difference between two distributions via a specific convex function f. Using the ratio of the two distributions, the f-divergence between pdata and pg with a function f is defined as

$$D_f(p_{data}\,\|\,p_g) = \int_{\mathcal{X}} p_g(x)\, f\!\left(\frac{p_{data}(x)}{p_g(x)}\right) dx. \tag{3}$$

It should be noted that Df(pdata||pg) can act as a divergence between two distributions under the conditions that f is a convex function and f(1) = 0 is satisfied. Because of the condition f(1) = 0, if the two distributions are equivalent, their ratio becomes 1 and the divergence goes to 0. Though f is generally termed a generator function [89], we call f an f-divergence function to avoid confusion with the generator G.

f-GAN [89] generalizes the GAN objective function in terms of the f-divergence under an arbitrary convex function f. As we do not know the distributions exactly, Equation 3 should be estimated through a tractable form such as an expectation. By using the convex conjugate $f(u) = \sup_{t\in \mathrm{dom} f^{\star}} \left( tu - f^{\star}(t) \right)$, Equation 3 can be reformulated as follows:

$$D_f(p_{data}\,\|\,p_g) = \int_{\mathcal{X}} p_g(x) \sup_{t\in \mathrm{dom} f^{\star}} \left( t\,\frac{p_{data}(x)}{p_g(x)} - f^{\star}(t) \right) dx \tag{4}$$
$$\geq \sup_{T\in\mathcal{T}} \int_{\mathcal{X}} \big( T(x)\,p_{data}(x) - f^{\star}(T(x))\,p_g(x) \big)\, dx \tag{5}$$
$$= \sup_{T\in\mathcal{T}} \; \mathbb{E}_{x\sim p_{data}}[T(x)] - \mathbb{E}_{x\sim p_g}[f^{\star}(T(x))], \tag{6}$$

where $f^{\star}$ is the Fenchel conjugate [29] of the convex function f and $\mathrm{dom} f^{\star}$ indicates the domain of $f^{\star}$. Equation 5 follows from the fact that the integral of the supremum is larger than the supremum of the integral, and T is an arbitrary class of functions T : X → R. Note that we replace t in Equation 4 by T(x) : X → dom f⋆ in Equation 5 to bring t inside the integral over X. If we express T(x) in the form T(x) = a(Dω(x)) with a(·) : R → dom f⋆ and Dω(x) : X → R, we can interpret T(x) as a parameterized discriminator with a specific activation function a(·). We can then create various GAN frameworks with a specified f-divergence function f and activation function a, using the parameterized generator Gθ and discriminator Tω. Similar to the standard GAN [36], f-GAN first maximizes the lower bound in Equation 6 with respect to Tω, to make the lower bound tight to Df(pdata||pg), and then minimizes the approximated divergence with respect to Gθ, to make pg(x) similar to pdata(x). In this manner, f-GAN tries to generalize various GAN objectives by estimating certain divergences given f, as shown in Table 3.

Table 3: GANs using f-divergence. Table reproduced from [89].

| GAN | Divergence | f-divergence function f(t) |
| --- | --- | --- |
| — | KLD | $t\log t$ |
| GAN [36] | JSD − 2 log 2 | $t\log t - (t+1)\log(t+1)$ |
| LSGAN [76] | Pearson $\chi^2$ | $(t-1)^2$ |
| EBGAN [143] | Total variation | $\lvert t-1 \rvert$ |

Kullback-Leibler divergence (KLD), reverse KLD, JSD and other divergences can be derived using the f-GAN framework with the corresponding f-divergence function f, though they are not all represented in Table 3. Among f-GAN based GANs, LSGAN is one of the most widely used due to its simplicity and high performance; we briefly explain LSGAN in the next paragraph. To summarize, the f-divergence Df(pdata||pg) in Equation 3 can be estimated indirectly by computing the expectations of its lower bound, dealing with the otherwise intractable form involving unknown probability distributions. f-GAN generalizes various divergences under the f-divergence framework, and thus can derive the corresponding GAN objective for a specific divergence.
