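[Paper Reading] APDrawingGAN (CVPR 2019)

Paper title: APDrawingGAN: Generating Artistic Portrait Drawings from Face Photos with Hierarchical GANs (CVPR 2019)

Style transfer
Converts a face photo into a portrait line drawing
Paper: https://openaccess.thecvf.com/content_CVPR_2019/html/Yi_APDrawingGAN_Generating_Artistic_Portrait_Drawings_From_Face_Photos_With_Hierarchical_CVPR_2019_paper.html
Code: https://github.com/yiranran/APDrawingGAN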

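Challenges

Line drawings are hard to reproduce
The style is highly abstract
Key features must not be lost, for example the lines around the eyes
The outlines in the ground truth (drawn by the artist) are not strictly aligned with the facial landmarks
The lines of a drawing are not directly tied to low-level image features such as contours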

Model architecture

Overview

APDrawingGAN: a novel GAN-based architecture that builds upon hierarchical generators and discriminators combining both a global network (for images as a whole) and local networks (for individual facial regions). This allows dedicated drawing strategies to be learned for different facial features.

Details

Our model is based on the GAN framework, consisting of a generator G and a discriminator D, both of which are CNNs specifically designed for APDrawings with line-stroke-based artist drawing style.
- The model consists of one generator and one discriminator.
- The generator produces the line drawing.
- The discriminator judges whether a line drawing is real or generated.
We propose a hierarchical structure for both generator and discriminator, each of which includes a global network and six local networks. The six local networks correspond to the local facial regions of the left eye, right eye, nose, mouth, hair and the background.
- Both the generator and the discriminator are hierarchical: each has one global network and six local networks, with the six local regions chosen according to the semantic content of the image.
Furthermore, the generator has an additional fusion network to synthesize the artistic drawings from the output of global and local networks.
- The generator has an additional fusion network that combines the outputs of the global and local networks.
In summary, the generator is $G = \{G_{global}, G_{l*}, G_{fusion}\}$, where $G_{global}$ is a global generator, $G_{l*} = \{G_{l\_eye\_l}, G_{l\_eye\_r}, G_{l\_nose}, G_{l\_mouth}, G_{l\_hair}, G_{l\_bg}\}$ is a set of six local generators, and $G_{fusion}$ is a fusion network.
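The composition of the three generator parts can be sketched in a few lines of plain Python. This is only an illustration, not the authors' implementation: the networks are stand-in callables, images are nested lists, every region is treated as a rectangular box, and the helper names (`crop`, `paste`, `hierarchical_generate`) are hypothetical. How the local outputs are merged before fusion is not covered in these notes, so pasting each patch back at its crop location on a white canvas is an assumption.

```python
def crop(image, box):
    """Return the sub-image covered by box = (top, left, height, width)."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def paste(canvas, patch, box):
    """Write patch into canvas in place at the location given by box."""
    top, left, height, width = box
    for r in range(height):
        canvas[top + r][left:left + width] = patch[r]


def hierarchical_generate(photo, boxes, g_global, g_local, g_fusion):
    """Run the global generator, each local generator, then the fusion step.

    photo    : 2D list of gray values
    boxes    : dict mapping region name -> (top, left, height, width)
    g_global : callable, full image -> full image
    g_local  : dict mapping region name -> callable, patch -> patch
    g_fusion : callable, (global output, local canvas) -> final drawing
    """
    global_out = g_global(photo)
    height, width = len(photo), len(photo[0])
    # Assumption: local results are collected on a white (1.0) canvas.
    local_out = [[1.0] * width for _ in range(height)]
    for name, box in boxes.items():
        patch = g_local[name](crop(photo, box))
        paste(local_out, patch, box)
    return g_fusion(global_out, local_out)
```

In a real model each callable would be a trained network and `g_fusion` would be learned rather than a fixed rule; the sketch only shows the data flow from photo to global output, local patches and fused result.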
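G uses a U-Net structure
- A U-Net with skip connections can incorporate multi-scale features and provide sufficient but not excessive flexibility to learn artists’ drawing techniques in APDrawings for different facial regions.
- The global G uses eight down-convolution and eight up-convolution blocks.
- The first four local generators (left eye, right eye, nose, mouth) use three down-convolution and three up-convolution blocks.
- The last two local generators (hair, background) use four down-convolution and four up-convolution blocks.
- $G_{fusion}$ consists of a flat convolution block, two residual blocks and a final convolution layer.
- The local facial regions are cropped around keypoints obtained with an MTCNN model; the background is obtained by face semantic segmentation, and the hair is whatever region remains.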
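The block counts above determine how far each U-Net compresses its input. A small arithmetic sketch makes this concrete, assuming every down-convolution block halves the spatial resolution (stride 2), which these notes do not state explicitly. The input sizes used below (512 and 64) are hypothetical and chosen only for illustration, as is the function name `unet_sizes`.

```python
def unet_sizes(input_size, num_down):
    """Side length after each down-convolution block, assuming each halves it."""
    sizes = [input_size]
    for _ in range(num_down):
        if sizes[-1] % 2 != 0:
            raise ValueError(
                f"cannot halve side length {sizes[-1]}; input too small for {num_down} blocks"
            )
        sizes.append(sizes[-1] // 2)
    return sizes


print(unet_sizes(512, 8))  # [512, 256, 128, 64, 32, 16, 8, 4, 2]
print(unet_sizes(64, 3))   # [64, 32, 16, 8]
```

Under this assumption an eight-block encoder needs an input side of at least 256 pixels, while three or four blocks are enough for small crops, which is consistent with the shallower local generators.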
The discriminator is likewise $D = \{D_{global}, D_{l*}\}$, where $D_{global}$ is a global discriminator and $D_{l*} = \{D_{l\_eye\_l}, D_{l\_eye\_r}, D_{l\_nose}, D_{l\_mouth}, D_{l\_hair}, D_{l\_bg}\}$ is a set of six local discriminators.
The discriminators use the Markovian discriminator (PatchGAN) from Pix2Pix.
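A Markovian discriminator classifies overlapping patches rather than the whole image, and the patch size is simply the receptive field of its last layer. The sketch below computes that receptive field for the commonly used 70×70 PatchGAN layout (three stride-2 and two stride-1 convolutions, all with 4×4 kernels). Whether APDrawingGAN uses exactly this layout, especially for the small local regions, is not stated in these notes, so treat the layer list as an assumption.

```python
def receptive_field(layers):
    """Receptive field of one output unit.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf = 1
    # Walk backwards from the output: each layer widens the field it sees.
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


# Assumed layout of the 70x70 PatchGAN: (kernel, stride) per convolution.
PATCHGAN_70 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(PATCHGAN_70))  # 70
```

Each output value of such a discriminator therefore judges one 70×70 patch of the input, which pushes the generator towards locally realistic strokes.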

Loss functions

Overview

Since artists’ drawings may not have lines perfectly aligned with image features, we develop a novel loss to measure similarity between generated and artists’ drawings based on distance transforms, leading to improved strokes in portrait drawing. (DT loss)
- A new DT loss is proposed.
In order to best emulate artists, our model separates the GAN’s rendered output into multiple layers, each of which is controlled by separated loss functions.
- The GAN's rendered output is split into several layers, and each layer is controlled by its own loss function.

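Details

Denote the loss function as $L(G, D)$, which is specially designed to include four terms $L_{adv}(G, D)$, $L_{L_1}(G, D)$, $L_{DT}(G, D)$ and $L_{local}(G, D)$.
- The loss has four terms
  - Adversarial loss
  - L1 loss
  - DT (distance transform) loss, newly proposed in this paper
  - Local loss
- Standard losses
  - Adversarial loss
  - L1 loss
- Losses introduced in this paper
  - DT loss
    - Designed specifically for line-drawing style transfer, because the generated drawing and the ground-truth drawing are not perfectly aligned.
    - We define two DTs of x as images $I_{DT}(x)$ and $I'_{DT}(x)$: assuming $\hat{x}$ is the binarized image of x, each pixel in $I_{DT}(x)$ stores the distance value to its nearest black pixel in $\hat{x}$ and each pixel in $I'_{DT}(x)$ stores the distance value to its nearest white pixel in $\hat{x}$.
      - In other words, two distance maps are built: one stores each pixel's distance to the nearest black pixel, the other its distance to the nearest white pixel.
    - As I read it, the paper trains two CNNs to detect the black lines and the white lines in a line drawing.
    - The chamfer matching distance between two line drawings is then built from these maps: for each line pixel detected in one drawing, it accumulates the distance to the nearest pixel of the same type (black or white) in the other drawing.
    - The DT loss is this chamfer matching distance evaluated in both directions between the generated drawing and the artist's drawing.
  - Local loss: nothing special, just an L1 loss computed on each corresponding local region.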
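
The distance maps and the two-directional chamfer matching distance described above can be sketched with brute-force Python on tiny images. This is a simplified illustration, not the paper's method. It uses exact Euclidean distances and a fixed threshold instead of trained CNNs, treats every pixel as either black or white, and is not differentiable. The function names and the 0.5 threshold are assumptions.

```python
import math


def binarize(image, threshold=0.5):
    """Map gray values in [0, 1] to 0 (black) or 1 (white)."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


def distance_transform(binary, target):
    """For every pixel, Euclidean distance to the nearest pixel equal to target."""
    targets = [(r, c) for r, row in enumerate(binary)
               for c, v in enumerate(row) if v == target]
    if not targets:
        raise ValueError("image contains no pixel with the requested value")
    return [[min(math.hypot(r - tr, c - tc) for tr, tc in targets)
             for c in range(len(row))]
            for r, row in enumerate(binary)]


def chamfer_distance(x1, x2):
    """Sum, over pixels of x1, of the distance to the nearest same-colour pixel in x2."""
    b1, b2 = binarize(x1), binarize(x2)
    dt_black = distance_transform(b2, 0)  # plays the role of I_DT(x2)
    dt_white = distance_transform(b2, 1)  # plays the role of I'_DT(x2)
    total = 0.0
    for r, row in enumerate(b1):
        for c, v in enumerate(row):
            total += dt_black[r][c] if v == 0 else dt_white[r][c]
    return total


def dt_loss(generated, artist):
    """Chamfer matching distance evaluated in both directions."""
    return chamfer_distance(artist, generated) + chamfer_distance(generated, artist)


# Two 3x5 drawings with one vertical black line, shifted by one pixel.
a = [[1.0, 0.0, 1.0, 1.0, 1.0] for _ in range(3)]
b = [[1.0, 1.0, 0.0, 1.0, 1.0] for _ in range(3)]
print(dt_loss(a, a))  # 0.0
print(dt_loss(a, b))  # 12.0
```

A one-pixel shift gives a small, smoothly growing penalty here, whereas a plain per-pixel L1 loss would penalise the shifted line as strongly as a missing one. That tolerance to misalignment is the motivation for the DT loss.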

Dataset

Overview

We construct an artistic drawing dataset containing high-resolution portrait photos and corresponding professional artistic drawings. APDrawing dataset (containing 140 high-resolution face photos and corresponding portrait drawings by a professional artist)
- A line-drawing dataset with 140 high-resolution face photos and the corresponding portrait drawings.

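Summary

The overall architecture is fairly simple, and the paper describes the structure and details clearly. The key points are the DT loss, the design of the hierarchical generators and discriminators, and how both are implemented in code. The global/local split exists because a single global GAN cannot handle this much fine detail, which is why several local generators and discriminators are added.
Value of the paper: a concrete code implementation and an application in a fairly niche area.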
