Reading CariFaceParsing (to be continued)

github

1. Dataset overview

1.1 Helen_images

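The Helen dataset we use is downloaded from the dataset link. (Note: we download the resized version, 341.7 MB.)

The images folder contains the image files we will use. The labels folder in the download holds 11 parsing images for each image, each one marking a single facial component such as the left eye, right eye, or hair. However, the CariFaceParsing experiments do not use these parsing files as they are.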

1.2 helen_parsing

import numpy as np
import os
import argparse
from PIL import Image
import cv2
import torchvision.transforms as transforms

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='PyTorch ')
    parser.add_argument('--dataset_path', type=str, default='../datasets/helen/labels_11')
    opt = parser.parse_args()
    osize = [286, 286]
    transform = transforms.Compose([transforms.Resize(osize, Image.BICUBIC),
                                    transforms.RandomCrop(256)])
    namelist = os.listdir(opt.dataset_path)
    for name in namelist:
        parsing_list = os.listdir(os.path.join(opt.dataset_path, name))
        flag = 0
        parsing_list.sort()
        index = 0
        image = Image.open('../datasets/helen/images/' + name + '.jpg').convert('RGB')
        im = image.resize((256, 256), Image.BILINEAR)
        im = np.array(im)
        vis_im = im.copy().astype(np.uint8)
        # Stack the per-part masks; channel k holds the value k where part k is present
        for parsingname in parsing_list:
            img = np.array(Image.open(os.path.join(opt.dataset_path, name, parsingname)).resize((256, 256), Image.BICUBIC))[np.newaxis, :, :]
            img = np.where(img > 20, index, 0)
            index += 1
            if flag == 0:
                flag = 1
                imgs = img
            else:
                imgs = np.concatenate((imgs, img), axis=0)
        # Collapse the stack into a single label map
        parsing = np.argmax(imgs, axis=0)
        save_path = '../datasets/helen/labels/%s' % (name)
        print(name, np.unique(parsing))
        part_colors = [[255, 85, 255], [255, 85, 0], [255, 170, 0],
                       [255, 0, 85], [255, 0, 170],
                       [0, 255, 0], [85, 255, 0], [170, 255, 0],
                       [0, 255, 85], [0, 255, 170],
                       [0, 0, 255], [85, 0, 255], [170, 0, 255],
                       [0, 85, 255], [0, 170, 255],
                       [255, 255, 0], [255, 255, 85], [255, 255, 170],
                       [255, 0, 255], [255, 85, 255], [255, 170, 255],
                       [0, 255, 255], [85, 255, 255], [170, 255, 255]]
        parsing_anno = parsing
        vis_parsing_anno = parsing_anno.copy().astype(np.uint8)
        vis_parsing_anno = cv2.resize(vis_parsing_anno, None, fx=1, fy=1, interpolation=cv2.INTER_NEAREST)
        vis_parsing_anno_color = np.zeros((vis_parsing_anno.shape[0], vis_parsing_anno.shape[1], 3)) + 255
        num_of_class = np.max(vis_parsing_anno)
        for pi in np.unique(parsing):
            index = np.where(vis_parsing_anno == pi)
            vis_parsing_anno_color[index[0], index[1], :] = part_colors[pi]
        vis_parsing_anno_color = vis_parsing_anno_color.astype(np.uint8)
        vis_im = cv2.addWeighted(cv2.cvtColor(vis_im, cv2.COLOR_RGB2BGR), 0.4, vis_parsing_anno_color, 0.6, 0)
        cv2.imwrite(save_path + '.png', vis_parsing_anno)
        # cv2.imwrite(save_path+'.jpg', vis_im, [int(cv2.IMWRITE_JPEG_QUALITY), 100])  # save the color-overlaid image
    print('end')

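Note: each saved PNG contains only the label values of the 11 classes (the script above writes them as indices 0-10), while the JPG is the photo with each facial part overlaid in its color. Step two uses the PNG files.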
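The merging step of the script above can be checked on a toy example. The sketch below is a simplified restatement, not the author's code: `merge_masks` is a hypothetical helper, the threshold of 20 is taken from the script, and the 2x2 masks are made up for illustration.

```python
import numpy as np

def merge_masks(masks, threshold=20):
    """Merge per-part grayscale masks of shape (K, H, W) into one (H, W) label map.

    Part k writes the value k wherever its mask exceeds the threshold.
    Where several parts overlap, the highest index wins, which is the same
    result as the argmax over index-valued channels in the script above.
    """
    masks = np.asarray(masks)
    label = np.zeros(masks.shape[1:], dtype=np.uint8)
    for k, mask in enumerate(masks):
        label[mask > threshold] = k
    return label

# Toy input: 3 parts on a 2x2 image (values are illustrative only).
masks = np.array([
    [[255, 255], [255, 255]],  # part 0 covers every pixel
    [[0, 200], [0, 0]],        # part 1 covers the top-right pixel
    [[0, 0], [0, 30]],         # part 2 covers the bottom-right pixel
])
label = merge_masks(masks)
print(label.tolist())  # [[0, 1], [0, 2]]
```

Because index 0 and "no part present" both map to 0, the first mask effectively acts as the fallback class in this scheme.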

1.3 helen_landmarks

download

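This link is provided by the author of the CariFaceParsing GitHub repository. After downloading, the landmark files for the Helen dataset that we need can be found under the path shown in the figure below.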

1.4 WebCaricature

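The WebCaricature dataset used in the paper is also not used directly. In the files obtained from the download link, under CariFaceParsing_data\adaptation\datasets\landmark_webcaricature, we can see that the author split part of the WebCaricature images into 9 classes, A-I, according to cluster centers.

Because the author's link does not provide the image data grouped by cluster, we use the file names in landmark_webcaricature to read out the corresponding image files, and store them in a folder named face_webcaricature.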

import os
import argparse
import shutil

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='PyTorch ')
    parser.add_argument('--landmark_path', type=str, default='../datasets/landmark_webcaricature')
    parser.add_argument('--webcaricature_dataset', type=str, default='/home/lingna/workspaces/experiments_on_face_photo2drawing-master/datasets/WebCaricature/frontalization_dataset_v005/OriginalImages')
    opt = parser.parse_args()
    path_list = os.listdir(opt.landmark_path)
    for path in path_list:
        if os.path.isdir(os.path.join(opt.landmark_path, path)):
            names = os.listdir(os.path.join(opt.landmark_path, path))
            for name in names:
                p_or_c_name, id = name.split('_')[0], name.split('_')[1].replace('npy', 'jpg')
                original_path = os.path.join(opt.webcaricature_dataset, p_or_c_name, id)
                target_path = os.path.join('../datasets/face_webcaricature', path)
                if not os.path.exists(target_path):
                    os.makedirs(target_path)
                shutil.copy(original_path, target_path)
                os.rename(target_path+'/'+id, target_path+'/'+p_or_c_name+'_'+id)
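The name mapping that the script relies on can be isolated into a small pure function, which makes it easy to check before copying any files. This is only a sketch: `landmark_to_image` is a hypothetical helper, and the `<person>_<image id>.npy` naming pattern as well as the example file name are assumptions inferred from the script, not something documented by the dataset.

```python
import os

def landmark_to_image(landmark_name, image_root):
    """Map a landmark file name to (source image path, flattened target name).

    Assumes the pattern '<person>_<image id>.npy', with the image stored at
    '<image_root>/<person>/<image id>.jpg'.
    """
    # Split on the first underscore only, so the rest of the name is kept whole.
    person, _, rest = landmark_name.partition('_')
    stem, _ext = os.path.splitext(rest)
    image_file = stem + '.jpg'
    source = os.path.join(image_root, person, image_file)
    target_name = person + '_' + image_file
    return source, target_name

# Hypothetical file name, for illustration only.
src, dst = landmark_to_image('Alan Rickman_C00001.npy', 'OriginalImages')
print(src)
print(dst)
```

Using `os.path.splitext` instead of replacing the substring 'npy' avoids touching an image id that happens to contain those letters.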

2. CariFaceParsing experiments

2.1 Framework structure

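A brief introduction to what the paper sets out to do: much existing work generates face parsing maps from real face photos, but very little can generate a parsing map from a caricature. The paper therefore uses two modules, Shape Adaptation and Texture Adaptation, to generate caricatures together with their parsing maps, so that the face parsing network can be trained on caricatures and caricature parsing maps.

The labels in the framework figure above mean the following.

① Landmark Maps and Ground Truth are the landmark and parsing maps of an Input Photo. The landmark maps are in one-hot form: different ground-truth landmarks are connected into lines or regions, such as the mouth or nose, and each connected line or region is treated as a single channel of the landmark maps. Together the channels represent the facial layout of the photo.

② Shape Selector: to better encode the diversity of shape exaggeration in caricatures, a one-hot vector is used as conditional input to generate different shapes. How is it obtained? Before training, the caricature dataset is clustered by landmark positions, and the cluster centers are used as the shape set. As the WebCaricature section above shows, the author clustered the caricature dataset into 9 classes (A-I), so each position in the Shape Selector stands for one caricature cluster. (The Style Selector is similar, except that it comes from clustering caricature images by style.)

③ Synthetic Ground Truth: the ground-truth parsing map after deformation.

④ Shape Adaptation and Texture Adaptation learn the deformation parameters and perform style transfer, respectively. The deformed caricatures and deformed parsing maps are then used to train the face parsing network.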
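To make the Shape Selector concrete, here is a minimal sketch of turning a cluster assignment into a one-hot vector. The helpers `nearest_cluster` and `one_hot`, the Euclidean distance, and the toy cluster centers are all assumptions for illustration; the text does not say which clustering algorithm or distance the paper uses.

```python
import numpy as np

def nearest_cluster(landmarks, centers):
    """Index of the cluster center closest to one landmark set.

    landmarks: (N, 2) array of landmark positions.
    centers:   (K, N, 2) array, one landmark set per cluster center.
    """
    diffs = centers - landmarks[np.newaxis]
    dists = np.sqrt((diffs ** 2).sum(axis=(1, 2)))
    return int(np.argmin(dists))

def one_hot(index, num_classes):
    """One-hot vector with a 1 at `index`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[index] = 1.0
    return vec

# Toy shape set: 9 cluster centers (standing in for A-I), 4 landmarks each.
centers = np.stack([np.full((4, 2), float(k)) for k in range(9)])
landmarks = np.full((4, 2), 2.2)

cluster = nearest_cluster(landmarks, centers)
selector = one_hot(cluster, num_classes=9)
print(cluster, selector)
```

At generation time the selector does not have to come from a real caricature; setting a different position to 1 asks the generator for a different target shape.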

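Shape Adaptation:

The Shape Adaptation module proposed in the paper is mainly inspired by Spatial Transformer Networks (STN). The method also includes a differentiable image warping operation that applies plausible shape exaggeration to photos in order to capture the shape domain shift. Since no paired training data is available, the Shape Adaptation module is trained in the CycleGAN manner, with the STN inserted into the image translation network.

Shape Adaptation predicts the parameters of a spatial warp on the photo to produce the deformed photo. The ground-truth labels are then deformed with the same warp.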
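The idea of deforming the photo and its labels with one and the same warp can be sketched in NumPy. Everything here is an assumption made for illustration: a 2x3 affine matrix stands in for whatever warp the module actually predicts, the helper names are hypothetical, and sampling the labels with nearest-neighbor (so that class ids are never blended) is a common choice rather than something the text confirms.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Source (x, y) coordinates for every output pixel, shape (H, W, 2).

    theta is a 2x3 affine matrix mapping output pixel coordinates to
    input pixel coordinates.
    """
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing='ij')
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
    return coords @ np.asarray(theta, dtype=np.float64).T

def sample_nearest(image, grid):
    """Nearest-neighbor sampling; keeps label values discrete."""
    h, w = image.shape[:2]
    xs = np.clip(np.rint(grid[..., 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(grid[..., 1]).astype(int), 0, h - 1)
    return image[ys, xs]

def sample_bilinear(image, grid):
    """Bilinear sampling for a single-channel float image, clamped at borders."""
    h, w = image.shape[:2]
    x = np.clip(grid[..., 0], 0, w - 1)
    y = np.clip(grid[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy photo and label map sharing the same layout.
label = np.array([[0, 1, 2], [3, 4, 5]])
photo = label.astype(np.float64)

# One grid (here: read from one pixel to the right), two samplers.
theta = [[1, 0, 1], [0, 1, 0]]
grid = affine_grid(theta, *label.shape)
warped_photo = sample_bilinear(photo, grid)
warped_label = sample_nearest(label, grid)
print(warped_label.tolist())  # [[1, 2, 2], [4, 5, 5]]
```

Since both outputs are sampled from the same grid, every pixel of the warped label map still describes the pixel at the same position in the warped photo.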

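Texture Adaptation:

After Shape Adaptation we have the deformed face photo and the deformed parsing map. Texture Adaptation then transfers caricature style onto the deformed face photo. It uses conditional information (the Style Selector) to generate textures in multiple styles. (The inputs to this model are the deformed face photo and the Style Selector.)

This model is trained with a content loss and a style loss [1]. For the content loss, features of the generated caricature and of the deformed photo are extracted from the Conv4-2 layer of a VGG model; the goal is to make the generated caricature preserve the structure of the deformed photo. For the style loss, Batch Normalization (BN) statistics of the features are used in place of Gram matrices. The style loss is then the difference between these feature statistics of the generated caricature and of a reference caricature image.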
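The two losses can be sketched on plain arrays. This is an illustration under stated assumptions, not the paper's implementation: random arrays stand in for VGG activations, BN statistics are taken to mean the per-channel mean and standard deviation, squared differences are used throughout, and the function names, the layers used for the style loss, and any loss weights are not given in the text.

```python
import numpy as np

def content_loss(feat_generated, feat_photo):
    """Mean squared difference between two (C, H, W) feature maps."""
    return float(np.mean((feat_generated - feat_photo) ** 2))

def channel_stats(feat):
    """Per-channel mean and standard deviation of a (C, H, W) feature map."""
    flat = feat.reshape(feat.shape[0], -1)
    return flat.mean(axis=1), flat.std(axis=1)

def bn_style_loss(feats_generated, feats_reference):
    """Sum over layers of squared differences in per-channel mean and std."""
    loss = 0.0
    for fg, fr in zip(feats_generated, feats_reference):
        mu_g, sd_g = channel_stats(fg)
        mu_r, sd_r = channel_stats(fr)
        loss += np.mean((mu_g - mu_r) ** 2) + np.mean((sd_g - sd_r) ** 2)
    return float(loss)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))   # stand-in for one layer's activations
flipped = feat[:, ::-1, :]          # same values, different spatial layout

# Spatial layout matters for the content loss but not for the style statistics.
print(content_loss(flipped, feat) > 0)
print(bn_style_loss([flipped], [feat]))
```

The flipped example shows why the two losses complement each other: the statistics-based style term ignores where things are, so structure has to be held in place by the content term.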

2.2 Training the shape adaptation model

2.2.1 step one

Our method has two parts, adaptation and segmentation. For adaptation, go to the adaptation directory. Please put the Webcaricature dataset to “CariFaceParsing/adaptation/datasets/face_webcaricature”. And link “trainA” and “val” to “photo”, link “trainB”, “trainC”, “trainD”, “trainE”, “trainF”, “trainG”, “trainH”, “trainI” to “caricature”. And download the provided “landmark_webcaricature” and put it to “CariFaceParsing/adaptation/datasets/”

python train.py --dataroot ./datasets/landmark_webcaricature/ --name shape_adaptation --model shape_adaptation --input_nc 20 --output_nc 11 --dataset_mode star --batch_size 8

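Note: when running the author's code with this command as is, pay attention to image size. The default value of --resize_or_crop is scale_width, so the run fails when the images in the dataset have different sizes. The fix is either to make all input images the same size or to change --resize_or_crop from scale_width to resize_and_crop.

Purpose of this step: train the shape adaptation model in the CycleGAN manner.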

2.2.2 step two

Then put Helen dataset to “CariFaceParsing/adaptation/datasets/helen” and it should have three subfolder, “images”, “labels”, “landmark”.

python generate_shape_adaptation.py --dataroot ./datasets/helen/landmark/ --name shape_adaptation --model shape_adaptation --input_nc 20 --output_nc 11 --dataset_mode helen

After this we obtain, in the helen_shape_adaptation folder, the deformed Helen images and the deformed Helen parsing maps. Each Helen image yields one deformed version for each of the shapes B-I.

2.2.3 step three

[1] Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, and Ming-Hsuan Yang, "Diversified texture synthesis with feed-forward networks," in CVPR, 2017.