This post is my own understanding of the code the TAs provided (you could even call it a partial translation..), plus lookups of some functions and what they do. Feel free to point out my mistakes; this post is mainly meant as my own notes, so please go easy on it. 3q

The homework is to build a GAN model that generates anime character faces.

Set up the environment

Package Installation

# You may replace the workspace directory if you want.
workspace_dir = '.'

# Training progress bar
!pip install -q qqdm
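qqdm is a tqdm-style progress bar for notebooks, used later in the training loop. A minimal usage sketch (set_infos attaches extra fields to the bar, exactly the way the training loop uses it):

from qqdm.notebook import qqdm

progress_bar = qqdm(range(100))  # wrap any iterable, like tqdm
for i in progress_bar:
    progress_bar.set_infos({'Step': i})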

Download the dataset

Several download links are given here, mainly to spread the download traffic among their students; just pick any one that works.

!gdown --id 1IGrTr308mGAaCKotpkkm8wTKlWs9Jq-p --output "{workspace_dir}/crypko_data.zip"

# Other download links
#   Please uncomment the line according to the last digit of your student ID first
# 0
# !gdown --id 131zPaVoi-U--XThvzgRfaxrumc3YSBd3 --output "{workspace_dir}/crypko_data.zip"
# 1
# !gdown --id 1kCuIj1Pf3T2O94H9bUBxjPBKb---WOmH --output "{workspace_dir}/crypko_data.zip"
# 2
# !gdown --id 1boEoiiqBJwoHVvjmI0xgoutE9G0Rv8CD --output "{workspace_dir}/crypko_data.zip"
# 3
# !gdown --id 1Ic0mktAQQvnNAnswrPHsg-u2OWGBXTWF --output "{workspace_dir}/crypko_data.zip"
# 4
# !gdown --id 1PFcc25r9tLE7OyQ-CDadtysNdWizk6Yg --output "{workspace_dir}/crypko_data.zip"
# 5
# !gdown --id 1wgkrYkTrhwDSMdWa5NwpXeE4-7JaUuX2 --output "{workspace_dir}/crypko_data.zip"
# 6
# !gdown --id 19gwNYWi9gN9xVL86jC3v8qqNtrXyq5Bf --output "{workspace_dir}/crypko_data.zip"
# 7
# !gdown --id 1-KPZB6frRSRLRAtQfafKCVA7em0_NrJG --output "{workspace_dir}/crypko_data.zip"
# 8
# !gdown --id 1rNBfmn0YBzXuG5ub7CXbsGwduZqEs8hx --output "{workspace_dir}/crypko_data.zip"
# 9
# !gdown --id 113NEISX-2j6rBd1yyBx0c3_9nPIzSNz- --output "{workspace_dir}/crypko_data.zip"

!unzip -q "{workspace_dir}/crypko_data.zip" -d "{workspace_dir}/"

Random seed

Set the random seed to a certain value for reproducibility.

import random

import torch
import numpy as np

def same_seeds(seed):
    # Python built-in random module
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Torch
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

same_seeds(2021)

Import Packages

First, we need to import packages that will be used later.

As in hw3, we rely heavily on **torchvision**, a computer-vision library for PyTorch.

import os
import glob

import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
from torch import optim
from torch.autograd import Variable
from torch.utils.data import Dataset, DataLoader
import matplotlib.pyplot as plt
from qqdm.notebook import qqdm

Dataset

1. Resize the images to (64, 64). (Note: the images in the original dataset are 128×128, but the DCGAN model used later takes 64×64 inputs.)

2. Linearly map the values from [0, 1] to [-1, 1].

Please refer to [PyTorch official website](https://pytorch.org/vision/stable/transforms.html) for details about different transforms.

Notes on the code below:

Usage of glob.glob(): it returns a list of all file paths that match a pattern. Its single argument, pathname, defines the matching rule and can be either an absolute or a relative path. Here is an example of using glob.glob:

# Get all images under the given directory
print(glob.glob(r"/home/qiaoyunhao/*/*.png"), "\n")  # the r prefix keeps the string raw (no escape processing)
Usage of transforms.Normalize(): it standardizes an image channel by channel (shifting each channel toward mean 0 and standard deviation 1), which can speed up model convergence.

output = (input - mean) / std

mean: the per-channel means
std: the per-channel standard deviations
inplace: whether to operate in place
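As a quick sanity check of the formula above (a minimal sketch using the same mean=0.5, std=0.5 as the homework code later on): 0 maps to (0 - 0.5) / 0.5 = -1 and 1 maps to (1 - 0.5) / 0.5 = 1, i.e. [0, 1] is linearly mapped to [-1, 1].

import torch
import torchvision.transforms as transforms

norm = transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
# A 3-channel image whose channels are constant 0.0, 0.5, and 1.0.
x = torch.tensor([0.0, 0.5, 1.0]).view(3, 1, 1).expand(3, 2, 2).clone()
print(norm(x)[:, 0, 0])  # tensor([-1.,  0.,  1.])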

Reference post: “数据归一化处理transforms.Normalize()” (幼稚园的扛把子~, CSDN blog).

class CrypkoDataset(Dataset):
    def __init__(self, fnames, transform):
        self.transform = transform
        self.fnames = fnames
        self.num_samples = len(self.fnames)

    def __getitem__(self, idx):
        fname = self.fnames[idx]
        # 1. Load the image
        img = torchvision.io.read_image(fname)
        # 2. Resize and normalize the images using torchvision.
        img = self.transform(img)
        return img

    def __len__(self):
        return self.num_samples


def get_dataset(root):
    fnames = glob.glob(os.path.join(root, '*'))
    # 1. Resize the image to (64, 64)
    # 2. Linearly map [0, 1] to [-1, 1]
    compose = [
        transforms.ToPILImage(),
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        # transforms.Normalize linearly maps the RGB values of the input image from [0, 1] to [-1, 1].
        transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]
    # torchvision.transforms is PyTorch's image-preprocessing package;
    # Compose is generally used to chain multiple steps together.
    transform = transforms.Compose(compose)
    dataset = CrypkoDataset(fnames, transform)
    return dataset
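Before visualizing anything, here is a quick sanity check of get_dataset (a sketch; it assumes crypko_data.zip unpacked into a faces/ folder under workspace_dir, as in the cell below):

dataset = get_dataset(os.path.join(workspace_dir, 'faces'))
print(len(dataset))                        # number of face images found
print(dataset[0].shape)                    # torch.Size([3, 64, 64])
print(dataset[0].min(), dataset[0].max())  # values lie within [-1, 1]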

Show some images

Note that the values are in the range [-1, 1]; we should shift them back to the valid range, [0, 1], to display them correctly.

dataset = get_dataset(os.path.join(workspace_dir, 'faces'))
images = [dataset[i] for i in range(16)]

# make_grid stitches several images into one grid image. See Figure 1 for its parameters.
grid_img = torchvision.utils.make_grid(images, nrow=4)
plt.figure(figsize=(10, 10))
# permute rearranges (C, H, W) into (H, W, C), the channel layout matplotlib expects.
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()

Figure 1: the parameters of torchvision.utils.make_grid.

Since the images were normalized to [-1, 1] above, we need to map them back to [0, 1] to display them properly:

images = [(dataset[i]+1)/2 for i in range(16)]
grid_img = torchvision.utils.make_grid(images, nrow=4)
plt.figure(figsize=(10,10))
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()

Model

Here, we use DCGAN as the model architecture. Feel free to design your own model structure instead.

Note that the `N` of the input/output shape stands for the batch size.

The way the weights_init() function defined below initializes weights is not the recommended approach; better alternatives are described in the reference below:

Reference: “pytorch网络集中权重初始化的方法” (听我的错不了, CSDN blog).

First, read this blog post: “DCGAN模型讲解及避坑指南” (fmbao, CSDN blog): https://blog.csdn.net/u011268787/article/details/84926246

The post above gives a general idea of how the DCGAN model works. Detailed notes are in the comments of the code below.

def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)


class Generator(nn.Module):
    """
    Input shape: (N, in_dim)
    Output shape: (N, 3, 64, 64)
    """
    def __init__(self, in_dim, dim=64):
        super(Generator, self).__init__()

        def dconv_bn_relu(in_dim, out_dim):
            return nn.Sequential(
                # Transposed convolution ("deconvolution"); with stride 2 it doubles the spatial size.
                nn.ConvTranspose2d(in_dim, out_dim, 5, 2,
                                   padding=2, output_padding=1, bias=False),
                nn.BatchNorm2d(out_dim),
                nn.ReLU()
            )

        self.l1 = nn.Sequential(
            nn.Linear(in_dim, dim * 8 * 4 * 4, bias=False),
            nn.BatchNorm1d(dim * 8 * 4 * 4),
            nn.ReLU()
        )
        self.l2_5 = nn.Sequential(
            dconv_bn_relu(dim * 8, dim * 4),
            dconv_bn_relu(dim * 4, dim * 2),
            dconv_bn_relu(dim * 2, dim),
            nn.ConvTranspose2d(dim, 3, 5, 2, padding=2, output_padding=1),
            nn.Tanh()
        )
        self.apply(weights_init)

    def forward(self, x):
        y = self.l1(x)
        # view() simply reshapes the tensor into 4-D data: the first dimension is
        # batch_size, the second is channels, and the third and fourth are the
        # height and width of each image.
        y = y.view(y.size(0), -1, 4, 4)
        y = self.l2_5(y)
        return y


class Discriminator(nn.Module):
    """
    Input shape: (N, 3, 64, 64)
    Output shape: (N, )
    """
    def __init__(self, in_dim, dim=64):
        super(Discriminator, self).__init__()

        def conv_bn_lrelu(in_dim, out_dim):
            return nn.Sequential(
                nn.Conv2d(in_dim, out_dim, 5, 2, 2),
                nn.BatchNorm2d(out_dim),
                nn.LeakyReLU(0.2),
            )

        """ Medium: Remove the last sigmoid layer for WGAN. """
        self.ls = nn.Sequential(
            nn.Conv2d(in_dim, dim, 5, 2, 2),
            nn.LeakyReLU(0.2),
            conv_bn_lrelu(dim, dim * 2),
            conv_bn_lrelu(dim * 2, dim * 4),
            conv_bn_lrelu(dim * 4, dim * 8),
            nn.Conv2d(dim * 8, 1, 4),
            nn.Sigmoid(),  # Remove this layer for WGAN (see the "Medium" note above).
        )
        self.apply(weights_init)

    def forward(self, x):
        y = self.ls(x)
        y = y.view(-1)
        return y
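To make the N in the shapes above concrete, here is a quick shape check (a sketch on the CPU, assuming the classes above have been defined):

G_test = Generator(in_dim=100)
D_test = Discriminator(in_dim=3)
z = torch.randn(4, 100)    # N = 4 noise vectors
fake = G_test(z)
print(fake.shape)          # torch.Size([4, 3, 64, 64])
print(D_test(fake).shape)  # torch.Size([4])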

Training

### Initialization

- hyperparameters

- model

- optimizer

- dataloader

# Training hyperparameters
batch_size = 64
z_dim = 100  # dimension of the input noise vector; feel free to adjust it
z_sample = Variable(torch.randn(100, z_dim)).cuda()
lr = 1e-4

""" Medium: WGAN, 50 epoch, n_critic=5, clip_value=0.01 """
n_epoch = 1 # 50
# n_critic: training the discriminator several times before each generator update works better.
n_critic = 1 # 5
# clip_value = 0.01

log_dir = os.path.join(workspace_dir, 'logs')
ckpt_dir = os.path.join(workspace_dir, 'checkpoints')
os.makedirs(log_dir, exist_ok=True)
os.makedirs(ckpt_dir, exist_ok=True)

# Model
G = Generator(in_dim=z_dim).cuda()
D = Discriminator(3).cuda()
G.train()
D.train()

# Loss
criterion = nn.BCELoss()

""" Medium: Use RMSprop for WGAN. """
# Optimizer
opt_D = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.5, 0.999))
opt_G = torch.optim.Adam(G.parameters(), lr=lr, betas=(0.5, 0.999))
# opt_D = torch.optim.RMSprop(D.parameters(), lr=lr)
# opt_G = torch.optim.RMSprop(G.parameters(), lr=lr)

# DataLoader
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=2)
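A quick check that the DataLoader yields batches in the shape the training loop expects (a small sketch):

imgs = next(iter(dataloader))
print(imgs.shape)  # torch.Size([64, 3, 64, 64]), i.e. (batch_size, C, H, W)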

Training loop

We store some pictures regularly to monitor the current performance of the Generator, and regularly record checkpoints.

The code below is fairly easy to follow; just read it line by line.

steps = 0
for e, epoch in enumerate(range(n_epoch)):
    progress_bar = qqdm(dataloader)
    for i, data in enumerate(progress_bar):
        imgs = data
        imgs = imgs.cuda()
        bs = imgs.size(0)
        # print("bs", bs)  # debug: check the batch size

        # ============================================
        #  Train D
        # ============================================
        z = Variable(torch.randn(bs, z_dim)).cuda()
        r_imgs = Variable(imgs).cuda()
        f_imgs = G(z)

        """ Medium: Use WGAN Loss. """
        # Label
        r_label = torch.ones((bs)).cuda()
        f_label = torch.zeros((bs)).cuda()

        # Model forwarding
        r_logit = D(r_imgs.detach())
        f_logit = D(f_imgs.detach())

        # Compute the loss for the discriminator.
        r_loss = criterion(r_logit, r_label)
        f_loss = criterion(f_logit, f_label)
        loss_D = (r_loss + f_loss) / 2

        # WGAN Loss
        # loss_D = -torch.mean(D(r_imgs)) + torch.mean(D(f_imgs))

        # Model backwarding
        D.zero_grad()
        loss_D.backward()
        # Update the discriminator.
        opt_D.step()

        """ Medium: Clip weights of discriminator. """
        # for p in D.parameters():
        #    p.data.clamp_(-clip_value, clip_value)

        # ============================================
        #  Train G
        # ============================================
        if steps % n_critic == 0:
            # Generate some fake images.
            z = Variable(torch.randn(bs, z_dim)).cuda()
            f_imgs = G(z)

            # Model forwarding
            f_logit = D(f_imgs)

            """ Medium: Use WGAN Loss"""
            # Compute the loss for the generator.
            loss_G = criterion(f_logit, r_label)
            # WGAN Loss
            # loss_G = -torch.mean(D(f_imgs))

            # Model backwarding
            G.zero_grad()
            loss_G.backward()
            # Update the generator.
            opt_G.step()

        steps += 1

        # Set the info of the progress bar.
        #   Note that the value of the GAN loss is not directly related to
        #   the quality of the generated images.
        progress_bar.set_infos({
            'Loss_D': round(loss_D.item(), 4),
            'Loss_G': round(loss_G.item(), 4),
            'Epoch': e + 1,
            'Step': steps,
        })

    G.eval()
    f_imgs_sample = (G(z_sample).data + 1) / 2.0
    filename = os.path.join(log_dir, f'Epoch_{epoch+1:03d}.jpg')
    torchvision.utils.save_image(f_imgs_sample, filename, nrow=10)
    print(f' | Save some samples to {filename}.')

    # Show generated images in the jupyter notebook.
    grid_img = torchvision.utils.make_grid(f_imgs_sample.cpu(), nrow=10)
    plt.figure(figsize=(10, 10))
    plt.imshow(grid_img.permute(1, 2, 0))
    plt.show()
    G.train()

    if (e + 1) % 5 == 0 or e == 0:
        # Save the checkpoints.
        torch.save(G.state_dict(), os.path.join(ckpt_dir, 'G.pth'))
        torch.save(D.state_dict(), os.path.join(ckpt_dir, 'D.pth'))
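For the medium baseline, the commented-out WGAN pieces above fit together as follows. This is only a sketch combining the notebook's own hints (remove D's final sigmoid, use RMSprop, clip weights); it reuses the G, D, opt_D, opt_G, n_critic, and clip_value defined earlier.

def wgan_step(r_imgs, z, clip_value=0.01):
    # Discriminator step: push D(real) up and D(fake) down.
    f_imgs = G(z)
    loss_D = -torch.mean(D(r_imgs)) + torch.mean(D(f_imgs.detach()))
    D.zero_grad()
    loss_D.backward()
    opt_D.step()
    # Weight clipping crudely enforces the Lipschitz constraint.
    for p in D.parameters():
        p.data.clamp_(-clip_value, clip_value)

    # Generator step: push D(G(z)) up (run once every n_critic D steps).
    loss_G = -torch.mean(D(G(z)))
    G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()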

Inference

Use the trained model to generate anime faces!

import torch

G = Generator(z_dim)
G.load_state_dict(torch.load(os.path.join(ckpt_dir, 'G.pth')))
G.eval()
G.cuda()

### Generate and show some images.

# Generate 1000 images and make a grid to save them.
n_output = 1000
z_sample = Variable(torch.randn(n_output, z_dim)).cuda()
imgs_sample = (G(z_sample).data + 1) / 2.0
log_dir = os.path.join(workspace_dir, 'logs')
filename = os.path.join(log_dir, 'result.jpg')
torchvision.utils.save_image(imgs_sample, filename, nrow=10)

# Show 32 of the images.
grid_img = torchvision.utils.make_grid(imgs_sample[:32].cpu(), nrow=10)
plt.figure(figsize=(10,10))
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()
