This post is my own understanding of the code the TAs provided (you could even call it a partial translation..), plus lookups of some functions and what they do. Feel free to point out my mistakes; this post is mainly meant as my own notes, so please go easy on it. 3q

The homework is to build a GAN model that generates anime character faces.

Set up the environment

Package Installation

# You may replace the workspace directory if you want.
workspace_dir = '.'

# Training progress bar
!pip install -q qqdm
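qqdm is a tqdm-style progress bar for notebooks, used later in the training loop. A minimal usage sketch (set_infos attaches extra fields to the bar, exactly the way the training loop uses it):

from qqdm.notebook import qqdm

progress_bar = qqdm(range(100))  # wrap any iterable, like tqdm
for i in progress_bar:
    progress_bar.set_infos({'Step': i})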

Download the dataset

Several download links are given here, mainly to spread the download traffic among their students; just pick any one that works.

!gdown --id 1IGrTr308mGAaCKotpkkm8wTKlWs9Jq-p --output "{workspace_dir}/crypko_data.zip"

# Other download links
#   Please uncomment the line according to the last digit of your student ID first
# 0
# !gdown --id 131zPaVoi-U--XThvzgRfaxrumc3YSBd3 --output "{workspace_dir}/crypko_data.zip"
# 1
# !gdown --id 1kCuIj1Pf3T2O94H9bUBxjPBKb---WOmH --output "{workspace_dir}/crypko_data.zip"
# 2
# !gdown --id 1boEoiiqBJwoHVvjmI0xgoutE9G0Rv8CD --output "{workspace_dir}/crypko_data.zip"
# 3
# !gdown --id 1Ic0mktAQQvnNAnswrPHsg-u2OWGBXTWF --output "{workspace_dir}/crypko_data.zip"
# 4
# !gdown --id 1PFcc25r9tLE7OyQ-CDadtysNdWizk6Yg --output "{workspace_dir}/crypko_data.zip"
# 5
# !gdown --id 1wgkrYkTrhwDSMdWa5NwpXeE4-7JaUuX2 --output "{workspace_dir}/crypko_data.zip"
# 6
# !gdown --id 19gwNYWi9gN9xVL86jC3v8qqNtrXyq5Bf --output "{workspace_dir}/crypko_data.zip"
# 7
# !gdown --id 1-KPZB6frRSRLRAtQfafKCVA7em0_NrJG --output "{workspace_dir}/crypko_data.zip"
# 8
# !gdown --id 1rNBfmn0YBzXuG5ub7CXbsGwduZqEs8hx --output "{workspace_dir}/crypko_data.zip"
# 9
# !gdown --id 113NEISX-2j6rBd1yyBx0c3_9nPIzSNz- --output "{workspace_dir}/crypko_data.zip"

!unzip -q "{workspace_dir}/crypko_data.zip" -d "{workspace_dir}/"

Random seed

Set the random seed to a certain value for reproducibility.

import random

import torch
import numpy as np

def same_seeds(seed):
    # Python built-in random module
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Torch
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

same_seeds(2021)

Import Packages

First, we need to import packages that will be used later.

As in hw3, we rely heavily on **torchvision**, a computer-vision library for PyTorch.

import os
import glob

import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
from torch import optim
from torch.autograd import Variable
from torch.utils.data import Dataset, DataLoader
import matplotlib.pyplot as plt
from qqdm.notebook import qqdm

Dataset

1. Resize the images to (64, 64). (Note: the images in the original dataset are 128×128, but the DCGAN model used later takes 64×64 inputs.)

2. Linearly map the values from [0, 1] to [-1, 1].

Please refer to [PyTorch official website](https://pytorch.org/vision/stable/transforms.html) for details about different transforms.

Notes on the code below:

Usage of glob.glob(): it returns a list of all file paths that match a pattern. Its single argument, pathname, defines the matching rule and can be either an absolute or a relative path. Here is an example of using glob.glob:

# Get all images under the given directory
print(glob.glob(r"/home/qiaoyunhao/*/*.png"), "\n")  # the r prefix keeps the string raw (no escape processing)
Usage of transforms.Normalize(): it standardizes an image channel by channel (shifting each channel toward mean 0 and standard deviation 1), which can speed up model convergence.

output = (input - mean) / std

mean: the per-channel means
std: the per-channel standard deviations
inplace: whether to operate in place
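As a quick sanity check of the formula above (a minimal sketch using the same mean=0.5, std=0.5 as the homework code later on): 0 maps to (0 - 0.5) / 0.5 = -1 and 1 maps to (1 - 0.5) / 0.5 = 1, i.e. [0, 1] is linearly mapped to [-1, 1].

import torch
import torchvision.transforms as transforms

norm = transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
# A 3-channel image whose channels are constant 0.0, 0.5, and 1.0.
x = torch.tensor([0.0, 0.5, 1.0]).view(3, 1, 1).expand(3, 2, 2).clone()
print(norm(x)[:, 0, 0])  # tensor([-1.,  0.,  1.])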

Reference post: “数据归一化处理transforms.Normalize()” (幼稚园的扛把子~, CSDN blog).

class CrypkoDataset(Dataset):
    def __init__(self, fnames, transform):
        self.transform = transform
        self.fnames = fnames
        self.num_samples = len(self.fnames)

    def __getitem__(self, idx):
        fname = self.fnames[idx]
        # 1. Load the image
        img = torchvision.io.read_image(fname)
        # 2. Resize and normalize the images using torchvision.
        img = self.transform(img)
        return img

    def __len__(self):
        return self.num_samples


def get_dataset(root):
    fnames = glob.glob(os.path.join(root, '*'))
    # 1. Resize the image to (64, 64)
    # 2. Linearly map [0, 1] to [-1, 1]
    compose = [
        transforms.ToPILImage(),
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        # transforms.Normalize linearly maps the RGB values of the input image from [0, 1] to [-1, 1].
        transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]
    # torchvision.transforms is PyTorch's image-preprocessing package;
    # Compose is generally used to chain multiple steps together.
    transform = transforms.Compose(compose)
    dataset = CrypkoDataset(fnames, transform)
    return dataset
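Before visualizing anything, here is a quick sanity check of get_dataset (a sketch; it assumes crypko_data.zip unpacked into a faces/ folder under workspace_dir, as in the cell below):

dataset = get_dataset(os.path.join(workspace_dir, 'faces'))
print(len(dataset))                        # number of face images found
print(dataset[0].shape)                    # torch.Size([3, 64, 64])
print(dataset[0].min(), dataset[0].max())  # values lie within [-1, 1]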

Show some images

Note that the values are in the range [-1, 1]; we should shift them back to the valid range, [0, 1], to display them correctly.

dataset = get_dataset(os.path.join(workspace_dir, 'faces'))
images = [dataset[i] for i in range(16)]

# make_grid stitches several images into one grid image. See Figure 1 for its parameters.
grid_img = torchvision.utils.make_grid(images, nrow=4)
plt.figure(figsize=(10, 10))
# permute rearranges (C, H, W) into (H, W, C), the channel layout matplotlib expects.
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()

Figure 1: the parameters of torchvision.utils.make_grid.

Since the images were normalized to [-1, 1] above, we need to map them back to [0, 1] to display them properly:

images = [(dataset[i]+1)/2 for i in range(16)]
grid_img = torchvision.utils.make_grid(images, nrow=4)
plt.figure(figsize=(10,10))
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()

Model

Here, we use DCGAN as the model architecture. Feel free to design your own model structure instead.

Note that the `N` of the input/output shape stands for the batch size.

The way the weights_init() function defined below initializes weights is not the recommended approach; better alternatives are described in the reference below:

Reference: “pytorch网络集中权重初始化的方法” (听我的错不了, CSDN blog).

First, read this blog post: “DCGAN模型讲解及避坑指南” (fmbao, CSDN blog): https://blog.csdn.net/u011268787/article/details/84926246

The post above gives a general idea of how the DCGAN model works. Detailed notes are in the comments of the code below.

def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)


class Generator(nn.Module):
    """
    Input shape: (N, in_dim)
    Output shape: (N, 3, 64, 64)
    """
    def __init__(self, in_dim, dim=64):
        super(Generator, self).__init__()

        def dconv_bn_relu(in_dim, out_dim):
            return nn.Sequential(
                # Transposed convolution ("deconvolution"); with stride 2 it doubles the spatial size.
                nn.ConvTranspose2d(in_dim, out_dim, 5, 2,
                                   padding=2, output_padding=1, bias=False),
                nn.BatchNorm2d(out_dim),
                nn.ReLU()
            )

        self.l1 = nn.Sequential(
            nn.Linear(in_dim, dim * 8 * 4 * 4, bias=False),
            nn.BatchNorm1d(dim * 8 * 4 * 4),
            nn.ReLU()
        )
        self.l2_5 = nn.Sequential(
            dconv_bn_relu(dim * 8, dim * 4),
            dconv_bn_relu(dim * 4, dim * 2),
            dconv_bn_relu(dim * 2, dim),
            nn.ConvTranspose2d(dim, 3, 5, 2, padding=2, output_padding=1),
            nn.Tanh()
        )
        self.apply(weights_init)

    def forward(self, x):
        y = self.l1(x)
        # view() simply reshapes the tensor into 4-D data: the first dimension is
        # batch_size, the second is channels, and the third and fourth are the
        # height and width of each image.
        y = y.view(y.size(0), -1, 4, 4)
        y = self.l2_5(y)
        return y


class Discriminator(nn.Module):
    """
    Input shape: (N, 3, 64, 64)
    Output shape: (N, )
    """
    def __init__(self, in_dim, dim=64):
        super(Discriminator, self).__init__()

        def conv_bn_lrelu(in_dim, out_dim):
            return nn.Sequential(
                nn.Conv2d(in_dim, out_dim, 5, 2, 2),
                nn.BatchNorm2d(out_dim),
                nn.LeakyReLU(0.2),
            )

        """ Medium: Remove the last sigmoid layer for WGAN. """
        self.ls = nn.Sequential(
            nn.Conv2d(in_dim, dim, 5, 2, 2),
            nn.LeakyReLU(0.2),
            conv_bn_lrelu(dim, dim * 2),
            conv_bn_lrelu(dim * 2, dim * 4),
            conv_bn_lrelu(dim * 4, dim * 8),
            nn.Conv2d(dim * 8, 1, 4),
            nn.Sigmoid(),  # Remove this layer for WGAN (see the "Medium" note above).
        )
        self.apply(weights_init)

    def forward(self, x):
        y = self.ls(x)
        y = y.view(-1)
        return y
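To make the N in the shapes above concrete, here is a quick shape check (a sketch on the CPU, assuming the classes above have been defined):

G_test = Generator(in_dim=100)
D_test = Discriminator(in_dim=3)
z = torch.randn(4, 100)    # N = 4 noise vectors
fake = G_test(z)
print(fake.shape)          # torch.Size([4, 3, 64, 64])
print(D_test(fake).shape)  # torch.Size([4])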

Training

### Initialization

- hyperparameters

- model

- optimizer

- dataloader

# Training hyperparameters
batch_size = 64
z_dim = 100  # dimension of the input noise vector; feel free to adjust it
z_sample = Variable(torch.randn(100, z_dim)).cuda()
lr = 1e-4

""" Medium: WGAN, 50 epoch, n_critic=5, clip_value=0.01 """
n_epoch = 1 # 50
# n_critic: training the discriminator several times before each generator update works better.
n_critic = 1 # 5
# clip_value = 0.01

log_dir = os.path.join(workspace_dir, 'logs')
ckpt_dir = os.path.join(workspace_dir, 'checkpoints')
os.makedirs(log_dir, exist_ok=True)
os.makedirs(ckpt_dir, exist_ok=True)

# Model
G = Generator(in_dim=z_dim).cuda()
D = Discriminator(3).cuda()
G.train()
D.train()

# Loss
criterion = nn.BCELoss()

""" Medium: Use RMSprop for WGAN. """
# Optimizer
opt_D = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.5, 0.999))
opt_G = torch.optim.Adam(G.parameters(), lr=lr, betas=(0.5, 0.999))
# opt_D = torch.optim.RMSprop(D.parameters(), lr=lr)
# opt_G = torch.optim.RMSprop(G.parameters(), lr=lr)

# DataLoader
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=2)
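A quick check that the DataLoader yields batches in the shape the training loop expects (a small sketch):

imgs = next(iter(dataloader))
print(imgs.shape)  # torch.Size([64, 3, 64, 64]), i.e. (batch_size, C, H, W)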

Training loop

We store some pictures regularly to monitor the current performance of the Generator, and regularly record checkpoints.

The code below is fairly easy to follow; just read it line by line.

steps = 0
for e, epoch in enumerate(range(n_epoch)):
    progress_bar = qqdm(dataloader)
    for i, data in enumerate(progress_bar):
        imgs = data
        imgs = imgs.cuda()
        bs = imgs.size(0)
        # print("bs", bs)  # debug: check the batch size

        # ============================================
        #  Train D
        # ============================================
        z = Variable(torch.randn(bs, z_dim)).cuda()
        r_imgs = Variable(imgs).cuda()
        f_imgs = G(z)

        """ Medium: Use WGAN Loss. """
        # Label
        r_label = torch.ones((bs)).cuda()
        f_label = torch.zeros((bs)).cuda()

        # Model forwarding
        r_logit = D(r_imgs.detach())
        f_logit = D(f_imgs.detach())

        # Compute the loss for the discriminator.
        r_loss = criterion(r_logit, r_label)
        f_loss = criterion(f_logit, f_label)
        loss_D = (r_loss + f_loss) / 2

        # WGAN Loss
        # loss_D = -torch.mean(D(r_imgs)) + torch.mean(D(f_imgs))

        # Model backwarding
        D.zero_grad()
        loss_D.backward()
        # Update the discriminator.
        opt_D.step()

        """ Medium: Clip weights of discriminator. """
        # for p in D.parameters():
        #    p.data.clamp_(-clip_value, clip_value)

        # ============================================
        #  Train G
        # ============================================
        if steps % n_critic == 0:
            # Generate some fake images.
            z = Variable(torch.randn(bs, z_dim)).cuda()
            f_imgs = G(z)

            # Model forwarding
            f_logit = D(f_imgs)

            """ Medium: Use WGAN Loss"""
            # Compute the loss for the generator.
            loss_G = criterion(f_logit, r_label)
            # WGAN Loss
            # loss_G = -torch.mean(D(f_imgs))

            # Model backwarding
            G.zero_grad()
            loss_G.backward()
            # Update the generator.
            opt_G.step()

        steps += 1

        # Set the info of the progress bar.
        #   Note that the value of the GAN loss is not directly related to
        #   the quality of the generated images.
        progress_bar.set_infos({
            'Loss_D': round(loss_D.item(), 4),
            'Loss_G': round(loss_G.item(), 4),
            'Epoch': e + 1,
            'Step': steps,
        })

    G.eval()
    f_imgs_sample = (G(z_sample).data + 1) / 2.0
    filename = os.path.join(log_dir, f'Epoch_{epoch+1:03d}.jpg')
    torchvision.utils.save_image(f_imgs_sample, filename, nrow=10)
    print(f' | Save some samples to {filename}.')

    # Show generated images in the jupyter notebook.
    grid_img = torchvision.utils.make_grid(f_imgs_sample.cpu(), nrow=10)
    plt.figure(figsize=(10, 10))
    plt.imshow(grid_img.permute(1, 2, 0))
    plt.show()
    G.train()

    if (e + 1) % 5 == 0 or e == 0:
        # Save the checkpoints.
        torch.save(G.state_dict(), os.path.join(ckpt_dir, 'G.pth'))
        torch.save(D.state_dict(), os.path.join(ckpt_dir, 'D.pth'))
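For the medium baseline, the commented-out WGAN pieces above fit together as follows. This is only a sketch combining the notebook's own hints (remove D's final sigmoid, use RMSprop, clip weights); it reuses the G, D, opt_D, opt_G, n_critic, and clip_value defined earlier.

def wgan_step(r_imgs, z, clip_value=0.01):
    # Discriminator step: push D(real) up and D(fake) down.
    f_imgs = G(z)
    loss_D = -torch.mean(D(r_imgs)) + torch.mean(D(f_imgs.detach()))
    D.zero_grad()
    loss_D.backward()
    opt_D.step()
    # Weight clipping crudely enforces the Lipschitz constraint.
    for p in D.parameters():
        p.data.clamp_(-clip_value, clip_value)

    # Generator step: push D(G(z)) up (run once every n_critic D steps).
    loss_G = -torch.mean(D(G(z)))
    G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()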

Inference

Use the trained model to generate anime faces!

import torch

G = Generator(z_dim)
G.load_state_dict(torch.load(os.path.join(ckpt_dir, 'G.pth')))
G.eval()
G.cuda()

### Generate and show some images.

# Generate 1000 images and make a grid to save them.
n_output = 1000
z_sample = Variable(torch.randn(n_output, z_dim)).cuda()
imgs_sample = (G(z_sample).data + 1) / 2.0
log_dir = os.path.join(workspace_dir, 'logs')
filename = os.path.join(log_dir, 'result.jpg')
torchvision.utils.save_image(imgs_sample, filename, nrow=10)

# Show 32 of the images.
grid_img = torchvision.utils.make_grid(imgs_sample[:32].cpu(), nrow=10)
plt.figure(figsize=(10,10))
plt.imshow(grid_img.permute(1, 2, 0))
plt.show()
