PyTorch provides the modules and classes torch.nn, torch.optim, Dataset, and DataLoader for creating and training neural networks. This example trains a basic neural network on the MNIST dataset.

MNIST data setup

We use the classic MNIST dataset, which consists of black-and-white images of hand-drawn digits (between 0 and 9).

Download the dataset

from pathlib import Path
import requestsDATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"PATH.mkdir(parents=True, exist_ok=True)URL = "https://github.com/pytorch/tutorials/raw/master/_static/"
FILENAME = "mnist.pkl.gz"if not (PATH / FILENAME).exists():content = requests.get(URL + FILENAME).content(PATH / FILENAME).open("wb").write(content)

Read the data

The dataset is in numpy array format and has been stored using pickle, a Python-specific format for serializing data.

import pickle
import gzip

with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding="latin-1")

x_train.shape
# (50000, 784)

Each image is 28 x 28, stored as a flattened row of length 784 (= 28 x 28).

Display an image

from matplotlib import pyplot
import numpy as np

pyplot.imshow(x_train[0].reshape((28, 28)), cmap="gray")


Convert the data format

PyTorch works with torch.tensor rather than numpy arrays, so we convert our data:

import torch

x_train, y_train, x_valid, y_valid = map(
    torch.tensor, (x_train, y_train, x_valid, y_valid)
)
n, c = x_train.shape
x_train, x_train.shape, y_train.min(), y_train.max()
print(x_train, y_train)
print(x_train.shape)
print(y_train.min(), y_train.max())

A neural network from scratch

Initialize the weights

import math

weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
bias = torch.zeros(10, requires_grad=True)
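
Dividing by math.sqrt(784) is Xavier initialization, which keeps the initial output scale reasonable. A quick sanity check of the resulting weight scale (the expected value below is just the analytic standard deviation of randn divided by sqrt(784)):

# torch.randn samples from N(0, 1), so after dividing by sqrt(784)
# the standard deviation should be roughly 1/28 ≈ 0.036
print(weights.std())  # expect a value near 0.036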

Define the model

def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)

def model(xb):
    return log_softmax(xb @ weights + bias)

@ stands for the matrix multiplication operation.
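
A minimal sketch showing the equivalence with torch.matmul (the shapes here are just illustrative):

a = torch.randn(2, 3)
b = torch.randn(3, 4)
print(torch.allclose(a @ b, torch.matmul(a, b)))  # True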

Make a prediction

bs = 64  # batch size

xb = x_train[0:bs]  # a mini-batch from x
preds = model(xb)  # predictions
preds[0], preds.shape
print(preds[0], preds.shape)

Compute the loss

def nll(input, target):
    return -input[range(target.shape[0]), target].mean()

loss_func = nll
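
The indexing expression input[range(target.shape[0]), target] picks out, for each row, the log-probability assigned to that row's true class. A small sketch with made-up values:

t = torch.tensor([[-0.1, -2.0],
                  [-1.5, -0.3]])  # log-probabilities for 2 samples, 2 classes
y = torch.tensor([0, 1])          # the true class of each sample
print(t[range(2), y])  # tensor([-0.1000, -0.3000])
print(nll(t, y))       # tensor(0.2000) = -(-0.1 + -0.3) / 2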

Let's check the loss with our random model, so that we can see later whether it improves after a backward pass.

yb = y_train[0:bs]
print(loss_func(preds, yb))
# tensor(2.3088, grad_fn=<NegBackward>)

Compute the accuracy

def accuracy(out, yb):
    preds = torch.argmax(out, dim=1)
    return (preds == yb).float().mean()

print(accuracy(preds, yb))
# tensor(0.0781)

Since we started with random weights, at this stage our predictions are no better than random guessing.

Training loop

For each iteration, we will:

  • select a mini-batch of data (of size bs)
  • use the model to make predictions
  • compute the loss
  • loss.backward() updates the gradients of the model, in this case weights and bias

from IPython.core.debugger import set_trace

lr = 0.5  # learning rate
epochs = 2  # how many epochs to train for

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        #         set_trace()
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        with torch.no_grad():
            weights -= weights.grad * lr
            bias -= bias.grad * lr
            weights.grad.zero_()
            bias.grad.zero_()

Let's check the loss and accuracy and compare them with what we got earlier. We expect the loss to have decreased and the accuracy to have increased, and indeed they have.

print(loss_func(model(xb), yb), accuracy(model(xb), yb))

# tensor(0.0817, grad_fn=<NegBackward>) tensor(1.)

Refactor using nn.Module

from torch import nn
import torch.nn.functional as F
class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(784, 10) / math.sqrt(784))
        self.bias = nn.Parameter(torch.zeros(10))

    def forward(self, xb):
        return xb @ self.weights + self.bias

model = Mnist_Logistic()
loss_func = F.cross_entropy
print(loss_func(model(xb), yb))
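
Note that F.cross_entropy combines log_softmax and negative log-likelihood in a single function, which is why the model's forward no longer applies log_softmax itself. A quick sketch checking this equivalence against our earlier hand-written functions (the logits are made up):

logits = torch.randn(4, 10)           # a hypothetical batch of raw outputs
targets = torch.tensor([3, 1, 0, 7])  # hypothetical labels
print(torch.allclose(F.cross_entropy(logits, targets),
                     nll(log_softmax(logits), targets)))  # True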

Previously, in our training loop, we had to update the value of each parameter by name and manually zero out each parameter's gradient separately.
Now we can take advantage of model.parameters() and model.zero_grad() to make those steps more concise.

def fit():
    for epoch in range(epochs):
        for i in range((n - 1) // bs + 1):
            start_i = i * bs
            end_i = start_i + bs
            xb = x_train[start_i:end_i]
            yb = y_train[start_i:end_i]
            pred = model(xb)
            loss = loss_func(pred, yb)

            loss.backward()
            with torch.no_grad():
                for p in model.parameters():
                    p -= p.grad * lr
                model.zero_grad()

fit()

Let's double-check that our loss has gone down:

print(loss_func(model(xb), yb))

# tensor(0.0827, grad_fn=<NllLossBackward>)

Refactor using nn.Linear

We use the PyTorch class nn.Linear for the linear layer, instead of manually defining and initializing self.weights and self.bias and computing xb @ self.weights + self.bias.
PyTorch also provides torch.optim, a package containing various optimization algorithms. We can use an optimizer's step method to take an optimization step, instead of manually updating each parameter.

from torch import optim

class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(784, 10)

    def forward(self, xb):
        return self.lin(xb)

def get_model():
    model = Mnist_Logistic()
    return model, optim.SGD(model.parameters(), lr=lr)

model, opt = get_model()
print(loss_func(model(xb), yb))

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

print(loss_func(model(xb), yb))

Refactor using DataLoader

The DataLoader automatically gives us each mini-batch.

from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

train_ds = TensorDataset(x_train, y_train)
train_dl = DataLoader(train_ds, batch_size=bs)

model, opt = get_model()

for epoch in range(epochs):
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

print(loss_func(model(xb), yb))

Add validation

So far we have only been trying to build a reasonable training loop for our training data. In practice, you should always also have a validation set, in order to identify whether you are overfitting.

train_ds = TensorDataset(x_train, y_train)
train_dl = DataLoader(train_ds, batch_size=bs, shuffle=True)

valid_ds = TensorDataset(x_valid, y_valid)
valid_dl = DataLoader(valid_ds, batch_size=bs * 2)

model, opt = get_model()

for epoch in range(epochs):
    model.train()
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

    model.eval()
    with torch.no_grad():
        valid_loss = sum(loss_func(model(xb), yb) for xb, yb in valid_dl)

    print(epoch, valid_loss / len(valid_dl))

Create fit() and get_data()

We go through a similar process twice, computing the loss for both the training set and the validation set, so we factor this out into our own function, loss_batch, which computes the loss for one batch.
We pass an optimizer in for the training set and use it to perform backprop. For the validation set we don't pass an optimizer, so the function doesn't perform backprop.

fit trains our model and computes the training and validation losses for each epoch.
get_data returns dataloaders for the training and validation sets.

import numpy as np

def loss_batch(model, loss_func, xb, yb, opt=None):
    loss = loss_func(model(xb), yb)

    if opt is not None:
        loss.backward()
        opt.step()
        opt.zero_grad()

    return loss.item(), len(xb)

def fit(epochs, model, loss_func, opt, train_dl, valid_dl):
    for epoch in range(epochs):
        model.train()
        for xb, yb in train_dl:
            loss_batch(model, loss_func, xb, yb, opt)

        model.eval()
        with torch.no_grad():
            losses, nums = zip(
                *[loss_batch(model, loss_func, xb, yb) for xb, yb in valid_dl]
            )
        val_loss = np.sum(np.multiply(losses, nums)) / np.sum(nums)

        print(epoch, val_loss)

def get_data(train_ds, valid_ds, bs):
    return (
        DataLoader(train_ds, batch_size=bs, shuffle=True),
        DataLoader(valid_ds, batch_size=bs * 2),
    )

# Now the whole process of getting the data loaders and fitting the model
# can be run in 3 lines of code:
train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
model, opt = get_model()
fit(epochs, model, loss_func, opt, train_dl, valid_dl)

CNN

We now switch to a network built from three convolutional layers.

class Mnist_CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1)

    def forward(self, xb):
        xb = xb.view(-1, 1, 28, 28)
        xb = F.relu(self.conv1(xb))
        xb = F.relu(self.conv2(xb))
        xb = F.relu(self.conv3(xb))
        xb = F.avg_pool2d(xb, 4)
        return xb.view(-1, xb.size(1))

lr = 0.1
model = Mnist_CNN()
# Momentum is a variation on stochastic gradient descent that also takes
# previous updates into account, and generally leads to faster training.
opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

fit(epochs, model, loss_func, opt, train_dl, valid_dl)
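
Roughly, with optim.SGD's default settings (no dampening or weight decay), momentum keeps a running velocity for each parameter: the velocity is updated to momentum * v + grad, and the parameter then moves by -lr * v. A minimal standalone sketch of that update rule, using illustrative scalar values:

mu, eta = 0.9, 0.1          # momentum and learning rate (illustrative values)
p, grad, v = 1.0, 0.5, 0.0  # one scalar parameter, its gradient, its velocity buffer
for step in range(3):
    v = mu * v + grad       # the velocity accumulates past gradients
    p -= eta * v            # the parameter moves along the velocity
    print(step, p)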

Sequential

torch.nn has another handy class we can use to simplify our code: Sequential.
A Sequential object runs each of the modules contained within it, in sequence.

class Lambda(nn.Module):
    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)

def preprocess(x):
    return x.view(-1, 1, 28, 28)

model = nn.Sequential(
    Lambda(preprocess),
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AvgPool2d(4),
    Lambda(lambda x: x.view(x.size(0), -1)),
)

opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

fit(epochs, model, loss_func, opt, train_dl, valid_dl)

Wrapping DataLoader

Our CNN is fairly concise, but it only works with MNIST, because:

  • it assumes the input is a 28 x 28 long vector
  • it assumes the final CNN grid size is 4 x 4

First, we can remove the initial Lambda layer by moving the data preprocessing into a generator:

def preprocess(x, y):
    return x.view(-1, 1, 28, 28), y

class WrappedDataLoader:
    def __init__(self, dl, func):
        self.dl = dl
        self.func = func

    def __len__(self):
        return len(self.dl)

    def __iter__(self):
        batches = iter(self.dl)
        for b in batches:
            yield (self.func(*b))

train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
train_dl = WrappedDataLoader(train_dl, preprocess)
valid_dl = WrappedDataLoader(valid_dl, preprocess)

Next, we replace nn.AvgPool2d with nn.AdaptiveAvgPool2d, which allows us to define the size of the output tensor we want, rather than the input tensor we have.

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Lambda(lambda x: x.view(x.size(0), -1)),
)

opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)
fit(epochs, model, loss_func, opt, train_dl, valid_dl)
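
A small sketch of why nn.AdaptiveAvgPool2d(1) decouples the model from the input resolution: the output spatial size is fixed at 1 x 1 regardless of the input's spatial size (the sizes below are arbitrary):

pool = nn.AdaptiveAvgPool2d(1)
print(pool(torch.randn(1, 10, 4, 4)).shape)  # torch.Size([1, 10, 1, 1])
print(pool(torch.randn(1, 10, 7, 7)).shape)  # torch.Size([1, 10, 1, 1])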

Using your GPU

First, check that your GPU is working in PyTorch:

print(torch.cuda.is_available())

Then create a device object:

dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

Update preprocess to move batches to the GPU:

def preprocess(x, y):
    return x.view(-1, 1, 28, 28).to(dev), y.to(dev)

train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
train_dl = WrappedDataLoader(train_dl, preprocess)
valid_dl = WrappedDataLoader(valid_dl, preprocess)

Finally, we can move our model to the GPU.

model.to(dev)
opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

Does it run faster now?

fit(epochs, model, loss_func, opt, train_dl, valid_dl)
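
One simple way to check is to time the call; a minimal sketch using Python's time module:

import time

start = time.time()
fit(epochs, model, loss_func, opt, train_dl, valid_dl)
print(f"elapsed: {time.time() - start:.1f}s")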
