**Sentiment Analysis**

Learning objectives:

- Learn how to build and train a text classification model
- Learn the basics of torchtext
  - BucketIterator
- Learn some basic models in torch.nn
  - Conv2d

In this notebook we will again do sentiment analysis with a PyTorch model and TorchText, i.e. detect whether a piece of text expresses positive or negative sentiment. We will use the IMDb dataset of movie reviews.

We will build models of increasing complexity, in this order:

- a Word Averaging model
- an RNN/LSTM model
- a CNN model

**Preparing the Data**

One of the central concepts in TorchText is the Field. A Field defines how your data will be processed. In our sentiment classification task the data consists of the raw text strings and the two sentiment labels, "pos" and "neg".

The arguments of a Field specify how the data should be processed.

We use the TEXT field to define how the movie reviews are processed, and the LABEL field to handle the two sentiment classes.

Our TEXT field is created with tokenize='spacy', which means the spaCy tokenizer is used to tokenize the English sentences. If we do not set the tokenize argument, the default is to split on whitespace.

Install spaCy:

pip install -U spacy
python -m spacy download en
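As a quick illustration of the difference between spaCy tokenization and whitespace splitting, here is a minimal sketch (it assumes the English spaCy model installed above is available; the example sentence and expected outputs in the comments are illustrative):

import spacy

# Illustrative only: compare spaCy tokenization with a plain whitespace split.
nlp = spacy.load('en')
sentence = "This film isn't great, is it?"
print([tok.text for tok in nlp.tokenizer(sentence)])
# ['This', 'film', 'is', "n't", 'great', ',', 'is', 'it', '?']
print(sentence.split())
# ['This', 'film', "isn't", 'great,', 'is', 'it?']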
LABEL is defined with a LabelField, a special kind of Field for handling labels. We will explain the dtype argument later.

For more on Fields, see https://github.com/pytorch/text/blob/master/torchtext/data/field.py

As before, we set the random seeds so that the experiments are reproducible.

import torch
from torchtext import data

SEED = 1234

torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
torch.backends.cudnn.deterministic = True

TEXT = data.Field(tokenize='spacy')
LABEL = data.LabelField(dtype=torch.float)

TorchText supports many common NLP datasets.

The code below automatically downloads the IMDb dataset and splits it into train/test splits as torchtext.datasets objects, processed with the Fields defined above. The IMDb dataset contains 50,000 movie reviews, each labeled as positive or negative.

from torchtext import datasets
train_data, test_data = datasets.IMDB.splits(TEXT, LABEL)

Check how many examples each split contains.

print(f'Number of training examples: {len(train_data)}')
print(f'Number of testing examples: {len(test_data)}')
Number of training examples: 25000
Number of testing examples: 25000

Look at one example.

print(vars(train_data.examples[0]))
{'text': ['Brilliant', 'adaptation', 'of', 'the', 'novel', 'that', 'made', 'famous', 'the', 'relatives', 'of', 'Chilean', 'President', 'Salvador', 'Allende', 'killed', '.', 'In', 'the', 'environment', 'of', 'a', 'large', 'estate', 'that', 'arises', 'from', 'the', 'ruins', ',', 'becoming', 'a', 'force', 'to', 'abuse', 'and', 'exploitation', 'of', 'outrage', ',', 'a', 'luxury', 'estate', 'for', 'the', 'benefit', 'of', 'the', 'upstart', 'Esteban', 'Trueba', 'and', 'his', 'undeserved', 'family', ',', 'the', 'brilliant', 'Danish', 'director', 'Bille', 'August', 'recreates', ',', 'in', 'micro', ',', 'which', 'at', 'the', 'time', 'would', 'be', 'the', 'process', 'leading', 'to', 'the', 'greatest', 'infamy', 'of', 'his', 'story', 'to', 'the', 'hardened', 'Chilean', 'nation', ',', 'and', 'whose', 'main', 'character', 'would', 'Augusto', 'Pinochet', '(', 'Stephen', 'similarities', 'with', 'it', 'are', 'inevitable', ':', 'recall', ',', 'as', 'an', 'example', ',', 'that', 'image', 'of', 'the', 'senator', 'with', 'dark', 'glasses', 'that', 'makes', 'him', 'the', 'wink', 'to', 'the', 'general', 'to', 'begin', 'making', 'the', 'palace).<br', '/><br', '/>Bille', 'August', 'attends', 'an', 'exceptional', 'cast', 'in', 'the', 'Jeremy', 'protruding', 'Irons', ',', 'whose', 'character', 'changes', 'from', 'arrogance', 'and', 'extreme', 'cruelty', ',', 'the', 'hard', 'lesson', 'that', 'life', 'always', 'brings', 'us', 'to', 'almost', 'force', 'us', 'to', 'change', '.', 'In', 'Esteban', 'fully', 'applies', 'the', 'law', 'of', 'resonance', ',', 'with', 'great', 'wisdom', ',', 'Solomon', 'describes', 'in', 'these', 'words:"The', 'things', 'that', 'freckles', 'are', 'the', 'same', 'punishment', 'that', 'will', 'serve', 'you', '.', '"', '<', 'br', '/><br', '/>Unforgettable', 'Glenn', 'Close', 'playing', 'splint', ',', 'the', 'tainted', 'sister', 'of', 'Stephen', ',', 'whose', 'sin', ',', 'driven', 'by', 'loneliness', ',', 'spiritual', 'and', 'platonic', 'love', 'was', 'the', 'wife', 'of', 'his', 'cruel', 'snowy', 'brother', '.', 'Meryl', 'Streep', 'also', 'brilliant', ',', 'a', 'woman', 'whose', 'name', 'came', 'to', 'him', 'like', 'a', 'glove', 'Clara', '.', 'With', 'telekinetic', 'powers', ',', 'cognitive', 'and', 'mediumistic', ',', 'this', 'hardened', 'woman', ',', 'loyal', 'to', 'his', 'blunt', ',', 'conservative', 'husband', ',', 'is', 'an', 'indicator', 'of', 'character', 'and', 'self', '-', 'control', 'that', 'we', 'wish', 'for', 'ourselves', 'and', 'for', 'all', 'human', 'beings', '.', '<', 'br', '/><br', '/>Every', 'character', 'is', 'a', 'portrait', 'of', 'virtuosity', '(', 'as', 'Blanca', 'worthy', 'rebel', 'leader', 'Pedro', 'Segundo', 'unhappy', '...', ')', 'or', 'a', 'portrait', 'of', 'humiliation', ',', 'like', 'Stephen', 'Jr.', ',', 'the', 'bastard', 'child', 'of', 'Senator', ',', 'who', 'serves', 'as', 'an', 'instrument', 'for', 'the', 'return', 'of', 'the', 'boomerang', '.', '<', 'br', '/><br', '/>The', 'film', 'moves', 'the', 'bowels', ',', 'we', 'recreated', 'some', 'facts', 'that', 'should', 'not', 'ever', 'be', 'repeated', ',', 'but', 'that', 'absurdly', 'still', 'happen', '(', 'Colombia', 'is', 'a', 'sad', 'example', ')', 'and', 'another', 'reminder', 'that', ',', 'against', 'all', ',', 'life', 'is', 'wonderful', 'because', 'there', 'are', 'always', 'people', 'like', 'Isabel', 'Allende', 'and', 'immortalize', 'just', 'Bille', 'August', '.'], 'label': 'pos'}

Since we currently only have train/test splits, we need to create a validation set, which we can do with the .split() method.

By default the data is split 70/30. By passing split_ratio we can change that ratio; split_ratio=0.8 means 80% of the data goes to the training set and 20% to the validation set.

We also pass random_state so that we get the same split every time.

import random
train_data, valid_data = train_data.split(random_state=random.seed(SEED))

Check again how many examples each split contains.

print(f'Number of training examples: {len(train_data)}')
print(f'Number of validation examples: {len(valid_data)}')
print(f'Number of testing examples: {len(test_data)}')
Number of training examples: 17500
Number of validation examples: 7500
Number of testing examples: 25000

The next step is to build the vocabulary, which maps every word to an integer index.

We build the vocabulary from the 25,000 most frequent words, which is what the max_size argument does.

All other words are replaced by the <unk> (unknown) token. In the code below we also pass vectors="glove.6B.100d" so that build_vocab loads pretrained GloVe embeddings (we will copy them into the model's embedding layer later), and unk_init=torch.Tensor.normal_ so that words without a pretrained vector are initialized from a normal distribution.

# TEXT.build_vocab(train_data, max_size=25000)
# LABEL.build_vocab(train_data)
TEXT.build_vocab(train_data, max_size=25000, vectors="glove.6B.100d", unk_init=torch.Tensor.normal_)
LABEL.build_vocab(train_data)
print(f"Unique tokens in TEXT vocabulary: {len(TEXT.vocab)}")
print(f"Unique tokens in LABEL vocabulary: {len(LABEL.vocab)}")
Unique tokens in TEXT vocabulary: 25002
Unique tokens in LABEL vocabulary: 2

When we feed sentences into the model, we feed them one batch at a time, i.e. several sentences at once, and all sentences in a batch must have the same length. To make the lengths equal, TorchText pads shorter sentences with <pad> up to the length of the longest sentence in the batch.
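As a toy illustration of the idea (this is not torchtext's internal code, just a sketch of what padding does):

# Toy sketch of padding; the example sentences are made up
batch = [['I', 'loved', 'it'], ['terrible']]
max_len = max(len(sent) for sent in batch)
padded = [sent + ['<pad>'] * (max_len - len(sent)) for sent in batch]
print(padded)
# [['I', 'loved', 'it'], ['terrible', '<pad>', '<pad>']]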
Now let's look at the most common words in the training set.

print(TEXT.vocab.freqs.most_common(20))
[('the', 201455), (',', 192552), ('.', 164402), ('a', 108963), ('and', 108649), ('of', 100010), ('to', 92873), ('is', 76046), ('in', 60904), ('I', 54486), ('it', 53405), ('that', 49155), ('"', 43890), ("'s", 43151), ('this', 42454), ('-', 36769), ('/><br', 35511), ('was', 34990), ('as', 30324), ('with', 29691)]

We can inspect the vocabulary directly with stoi (string to int) or itos (int to string).

print(TEXT.vocab.itos[:10])
['<unk>', '<pad>', 'the', ',', '.', 'a', 'and', 'of', 'to', 'is']

Check the labels.

print(LABEL.vocab.stoi)
defaultdict(<function _default_unk_index at 0x7fbec39a79d8>, {'neg': 0, 'pos': 1})

The last step of data preparation is to create the iterators. Each iteration returns one batch of examples.

We will use a BucketIterator, which puts sentences of similar length into the same batch so that each batch contains as little padding as possible.

Strictly speaking, the model code in this notebook has one flaw: we feed the <pad> tokens into the model as if they were real input. A better approach would be to mask out, inside the model, the outputs produced by <pad>. In this lesson we keep things simple and use <pad> as ordinary input; since there is not much padding, the models still perform reasonably well.

If we have a GPU, we can also ask for the tensors returned by each iteration to be placed on the GPU.

BATCH_SIZE = 64

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

train_iterator, valid_iterator, test_iterator = data.BucketIterator.splits(
    (train_data, valid_data, test_data),
    batch_size=BATCH_SIZE,
    device=device)
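If you want to see what an iterator returns, a quick, illustrative sanity check is to pull out one batch and inspect its shapes. batch.text is [sent len, batch size] (Field defaults to batch_first=False) and batch.label is [batch size]; the attribute names come from the Fields defined above.

# Illustrative sanity check on one training batch
batch = next(iter(train_iterator))
print(batch.text.shape)   # [sent len, batch size]
print(batch.label.shape)  # [batch size]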

**The Word Averaging Model**

We start with a simple Word Averaging model. The model is very simple: each word is projected to a word embedding vector by an Embedding layer, and the word vectors of all words in a sentence are averaged to obtain a vector representation of the whole sentence. This sentence vector is then passed through a Linear layer for classification.

We use avg_pool2d to do the average pooling. The goal is to average the sentence-length dimension down to 1 while keeping the embedding dimension.

The kernel size of avg_pool2d is (embedded.shape[1], 1), so the sentence-length dimension is collapsed.
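A quick shape check of avg_pool2d on a dummy tensor (illustrative only, with made-up sizes) shows how the sentence-length dimension collapses:

import torch
import torch.nn.functional as F

# Dummy tensor standing in for the permuted embeddings: [batch size, sent len, emb dim]
x = torch.randn(64, 30, 100)
pooled = F.avg_pool2d(x, (x.shape[1], 1))  # kernel spans the whole sentence dimension
print(pooled.shape)             # torch.Size([64, 1, 100])
print(pooled.squeeze(1).shape)  # torch.Size([64, 100])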

import torch.nn as nn
import torch.nn.functional as F

class WordAVGModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim, output_dim, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.fc = nn.Linear(embedding_dim, output_dim)

    def forward(self, text):
        embedded = self.embedding(text)       # [sent len, batch size, emb dim]
        embedded = embedded.permute(1, 0, 2)  # [batch size, sent len, emb dim]
        pooled = F.avg_pool2d(embedded, (embedded.shape[1], 1)).squeeze(1)  # [batch size, embedding_dim]
        return self.fc(pooled)
INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
OUTPUT_DIM = 1
PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

model = WordAVGModel(INPUT_DIM, EMBEDDING_DIM, OUTPUT_DIM, PAD_IDX)

def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'The model has {count_parameters(model):,} trainable parameters')

The model has 2,500,301 trainable parameters

pretrained_embeddings = TEXT.vocab.vectors
model.embedding.weight.data.copy_(pretrained_embeddings)
tensor([[-0.1117, -0.4966,  0.1631,  ...,  1.2647, -0.2753, -0.1325],
        [-0.8555, -0.7208,  1.3755,  ...,  0.0825, -1.1314,  0.3997],
        [-0.0382, -0.2449,  0.7281,  ..., -0.1459,  0.8278,  0.2706],
        ...,
        [-0.7244, -0.0186,  0.0996,  ...,  0.0045, -1.0037,  0.6646],
        [-1.1243,  1.2040, -0.6489,  ..., -0.7526,  0.5711,  1.0081],
        [ 0.0860,  0.1367,  0.0321,  ..., -0.5542, -0.4557, -0.0382]])

UNK_IDX = TEXT.vocab.stoi[TEXT.unk_token]
model.embedding.weight.data[UNK_IDX] = torch.zeros(EMBEDDING_DIM)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMBEDDING_DIM)

Training the Model

import torch.optim as optim

optimizer = optim.Adam(model.parameters())
criterion = nn.BCEWithLogitsLoss()
model = model.to(device)
criterion = criterion.to(device)

Computing the accuracy of the predictions.

def binary_accuracy(preds, y):
    """
    Returns accuracy per batch, i.e. if you get 8/10 right, this returns 0.8, NOT 8
    """
    # round predictions to the closest integer
    rounded_preds = torch.round(torch.sigmoid(preds))
    correct = (rounded_preds == y).float()  # convert into float for division
    acc = correct.sum() / len(correct)
    return acc
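A tiny worked example (with made-up logits) of what the function returns:

# Illustrative check: 3 of the 4 logits round to the correct label -> 0.75
preds = torch.tensor([2.0, -1.0, 0.5, -3.0])
labels = torch.tensor([1.0, 0.0, 0.0, 0.0])
print(binary_accuracy(preds, labels))  # tensor(0.7500)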
def train(model, iterator, optimizer, criterion):
    epoch_loss = 0
    epoch_acc = 0
    model.train()
    for batch in iterator:
        optimizer.zero_grad()
        predictions = model(batch.text).squeeze(1)
        loss = criterion(predictions, batch.label)
        acc = binary_accuracy(predictions, batch.label)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        epoch_acc += acc.item()
    return epoch_loss / len(iterator), epoch_acc / len(iterator)
def evaluate(model, iterator, criterion):
    epoch_loss = 0
    epoch_acc = 0
    model.eval()
    with torch.no_grad():
        for batch in iterator:
            predictions = model(batch.text).squeeze(1)
            loss = criterion(predictions, batch.label)
            acc = binary_accuracy(predictions, batch.label)
            epoch_loss += loss.item()
            epoch_acc += acc.item()
    return epoch_loss / len(iterator), epoch_acc / len(iterator)
import time

def epoch_time(start_time, end_time):
    elapsed_time = end_time - start_time
    elapsed_mins = int(elapsed_time / 60)
    elapsed_secs = int(elapsed_time - (elapsed_mins * 60))
    return elapsed_mins, elapsed_secs
N_EPOCHS = 10

best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):
    start_time = time.time()
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
    end_time = time.time()
    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'wordavg-model.pt')
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')
Epoch: 01 | Epoch Time: 0m 2s
Train Loss: 0.685 | Train Acc: 56.84%
Val. Loss: 0.622 | Val. Acc: 71.09%
Epoch: 02 | Epoch Time: 0m 2s
Train Loss: 0.642 | Train Acc: 71.31%
Val. Loss: 0.510 | Val. Acc: 75.48%
Epoch: 03 | Epoch Time: 0m 2s
Train Loss: 0.573 | Train Acc: 78.31%
Val. Loss: 0.449 | Val. Acc: 79.52%
Epoch: 04 | Epoch Time: 0m 2s
Train Loss: 0.503 | Train Acc: 82.78%
Val. Loss: 0.419 | Val. Acc: 82.72%
Epoch: 05 | Epoch Time: 0m 2s
Train Loss: 0.440 | Train Acc: 85.84%
Val. Loss: 0.408 | Val. Acc: 84.75%
Epoch: 06 | Epoch Time: 0m 2s
Train Loss: 0.389 | Train Acc: 87.59%
Val. Loss: 0.413 | Val. Acc: 86.02%
Epoch: 07 | Epoch Time: 0m 2s
Train Loss: 0.352 | Train Acc: 88.85%
Val. Loss: 0.425 | Val. Acc: 86.92%
Epoch: 08 | Epoch Time: 0m 2s
Train Loss: 0.320 | Train Acc: 89.93%
Val. Loss: 0.440 | Val. Acc: 87.54%
Epoch: 09 | Epoch Time: 0m 2s
Train Loss: 0.294 | Train Acc: 90.74%
Val. Loss: 0.456 | Val. Acc: 88.09%
Epoch: 10 | Epoch Time: 0m 2s
Train Loss: 0.274 | Train Acc: 91.27%
Val. Loss: 0.468 | Val. Acc: 88.49%
import spacy
nlp = spacy.load('en')

def predict_sentiment(sentence):
    tokenized = [tok.text for tok in nlp.tokenizer(sentence)]
    indexed = [TEXT.vocab.stoi[t] for t in tokenized]
    tensor = torch.LongTensor(indexed).to(device)
    tensor = tensor.unsqueeze(1)
    prediction = torch.sigmoid(model(tensor))
    return prediction.item()
predict_sentiment("This film is terrible")
5.568591932965664e-26
predict_sentiment("This film is great")
1.0

**The RNN Model**

Next we swap the model for a recurrent neural network (RNN). RNNs are commonly used to encode a sequence:

$$h_t = \mathrm{RNN}(x_t, h_{t-1})$$

We use the last hidden state $h_T$ to represent the whole sentence, and then pass $h_T$ through a linear transformation $f$ to predict the sentiment of the sentence.
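Before looking at the model code, a small shape check on a dummy bidirectional LSTM (illustrative sizes only, chosen to match the hyperparameters used below) helps make sense of the hidden-state indexing in the forward pass:

import torch
import torch.nn as nn

# Dummy 2-layer bidirectional LSTM on a fake batch of embeddings
rnn = nn.LSTM(input_size=100, hidden_size=256, num_layers=2, bidirectional=True)
x = torch.randn(30, 64, 100)   # [sent len, batch size, emb dim]
output, (hidden, cell) = rnn(x)
print(output.shape)  # torch.Size([30, 64, 512]) = [sent len, batch size, hid dim * num directions]
print(hidden.shape)  # torch.Size([4, 64, 256])  = [num layers * num directions, batch size, hid dim]
# hidden[-2] is the final forward state of the top layer, hidden[-1] the final backward state
print(torch.cat((hidden[-2], hidden[-1]), dim=1).shape)  # torch.Size([64, 512])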

class RNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, n_layers,
                 bidirectional, dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.rnn = nn.LSTM(embedding_dim, hidden_dim, num_layers=n_layers,
                           bidirectional=bidirectional, dropout=dropout)
        self.fc = nn.Linear(hidden_dim * 2, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        embedded = self.dropout(self.embedding(text))  # [sent len, batch size, emb dim]
        output, (hidden, cell) = self.rnn(embedded)
        # output = [sent len, batch size, hid dim * num directions]
        # hidden = [num layers * num directions, batch size, hid dim]
        # cell = [num layers * num directions, batch size, hid dim]
        # concat the final forward (hidden[-2,:,:]) and backward (hidden[-1,:,:]) hidden layers
        # and apply dropout
        hidden = self.dropout(torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim=1))  # [batch size, hid dim * num directions]
        return self.fc(hidden.squeeze(0))
INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
HIDDEN_DIM = 256
OUTPUT_DIM = 1
N_LAYERS = 2
BIDIRECTIONAL = True
DROPOUT = 0.5
PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

model = RNN(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM, N_LAYERS, BIDIRECTIONAL, DROPOUT, PAD_IDX)
print(f'The model has {count_parameters(model):,} trainable parameters')
The model has 4,810,857 trainable parameters
model.embedding.weight.data.copy_(pretrained_embeddings)
UNK_IDX = TEXT.vocab.stoi[TEXT.unk_token]
model.embedding.weight.data[UNK_IDX] = torch.zeros(EMBEDDING_DIM)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMBEDDING_DIM)

print(model.embedding.weight.data)
tensor([[ 0.0000,  0.0000,  0.0000,  ...,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.0000,  0.0000,  0.0000],
        [-0.0382, -0.2449,  0.7281,  ..., -0.1459,  0.8278,  0.2706],
        ...,
        [-0.7244, -0.0186,  0.0996,  ...,  0.0045, -1.0037,  0.6646],
        [-1.1243,  1.2040, -0.6489,  ..., -0.7526,  0.5711,  1.0081],
        [ 0.0860,  0.1367,  0.0321,  ..., -0.5542, -0.4557, -0.0382]],
       device='cuda:0')

Training the RNN Model

optimizer = optim.Adam(model.parameters())
model = model.to(device)
N_EPOCHS = 5
best_valid_loss = float('inf')
for epoch in range(N_EPOCHS):
    start_time = time.time()
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
    end_time = time.time()
    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'lstm-model.pt')
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')

Epoch: 01 | Epoch Time: 1m 29s
Train Loss: 0.676 | Train Acc: 57.69%
Val. Loss: 0.694 | Val. Acc: 53.40%
Epoch: 02 | Epoch Time: 1m 29s
Train Loss: 0.641 | Train Acc: 63.77%
Val. Loss: 0.744 | Val. Acc: 49.22%
Epoch: 03 | Epoch Time: 1m 29s
Train Loss: 0.618 | Train Acc: 65.77%
Val. Loss: 0.534 | Val. Acc: 73.72%
Epoch: 04 | Epoch Time: 1m 30s
Train Loss: 0.634 | Train Acc: 63.79%
Val. Loss: 0.619 | Val. Acc: 66.85%
Epoch: 05 | Epoch Time: 1m 29s
Train Loss: 0.448 | Train Acc: 79.19%
Val. Loss: 0.340 | Val. Acc: 86.63%
You may notice that the loss barely decreases for the first few epochs and the validation accuracy fluctuates; the LSTM only starts to learn properly in the final epoch. This is due to several issues with the model which we'll improve in the next notebook.

Finally, we report the metric we actually care about: the test loss and accuracy, computed with the parameters that gave the best validation loss.

model.load_state_dict(torch.load('lstm-model.pt'))
test_loss, test_acc = evaluate(model, test_iterator, criterion)
print(f'Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}%')

**The CNN Model**

Finally we build a CNN-based classifier. The sentence's embedding matrix is treated like a one-channel image, and we slide convolution filters whose width equals the embedding dimension over windows of 3, 4, and 5 consecutive words. Each feature map is max-pooled over the sentence length, and the pooled features from all filter sizes are concatenated and passed through a linear layer.

class CNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes, output_dim,
                 dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels=1, out_channels=n_filters, kernel_size=(fs, embedding_dim))
            for fs in filter_sizes
        ])
        self.fc = nn.Linear(len(filter_sizes) * n_filters, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        text = text.permute(1, 0)         # [batch size, sent len]
        embedded = self.embedding(text)   # [batch size, sent len, emb dim]
        embedded = embedded.unsqueeze(1)  # [batch size, 1, sent len, emb dim]
        conved = [F.relu(conv(embedded)).squeeze(3) for conv in self.convs]
        # conved_n = [batch size, n_filters, sent len - filter_sizes[n] + 1]
        pooled = [F.max_pool1d(conv, conv.shape[2]).squeeze(2) for conv in conved]
        # pooled_n = [batch size, n_filters]
        cat = self.dropout(torch.cat(pooled, dim=1))
        # cat = [batch size, n_filters * len(filter_sizes)]
        return self.fc(cat)
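A quick shape check of a single convolution branch (dummy sizes, illustrative only) shows why the kernel width equals the embedding dimension and what the pooling produces:

import torch
import torch.nn as nn
import torch.nn.functional as F

# One filter-size branch of the text CNN with made-up sizes
conv = nn.Conv2d(in_channels=1, out_channels=100, kernel_size=(3, 100))
x = torch.randn(64, 1, 30, 100)      # [batch size, 1, sent len, emb dim]
conved = F.relu(conv(x)).squeeze(3)  # [batch size, n_filters, sent len - 3 + 1]
print(conved.shape)                  # torch.Size([64, 100, 28])
pooled = F.max_pool1d(conved, conved.shape[2]).squeeze(2)
print(pooled.shape)                  # torch.Size([64, 100])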
INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
N_FILTERS = 100
FILTER_SIZES = [3,4,5]
OUTPUT_DIM = 1
DROPOUT = 0.5
PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

model = CNN(INPUT_DIM, EMBEDDING_DIM, N_FILTERS, FILTER_SIZES, OUTPUT_DIM, DROPOUT, PAD_IDX)
model.embedding.weight.data.copy_(pretrained_embeddings)
UNK_IDX = TEXT.vocab.stoi[TEXT.unk_token]
model.embedding.weight.data[UNK_IDX] = torch.zeros(EMBEDDING_DIM)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMBEDDING_DIM)
model = model.to(device)
optimizer = optim.Adam(model.parameters())
criterion = nn.BCEWithLogitsLoss()
criterion = criterion.to(device)

N_EPOCHS = 5

best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):
    start_time = time.time()
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
    end_time = time.time()
    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'CNN-model.pt')
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 11s
Train Loss: 0.645 | Train Acc: 62.12%
Val. Loss: 0.485 | Val. Acc: 79.61%
Epoch: 02 | Epoch Time: 0m 11s
Train Loss: 0.423 | Train Acc: 80.59%
Val. Loss: 0.360 | Val. Acc: 84.63%
Epoch: 03 | Epoch Time: 0m 11s
Train Loss: 0.302 | Train Acc: 87.33%
Val. Loss: 0.320 | Val. Acc: 86.59%
Epoch: 04 | Epoch Time: 0m 11s
Train Loss: 0.222 | Train Acc: 91.20%
Val. Loss: 0.306 | Val. Acc: 87.17%
Epoch: 05 | Epoch Time: 0m 11s
Train Loss: 0.161 | Train Acc: 93.99%
Val. Loss: 0.325 | Val. Acc: 86.82%

model.load_state_dict(torch.load('CNN-model.pt'))
test_loss, test_acc = evaluate(model, test_iterator, criterion)
print(f'Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}%')

Test Loss: 0.336 | Test Acc: 85.66%
