The previous part of this tutorial is in my earlier post:

Stanford CS224N: PyTorch Tutorial (Winter '21), Part 2

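Demo: Word Window Classification I

So far we have covered the fundamentals of PyTorch and built a basic network for a toy task. Now we will try to solve an example NLP task. Here is what we will cover:

Data: Creating a Dataset of Batched Tensors
Modeling
Training
Prediction

In this section, our goal is to train a model that finds the words in a sentence that correspond to a LOCATION. A LOCATION will always have a span of 1 (which means that San Francisco will not be recognized as a LOCATION). The task is called Word Window Classification for a reason: instead of letting the model look at only one word in each forward pass, we want it to take the context of the word in question into account. That is, for each word, we want the model to be aware of the surrounding words. Let's dive in!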

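Data

The first task in any machine learning project is to set up the training set. Usually, there is a training corpus we will be using. In NLP tasks, the corpus is generally a .txt or .csv file in which each row corresponds to a sentence or a tabular data point. In our toy task, we will assume that we have already read our data and the corresponding labels into Python lists.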

In [71]:

# Our raw data, which consists of sentences
corpus = [
    "We always come to Paris",
    "The professor is from Australia",
    "I live in Stanford",
    "He comes from Taiwan",
    "The capital of Turkey is Ankara",
]

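Preprocessing

To make it easier for our models to learn, we usually apply some preprocessing steps to our data. This is especially important when dealing with text data. Here are some examples of text preprocessing:

Tokenization: splitting sentences into words.
Lowercasing: changing all letters to lowercase.
Noise removal: removing special characters (such as punctuation).
Stop words removal: removing commonly used words (translator's note: common function words that carry little meaning on their own).

Which preprocessing steps are necessary depends on the task at hand. For example, although removing special characters is useful in some tasks, they may be important in others (for example, if we are dealing with multiple languages). For our task, we will lowercase our words and tokenize them.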
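The tutorial itself only lowercases and tokenizes. As a minimal sketch of the other two steps in the list, noise removal and stop words removal could look like the following. The helper name `clean_sentence` and the tiny `STOP_WORDS` set are assumptions made for illustration and are not part of the original notebook.

```python
import re

# Hypothetical stop word list, chosen only for this illustration.
STOP_WORDS = {"the", "is", "of", "to", "in"}

def clean_sentence(sentence, stop_words=STOP_WORDS):
    # Lowercasing
    lowered = sentence.lower()
    # Noise removal: drop everything that is not a word character or whitespace
    no_noise = re.sub(r"[^\w\s]", "", lowered)
    # Tokenization on whitespace
    tokens = no_noise.split()
    # Stop words removal
    return [tok for tok in tokens if tok not in stop_words]

cleaned = clean_sentence("The capital of Turkey is Ankara!")
print(cleaned)  # ['capital', 'turkey', 'ankara']
```

For the LOCATION task, removing stop words would probably hurt, since words such as to and in are exactly the context clues the window is meant to capture.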

In [72]:

# The preprocessing function we will use to generate our training examples.
# It is a simple one: we lowercase the letters and then tokenize the words.
def preprocess_sentence(sentence):
    return sentence.lower().split()

# Create our training set
train_sentences = [preprocess_sentence(sent) for sent in corpus]
train_sentences

Out [72]:

[['we', 'always', 'come', 'to', 'paris'],['the', 'professor', 'is', 'from', 'australia'],['i', 'live', 'in', 'stanford'],['he', 'comes', 'from', 'taiwan'],['the', 'capital', 'of', 'turkey', 'is', 'ankara']]

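For each training example we have, we should also have a corresponding label. Recall that the goal of our model is to determine which words correspond to a LOCATION. That is, we want the model to output 0 for every word that is not a LOCATION and 1 for every word that is.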

In [73]:

# Set of locations that appear in our corpus
locations = set(["australia", "ankara", "paris", "stanford", "taiwan", "turkey"])

# Our train labels
train_labels = [[1 if word in locations else 0 for word in sent] for sent in train_sentences]
train_labels

Out [73]:

[[0, 0, 0, 0, 1],[0, 0, 0, 0, 1],[0, 0, 0, 1],[0, 0, 0, 1],[0, 0, 0, 1, 0, 1]]

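Converting Words to Embeddings

Let us look at our training data a little more closely. Each data point we have is a sequence of words. On the other hand, we know that machine learning models work with numbers in vectors. How are we going to turn words into numbers? You may be thinking of embeddings, and you are right!

Imagine that we have an embedding lookup table E, where each row corresponds to an embedding. That is, each word in our vocabulary has a corresponding embedding row i in this table. Whenever we want to find the embedding of a word, we follow these steps:

Find the index i of the word in the embedding table: word->index.
Index into the embedding table and get the embedding: index->embedding.

Let us look at the first step. We should assign every word in our vocabulary to a corresponding index. We can do so as follows:

Find all the unique words in our corpus.
Assign an index to each.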

In [74]:

# Find all the unique words in our corpus (a set removes duplicates)
vocabulary = set(w for s in train_sentences for w in s)
vocabulary

Out [74]:

{'always','ankara','australia','capital','come','comes','from','he','i','in','is','live','of','paris','professor','stanford','taiwan','the','to','turkey','we'}

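Our vocabulary now has all the words in our corpus. At test time, on the other hand, we may see words that are not in our vocabulary. If we can find a way to represent unknown words, our model can still reason about whether they are a LOCATION or not, since we are also looking at the neighboring words for each prediction.

We introduce a special token, <unk>, to handle out-of-vocabulary words. We could pick another string for the unknown token if we wanted. The only requirement is that the token be unique: we should use it only for unknown words. We will also add this special token to our vocabulary.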

In [75]:

# Add the unknown token to our vocabulary
vocabulary.add("<unk>")

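Earlier we mentioned that our task is called Word Window Classification because our model looks at the surrounding words in addition to the given word when it needs to make a prediction.

For example, take the sentence "We always come to Paris". The corresponding training label for this sentence is 0, 0, 0, 0, 1, since only the last word, Paris, is a LOCATION. In one pass (meaning one call to forward()), our model will try to generate the correct label for one word. Say our model is trying to generate the correct label, 1, for Paris. If we only let the model see Paris and nothing else, we miss the important information that the word to often appears with LOCATIONs.

Word windows allow our model to consider the N words before and the N words after each word when making a prediction. In the Paris example, a window size of 1 means that our model looks at the words immediately before and after Paris, which are to and, well, nothing. This raises another issue: Paris is at the end of our sentence, so there is no word after it. Remember that we define the input dimensions of our PyTorch models when we initialize them. If we set the window size to 1, our model accepts 3 words in every pass. We cannot have it receive only 2 words from time to time.

The solution is to introduce a special token, such as <pad>, that is added to our sentences to make sure that every word has a valid window around it. As with the <unk> token, we could pick another string for the pad token if we wanted, as long as we use it for this one purpose only.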

In [76]:

# Add the <pad> token to our vocabulary
vocabulary.add("<pad>")

# Function that pads the given sentence.
# We are introducing this function here as an example;
# we will be utilizing it later in the tutorial.
def pad_window(sentence, window_size, pad_token="<pad>"):
    window = [pad_token] * window_size
    return window + sentence + window

# Show padding example
window_size = 2
pad_window(train_sentences[0], window_size=window_size)

Out [76]:

['<pad>', '<pad>', 'we', 'always', 'come', 'to', 'paris', '<pad>', '<pad>']
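To see why the padding guarantees a valid window for every word, the windows can be listed explicitly in plain Python. This is only an illustrative sketch: `pad_window` is copied from the cell above so the block runs on its own, while `word_windows` is a hypothetical helper that does not appear in the original notebook.

```python
def pad_window(sentence, window_size, pad_token="<pad>"):
    window = [pad_token] * window_size
    return window + sentence + window

def word_windows(sentence, window_size):
    # Hypothetical helper: one window of 2 * window_size + 1 tokens per word,
    # centered on that word.
    padded = pad_window(sentence, window_size)
    width = 2 * window_size + 1
    return [padded[i:i + width] for i in range(len(sentence))]

sentence = ["we", "always", "come", "to", "paris"]
windows = word_windows(sentence, window_size=1)
for center, window in zip(sentence, windows):
    print(center, window)
# The last line printed is: paris ['to', 'paris', '<pad>']
```

Every word, including the first and the last, now sits at the center of a window of the same width.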

Now that our vocabulary is ready, let us assign an index to each of our words.

In [77]:

# We are just converting our vocabulary to a list to be able to index into it.
# Sorting is not necessary; we sort to show an ordered word_to_ix dictionary.
# That being said, we will see that having the index for the padding token
# be 0 is convenient, as some PyTorch functions use it as a default value,
# such as nn.utils.rnn.pad_sequence, which we will cover in a bit.
ix_to_word = sorted(list(vocabulary))

# Creating a dictionary to find the index of a given word
word_to_ix = {word: ind for ind, word in enumerate(ix_to_word)}
word_to_ix

Out [77]:

{'<pad>': 0,'<unk>': 1,'always': 2,'ankara': 3,'australia': 4,'capital': 5,'come': 6,'comes': 7,'from': 8,'he': 9,'i': 10,'in': 11,'is': 12,'live': 13,'of': 14,'paris': 15,'professor': 16,'stanford': 17,'taiwan': 18,'the': 19,'to': 20,'turkey': 21,'we': 22}
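The remark about index 0 can be made concrete. `nn.utils.rnn.pad_sequence` pads with 0 unless a different `padding_value` is given, so when `<pad>` maps to index 0 the default already fills with the pad index. A small sketch, using made-up index sequences rather than the tutorial's data:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two index sequences of different lengths (illustrative values only)
seqs = [torch.tensor([22, 2, 6]), torch.tensor([19, 16])]

# No padding_value given: the shorter sequence is filled with 0
padded = pad_sequence(seqs, batch_first=True)
print(padded)
```

The second row comes back as 19, 16, 0, which reads as "19, 16, <pad>" under the mapping above.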

Great! We are ready to convert our training sentences into sequences of indices corresponding to each token.

In [78]:

# Given a sentence of tokens, return the corresponding indices
def convert_token_to_indices(sentence, word_to_ix):
    indices = []
    for token in sentence:
        # Check if the token is in our vocabulary. If it is, get its index.
        # If not, get the index for the unknown token.
        if token in word_to_ix:
            index = word_to_ix[token]
        else:
            index = word_to_ix["<unk>"]
        indices.append(index)
    return indices

# More compact version of the same function
def _convert_token_to_indices(sentence, word_to_ix):
    return [word_to_ix.get(token, word_to_ix["<unk>"]) for token in sentence]

# Show an example
example_sentence = ["we", "always", "come", "to", "kuwait"]
example_indices = convert_token_to_indices(example_sentence, word_to_ix)
restored_example = [ix_to_word[ind] for ind in example_indices]

print(f"Original sentence is: {example_sentence}")
print(f"Going from words to indices: {example_indices}")
print(f"Going from indices to words: {restored_example}")

Original sentence is: ['we', 'always', 'come', 'to', 'kuwait']
Going from words to indices: [22, 2, 6, 20, 1]
Going from indices to words: ['we', 'always', 'come', 'to', '<unk>']

In the example above, kuwait shows up as <unk> because it is not in the vocabulary. Let us convert our train_sentences to example_padded_indices.

In [79]:

# Converting our sentences to indices
example_padded_indices = [convert_token_to_indices(s, word_to_ix) for s in train_sentences]
example_padded_indices

Out [79]:

[[22, 2, 6, 20, 15],[19, 16, 12, 8, 4],[10, 13, 11, 17],[9, 7, 8, 18],[19, 5, 14, 21, 12, 3]]

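Now that we have an index for each word in our vocabulary, we can create an embedding table with the nn.Embedding class in PyTorch. It is called as nn.Embedding(num_words, embedding_dimension), where num_words is the number of words in our vocabulary and embedding_dimension is the dimension of the embeddings we want. There is nothing fancy about nn.Embedding: it is just a wrapper class around a trainable N x E tensor, where N is the number of words in our vocabulary and E is the number of embedding dimensions. This table is initially random, but it changes over time: as we train our network, the gradients are backpropagated all the way to the embedding layer, and hence our word embeddings are updated. We will initialize the embedding layer we use for our model inside the model itself, but here is an example first.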

In [80]:

# Imports needed by this cell and the ones that follow
import torch
import torch.nn as nn

# Creating an embedding table for our words
embedding_dim = 5
embeds = nn.Embedding(len(vocabulary), embedding_dim)

# Printing the parameters in our embedding table
list(embeds.parameters())

Out [80]:

[Parameter containing:
 tensor([[-0.5421,  0.6919,  0.8236, -1.3510,  1.4048],
         [ 1.2983,  1.4740,  0.1002, -0.5475,  1.0871],
         [ 1.4604, -1.4934, -0.4363, -0.3231, -1.9746],
         [ 0.8021,  1.5121,  0.8239,  0.9865, -1.3801],
         [ 0.3502, -0.5920,  0.9295,  0.6062, -0.6258],
         [ 0.5038, -1.0187,  0.2860,  0.3231, -1.2828],
         [ 1.5232, -0.5983, -0.4971, -0.5137,  1.4319],
         [ 0.3826,  0.6501, -0.3948,  1.3998, -0.5133],
         [-0.1728, -0.7658,  0.2873, -2.1812,  0.9506],
         [-0.5617,  0.4552,  0.0618, -1.7503,  0.2192],
         [-0.5405,  0.7887, -0.9843, -0.6110,  0.6391],
         [ 0.6581, -0.7067,  1.3208,  1.3860, -1.5113],
         [ 1.1594,  0.4977, -1.9175,  0.0916,  0.0085],
         [ 0.3317,  1.8169,  0.0802, -0.1456, -0.7304],
         [ 0.4997, -1.4895,  0.1237, -0.4121,  0.8909],
         [ 0.6732,  0.4117, -0.5378,  0.6632, -2.7096],
         [-0.4580, -0.9436, -1.6345,  0.1284, -1.6147],
         [-0.3537,  1.9635,  1.0702, -0.1894, -0.8822],
         [-0.4057, -1.2033, -0.7083,  0.4087, -1.1708],
         [-0.6373,  0.5272,  1.8711, -0.5865, -0.7643],
         [ 0.4714, -2.5822,  0.4338,  0.1537, -0.7650],
         [-2.1828,  1.3178,  1.3833,  0.5018, -1.7209],
         [-0.5354,  0.2153, -0.1482,  0.3903,  0.0900]], requires_grad=True)]

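To get the embedding for a word in our vocabulary, all we need to do is create a lookup tensor. The lookup tensor is just a tensor containing the indices we want to look up. The nn.Embedding class expects an index tensor of type LongTensor, so we should create our tensor accordingly.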

In [81]:

# Get the embedding for the word Paris
index = word_to_ix["paris"]
index_tensor = torch.tensor(index, dtype=torch.long)
paris_embed = embeds(index_tensor)
paris_embed

Out [81]:

tensor([ 0.6732,  0.4117, -0.5378,  0.6632, -2.7096],grad_fn=<EmbeddingBackward>)

In [82]:

# We can also get multiple embeddings at once
index_paris = word_to_ix["paris"]
index_ankara = word_to_ix["ankara"]
indices = [index_paris, index_ankara]
indices_tensor = torch.tensor(indices, dtype=torch.long)
embeddings = embeds(indices_tensor)
embeddings

Out [82]:

tensor([[ 0.6732,  0.4117, -0.5378,  0.6632, -2.7096],
        [ 0.8021,  1.5121,  0.8239,  0.9865, -1.3801]],
       grad_fn=<EmbeddingBackward>)

Usually, we define the embedding layer as part of our model, which you will see in later parts of this notebook.
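The earlier claim that gradients flow back into the embedding table can be checked directly. The following is a minimal sketch with a made-up table of 6 rows and 3 dimensions and a dummy loss (the sum of the looked-up embeddings); none of these values come from the tutorial.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A small embedding table: 6 "words", 3 dimensions each (illustrative sizes)
embeds = nn.Embedding(num_embeddings=6, embedding_dim=3)

# Look up rows 1 and 4 and backpropagate a dummy loss
indices = torch.tensor([1, 4], dtype=torch.long)
loss = embeds(indices).sum()
loss.backward()

# The gradient has the same shape as the table, and only the rows that
# were looked up receive a non-zero gradient.
grad = embeds.weight.grad
rows_with_grad = grad.abs().sum(dim=1).nonzero().flatten().tolist()
print(embeds.weight.shape)  # torch.Size([6, 3])
print(rows_with_grad)       # [1, 4]
```

An optimizer step after this backward pass would therefore change only the embeddings of the words that appeared in the batch.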

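Batching Sentences

We learned about batches in class. Waiting for the whole training corpus to be processed before making an update is costly. On the other hand, updating the parameters after every training example makes the loss less stable between updates. To combat these issues, we instead update our parameters after training on a batch of data. This gives us a better estimate of the gradient of the global loss. In this section, we will learn how to organize our data into batches using the torch.utils.data.DataLoader class.

We call the DataLoader class as follows: DataLoader(data, batch_size=batch_size, shuffle=True, collate_fn=collate_fn). The batch_size parameter determines the number of examples per batch. In every epoch, we iterate over all the batches using the DataLoader. The order of the batches is deterministic by default, but we can ask the DataLoader to reshuffle the examples at every epoch by setting the shuffle parameter to True. This way we make sure that we do not encounter a bad batch multiple times.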
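A minimal sketch of these two behaviours, using a toy dataset of five integers instead of sentences. The seeded `torch.Generator` is an assumption added here only to make the shuffled order reproducible; the tutorial does not use one.

```python
import torch
from torch.utils.data import DataLoader

data = list(range(5))

# Default: deterministic order, and the last batch may be smaller
ordered = [batch.tolist() for batch in DataLoader(data, batch_size=2)]
print(ordered)  # [[0, 1], [2, 3], [4]]

# shuffle=True: the examples are reshuffled before being split into batches
generator = torch.Generator().manual_seed(0)
loader = DataLoader(data, batch_size=2, shuffle=True, generator=generator)
shuffled = [batch.tolist() for batch in loader]
print(shuffled)
```

Whatever order the shuffled run produces, every example still appears exactly once per epoch.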

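If provided, the DataLoader passes the batches it prepares to collate_fn. We can write a custom function to pass to the collate_fn parameter in order to print statistics about our batches or to perform extra processing. In our case, we will use collate_fn to:

Window-pad our training sentences.
Convert the words in the training examples to indices.
Pad the training examples so that all the sentences and labels have the same length. Similarly, we also need to pad the labels. This creates an issue, because when calculating the loss we need to know the actual number of words in a given example. We will also keep track of this number in the function we pass to the collate_fn parameter.

Because our version of the collate_fn function needs access to our word_to_ix dictionary (so that it can turn words into indices), we will make use of functools.partial in Python, which binds the arguments we give it to the function.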
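As a quick illustration of what `functools.partial` does, the sketch below binds two keyword arguments ahead of time so that the resulting callable takes only the batch, which is the signature the DataLoader expects from a collate_fn. The function `describe` and its inputs are hypothetical stand-ins, not the tutorial's collate function.

```python
from functools import partial

def describe(batch, window_size, word_to_ix):
    # Hypothetical stand-in for a collate function with extra arguments
    return f"{len(batch)} examples, window_size={window_size}, vocab={len(word_to_ix)}"

# Bind the extra arguments; the result only needs the batch
collate = partial(describe, window_size=2, word_to_ix={"<pad>": 0, "<unk>": 1})

result = collate([(["we"], [0]), (["paris"], [1])])
print(result)  # 2 examples, window_size=2, vocab=2
```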

In [83]:

from torch.utils.data import DataLoader
from functools import partial

def custom_collate_fn(batch, window_size, word_to_ix):
    # Break our batch into the training examples (x) and labels (y).
    # We are turning our x and y into tensors because the nn.utils.rnn.pad_sequence
    # method expects tensors. This is also useful since our model will be
    # expecting tensor inputs.
    x, y = zip(*batch)

    # Now we need to window pad our training examples. We have already defined a
    # function to handle window padding. We are including it here again so that
    # everything is in one place.
    def pad_window(sentence, window_size, pad_token="<pad>"):
        window = [pad_token] * window_size
        return window + sentence + window

    # Pad the train examples.
    x = [pad_window(s, window_size=window_size) for s in x]

    # Now we need to turn words in our training examples to indices. We are
    # copying the function defined earlier for the same reason as above.
    def convert_tokens_to_indices(sentence, word_to_ix):
        return [word_to_ix.get(token, word_to_ix["<unk>"]) for token in sentence]

    # Convert the train examples into indices.
    x = [convert_tokens_to_indices(s, word_to_ix) for s in x]

    # We will now pad the examples so that the lengths of all the examples in
    # one batch are the same, making it possible to do matrix operations.
    # We set the batch_first parameter to True so that the returned matrix has
    # the batch as the first dimension.
    pad_token_ix = word_to_ix["<pad>"]

    # pad_sequence expects tensors as input, so we turn each x into one
    x = [torch.LongTensor(x_i) for x_i in x]
    x_padded = nn.utils.rnn.pad_sequence(x, batch_first=True, padding_value=pad_token_ix)

    # We will also pad the labels. Before doing so, we will record the number
    # of labels so that we know how many words existed in each example.
    lengths = [len(label) for label in y]
    lengths = torch.LongTensor(lengths)

    y = [torch.LongTensor(y_i) for y_i in y]
    y_padded = nn.utils.rnn.pad_sequence(y, batch_first=True, padding_value=0)

    # We are now ready to return our variables. The order we return our variables
    # here will match the order we read them in our training loop.
    return x_padded, y_padded, lengths

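This function looks long, but it does not have to be. Check out the alternative version below, where we remove the extra function declarations and comments.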

In [84]:

def _custom_collate_fn(batch, window_size, word_to_ix):
    # Prepare the datapoints
    # (pad_window and convert_token_to_indices are the helpers defined earlier)
    x, y = zip(*batch)
    x = [pad_window(s, window_size=window_size) for s in x]
    x = [convert_token_to_indices(s, word_to_ix) for s in x]

    # Pad x so that all the examples in the batch have the same size
    pad_token_ix = word_to_ix["<pad>"]
    x = [torch.LongTensor(x_i) for x_i in x]
    x_padded = nn.utils.rnn.pad_sequence(x, batch_first=True, padding_value=pad_token_ix)

    # Pad y and record the length
    lengths = [len(label) for label in y]
    lengths = torch.LongTensor(lengths)
    y = [torch.LongTensor(y_i) for y_i in y]
    y_padded = nn.utils.rnn.pad_sequence(y, batch_first=True, padding_value=0)

    return x_padded, y_padded, lengths

Now we can see the DataLoader in action.

In [85]:

# Parameters to be passed to the DataLoader
data = list(zip(train_sentences, train_labels))
batch_size = 2
shuffle = True
window_size = 2
collate_fn = partial(custom_collate_fn, window_size=window_size, word_to_ix=word_to_ix)

# Instantiate the DataLoader
loader = DataLoader(data, batch_size=batch_size, shuffle=shuffle, collate_fn=collate_fn)

# Go through one loop
counter = 0
for batched_x, batched_y, batched_lengths in loader:
    print(f"Iteration {counter}")
    print("Batched Input:")
    print(batched_x)
    print("Batched Labels:")
    print(batched_y)
    print("Batched Lengths:")
    print(batched_lengths)
    print("")
    counter += 1

Iteration 0
Batched Input:
tensor([[ 0,  0, 22,  2,  6, 20, 15,  0,  0],
        [ 0,  0, 19, 16, 12,  8,  4,  0,  0]])
Batched Labels:
tensor([[0, 0, 0, 0, 1],
        [0, 0, 0, 0, 1]])
Batched Lengths:
tensor([5, 5])

Iteration 1
Batched Input:
tensor([[ 0,  0, 19,  5, 14, 21, 12,  3,  0,  0],
        [ 0,  0, 10, 13, 11, 17,  0,  0,  0,  0]])
Batched Labels:
tensor([[0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0, 0]])
Batched Lengths:
tensor([6, 4])

Iteration 2
Batched Input:
tensor([[ 0,  0,  9,  7,  8, 18,  0,  0]])
Batched Labels:
tensor([[0, 0, 0, 1]])
Batched Lengths:
tensor([4])
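The lengths are returned because padded label positions should not count toward the loss. One way this could be handled, shown here only as a sketch and not as the tutorial's own training code, is to build a boolean mask from the lengths. The tensors below copy the values printed for Iteration 1; the names `positions`, `mask`, and `real_labels` are hypothetical.

```python
import torch

# Values copied from the "Iteration 1" output above
batched_y = torch.tensor([[0, 0, 0, 1, 0, 1],
                          [0, 0, 0, 1, 0, 0]])
batched_lengths = torch.tensor([6, 4])

# Position indices 0..max_len-1, compared against each example's length
positions = torch.arange(batched_y.size(1)).unsqueeze(0)  # shape (1, 6)
mask = positions < batched_lengths.unsqueeze(1)           # shape (2, 6)

# Keep only the labels that belong to real words
real_labels = batched_y[mask]
print(mask)
print(real_labels)
```

The second row of the mask is False in its last two positions, which are exactly the two zeros that padding added to the labels of the four-word sentence.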

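The batched input tensors you see above will be passed into our model. On the other hand, we said at the beginning that our model is a window classifier. The way our input tensors are currently formatted, all the words of a sentence sit in a single data point. When we pass this input to our model, it needs to create the windows for each word, predict for each window whether its center word is a LOCATION, put the predictions together, and return them.

We could avoid this problem by formatting our data beforehand, breaking it into windows. In this example, however, we will instead have our model take care of the formatting.

Given that our window_size is N, we want our model to make a prediction for every window of 2N+1 tokens. That is, if we have an input with 9 tokens and a window_size of 2, we want our model to return 5 predictions. This makes sense, because before we padded it with 2 tokens on each side, our input also had 5 tokens!

We can create these windows with for loops, but there is a faster PyTorch alternative: the unfold(dimension, size, step) method. We can create the windows we need with this method as follows: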

In [86]:

# Print the original tensor
print("Original Tensor: ")
print(batched_x)
print("")

# Create the 2 * 2 + 1 chunks
chunk = batched_x.unfold(1, window_size*2 + 1, 1)
print("Windows: ")
print(chunk)

Original Tensor:
tensor([[ 0,  0,  9,  7,  8, 18,  0,  0]])

Windows:
tensor([[[ 0,  0,  9,  7,  8],
         [ 0,  9,  7,  8, 18],
         [ 9,  7,  8, 18,  0],
         [ 7,  8, 18,  0,  0]]])
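For comparison, here is a sketch of the for-loop version mentioned above, checked against unfold on the same tensor. A padded row of length L + 2N yields (L + 2N) - (2N + 1) + 1 = L windows, one per original word. The variable names `width`, `num_windows`, and `loop_windows` are introduced here for illustration.

```python
import torch

window_size = 2
batched_x = torch.tensor([[0, 0, 9, 7, 8, 18, 0, 0]])

width = 2 * window_size + 1                  # 5 tokens per window
num_windows = batched_x.size(1) - width + 1  # 8 - 5 + 1 = 4

# Slice out one window per start position, then stack along a new dimension
loop_windows = torch.stack(
    [batched_x[:, start:start + width] for start in range(num_windows)],
    dim=1,
)

unfolded = batched_x.unfold(1, width, 1)
print(num_windows)                          # 4
print(torch.equal(loop_windows, unfolded))  # True
```

Both results have shape (batch, number of windows, window width), so unfold can replace the loop without any change to what comes after it.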

(Part 4: Demo: Word Window Classification II)
