# Background

From the *tensorflow_cookbook* repository on GitHub: https://github.com/nfmcclure/tensorflow_cookbook/tree/master/09_Recurrent_Neural_Networks

Implementing an LSTM Model for Text Generation

We show how to implement an LSTM (Long Short-Term Memory) RNN for Shakespeare-style language generation, using a word-level vocabulary.

# Code

We start by loading the necessary libraries, resetting the default computational graph, and starting a graph session.

```python
# Implementing an LSTM RNN Model
# ------------------------------
# Here we implement an LSTM model on a data set of Shakespeare's works.

import os
import re
import string
import requests
import numpy as np
import collections
import random
import pickle
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops

ops.reset_default_graph()

# Start a computational graph session
sess = tf.Session()
```
Next, we set the algorithm and data processing parameters:

- `min_word_freq`: only attempt to model words that appear at least 5 times.
- `rnn_size`: size of our RNN (equal to the embedding size).
- `epochs`: number of epochs to cycle through the data.
- `batch_size`: how many examples to train on at once.
- `learning_rate`: the learning rate (convergence parameter).
- `training_seq_len`: the length of the surrounding word group (e.g. 10 = 5 on each side).
- `embedding_size`: must be equal to `rnn_size`.
- `save_every`: how often to save the model.
- `eval_every`: how often to evaluate the model.
- `prime_texts`: list of test sentences.
```python
# Set RNN Parameters
min_word_freq = 5  # Trim the less frequent words off
rnn_size = 128  # RNN Model size
epochs = 10  # Number of epochs to cycle through data
batch_size = 100  # Train on this many examples at once
learning_rate = 0.001  # Learning rate
training_seq_len = 50  # How long of a word group to consider
embedding_size = rnn_size  # Word embedding size
save_every = 500  # How often to save model checkpoints
eval_every = 50  # How often to evaluate the test sentences
prime_texts = ['thou art more', 'to be or not to', 'wherefore art thou']

# Download/store Shakespeare data
data_dir = 'temp'
data_file = 'shakespeare.txt'
model_path = 'shakespeare_model'
full_model_dir = os.path.join(data_dir, model_path)

# Declare punctuation to remove: everything except hyphens and apostrophes
punctuation = string.punctuation
punctuation = ''.join([x for x in punctuation if x not in ['-', "'"]])

# Make model directory (os.makedirs creates directories recursively)
if not os.path.exists(full_model_dir):
    os.makedirs(full_model_dir)

# Make data directory
if not os.path.exists(data_dir):
    os.makedirs(data_dir)
```
Download the data if we don't have it saved already. The data comes from the Gutenberg Project.

```python
print('Loading Shakespeare Data')
# Check if file is downloaded
if not os.path.isfile(os.path.join(data_dir, data_file)):
    print('Not found, downloading Shakespeare texts from www.gutenberg.org')
    shakespeare_url = 'http://www.gutenberg.org/cache/epub/100/pg100.txt'
    # Get Shakespeare text
    response = requests.get(shakespeare_url)
    shakespeare_file = response.content
    # Decode binary into string
    s_text = shakespeare_file.decode('utf-8')
    # Drop first few descriptive paragraphs
    s_text = s_text[7675:]
    # Remove newlines
    s_text = s_text.replace('\r\n', '')
    s_text = s_text.replace('\n', '')
    # Write to file
    with open(os.path.join(data_dir, data_file), 'w') as out_conn:
        out_conn.write(s_text)
else:
    # If file has been saved, load from that file
    with open(os.path.join(data_dir, data_file), 'r') as file_conn:
        s_text = file_conn.read().replace('\n', '')

# Clean text
print('Cleaning Text')
s_text = re.sub(r'[{}]'.format(punctuation), ' ', s_text)
s_text = re.sub(r'\s+', ' ', s_text).strip().lower()
print('Done loading/cleaning.')
```

Output:

```
Loading Shakespeare Data
Cleaning Text
Done loading/cleaning.
```
Define a function to build a word processing dictionary (word -> index).

```python
# Build word vocabulary function
def build_vocab(text, min_freq):
    word_counts = collections.Counter(text.split(' '))
    # Limit word counts to those more frequent than the cutoff
    word_counts = {key: val for key, val in word_counts.items() if val > min_freq}
    # Create vocab --> index mapping
    words = word_counts.keys()
    vocab_to_ix_dict = {key: (i_x + 1) for i_x, key in enumerate(words)}
    # Add unknown key --> 0 index
    vocab_to_ix_dict['unknown'] = 0
    # Create index --> vocab mapping
    ix_to_vocab_dict = {val: key for key, val in vocab_to_ix_dict.items()}
    return ix_to_vocab_dict, vocab_to_ix_dict
```
Now we can build the index-vocabulary from the Shakespeare data.

```python
# Build Shakespeare vocabulary
print('Building Shakespeare Vocab')
ix2vocab, vocab2ix = build_vocab(s_text, min_word_freq)
vocab_size = len(ix2vocab) + 1
print('Vocabulary Length = {}'.format(vocab_size))
# Sanity check
assert(len(ix2vocab) == len(vocab2ix))

# Convert text to word vectors
s_text_words = s_text.split(' ')
s_text_ix = []
for ix, x in enumerate(s_text_words):
    try:
        s_text_ix.append(vocab2ix[x])
    except KeyError:
        s_text_ix.append(0)
s_text_ix = np.array(s_text_ix)
```
Output:

```
Building Shakespeare Vocab
Vocabulary Length = 8009
```

We define the LSTM model. The methods of interest are the __init__() method [*1], which defines all the model variables and operations, and the sample() method, which takes in a sample word and loops through to generate text.
```python
# Define LSTM RNN Model
class LSTM_Model():
    def __init__(self, embedding_size, rnn_size, batch_size, learning_rate,
                 training_seq_len, vocab_size, infer_sample=False):
        self.embedding_size = embedding_size
        self.rnn_size = rnn_size
        self.vocab_size = vocab_size
        self.infer_sample = infer_sample
        self.learning_rate = learning_rate

        if infer_sample:
            self.batch_size = 1
            self.training_seq_len = 1
        else:
            self.batch_size = batch_size
            self.training_seq_len = training_seq_len

        self.lstm_cell = tf.contrib.rnn.BasicLSTMCell(self.rnn_size)
        self.initial_state = self.lstm_cell.zero_state(self.batch_size, tf.float32)

        self.x_data = tf.placeholder(tf.int32, [self.batch_size, self.training_seq_len])
        self.y_output = tf.placeholder(tf.int32, [self.batch_size, self.training_seq_len])

        with tf.variable_scope('lstm_vars'):
            # Softmax output weights
            W = tf.get_variable('W', [self.rnn_size, self.vocab_size], tf.float32, tf.random_normal_initializer())
            b = tf.get_variable('b', [self.vocab_size], tf.float32, tf.constant_initializer(0.0))

            # Define embedding
            embedding_mat = tf.get_variable('embedding_mat', [self.vocab_size, self.embedding_size],
                                            tf.float32, tf.random_normal_initializer())
            embedding_output = tf.nn.embedding_lookup(embedding_mat, self.x_data)
            rnn_inputs = tf.split(axis=1, num_or_size_splits=self.training_seq_len, value=embedding_output)
            rnn_inputs_trimmed = [tf.squeeze(x, [1]) for x in rnn_inputs]

        # If we are inferring (generating text), we add a 'loop' function:
        # define how to get the (i+1)-th input from the i-th output
        def inferred_loop(prev):
            # Apply hidden layer
            prev_transformed = tf.matmul(prev, W) + b
            # Get the index of the output (also don't run the gradient)
            prev_symbol = tf.stop_gradient(tf.argmax(prev_transformed, 1))  # *2
            # Get embedded vector
            out = tf.nn.embedding_lookup(embedding_mat, prev_symbol)
            return out

        decoder = tf.contrib.legacy_seq2seq.rnn_decoder
        outputs, last_state = decoder(rnn_inputs_trimmed,
                                      self.initial_state,
                                      self.lstm_cell,
                                      loop_function=inferred_loop if infer_sample else None)
        # Non-inferred outputs
        output = tf.reshape(tf.concat(axis=1, values=outputs), [-1, self.rnn_size])
        # Logits and output
        self.logit_output = tf.matmul(output, W) + b
        self.model_output = tf.nn.softmax(self.logit_output)

        loss_fun = tf.contrib.legacy_seq2seq.sequence_loss_by_example
        loss = loss_fun([self.logit_output], [tf.reshape(self.y_output, [-1])],
                        [tf.ones([self.batch_size * self.training_seq_len])])
        self.cost = tf.reduce_sum(loss) / (self.batch_size * self.training_seq_len)
        self.final_state = last_state
        gradients, _ = tf.clip_by_global_norm(tf.gradients(self.cost, tf.trainable_variables()), 4.5)
        optimizer = tf.train.AdamOptimizer(self.learning_rate)
        self.train_op = optimizer.apply_gradients(zip(gradients, tf.trainable_variables()))

    def sample(self, sess, words=ix2vocab, vocab=vocab2ix, num=10, prime_text='thou art'):
        state = sess.run(self.lstm_cell.zero_state(1, tf.float32))
        word_list = prime_text.split()
        # Feed the prime words (except the last one) to warm up the state
        for word in word_list[:-1]:
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed_dict = {self.x_data: x, self.initial_state: state}
            [state] = sess.run([self.final_state], feed_dict=feed_dict)

        out_sentence = prime_text
        word = word_list[-1]
        for n in range(num):
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed_dict = {self.x_data: x, self.initial_state: state}
            [model_output, state] = sess.run([self.model_output, self.final_state], feed_dict=feed_dict)
            sample = np.argmax(model_output[0])
            if sample == 0:
                break
            word = words[sample]
            out_sentence = out_sentence + ' ' + word
        return out_sentence
```
In order to use the same model (with the same trained variables), we need to share the variable scope between the trained model and the test model.
```python
# Define LSTM Model
lstm_model = LSTM_Model(embedding_size, rnn_size, batch_size, learning_rate,
                        training_seq_len, vocab_size)

# Tell TensorFlow we are reusing the scope for the testing
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    test_lstm_model = LSTM_Model(embedding_size, rnn_size, batch_size, learning_rate,
                                 training_seq_len, vocab_size, infer_sample=True)
```
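To see what `reuse=True` buys us: inside a reused scope, tf.get_variable hands back the variables that were already created rather than making new ones, so the test model literally shares the trained weights. A minimal standalone sketch (my own illustration, assuming the same TensorFlow 1.x API; the scope name is made up):

```python
# Hypothetical demo of variable-scope reuse in TF 1.x
with tf.variable_scope('demo_scope'):
    v1 = tf.get_variable('v', shape=[3])
with tf.variable_scope('demo_scope', reuse=True):
    v2 = tf.get_variable('v', shape=[3])
print(v1 is v2)  # True -- both handles point to the same underlying variable
```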
We need to save the model, so we create a model saving operation.

```python
# Create model saver
saver = tf.train.Saver(tf.global_variables())
```
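For completeness, a checkpoint written by this saver could later be restored roughly as follows (a hypothetical sketch, not part of the original recipe; it assumes training has already produced a checkpoint and the vocab.pkl file written below):

```python
# Hypothetical restore sketch: reload the latest checkpoint and the pickled vocabulary
ckpt = tf.train.latest_checkpoint(full_model_dir)
if ckpt is not None:
    saver.restore(sess, ckpt)
    with open(os.path.join(full_model_dir, 'vocab.pkl'), 'rb') as dict_conn:
        vocab2ix, ix2vocab = pickle.load(dict_conn)
```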
Let's calculate how many batches are needed for each epoch and split up the data accordingly.

```python
# Create batches for each epoch
num_batches = int(len(s_text_ix)/(batch_size * training_seq_len)) + 1
# Split up text indices into subarrays of (near) equal size
batches = np.array_split(s_text_ix, num_batches)
# Reshape each split into [batch_size, training_seq_len]
batches = [np.resize(x, [batch_size, training_seq_len]) for x in batches]

# Initialize all variables
init = tf.global_variables_initializer()
sess.run(init)
```
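Before training, note how the target sequences will be formed in the loop below: each target row is the input row shifted left by one word with np.roll, so the model learns to predict the next word. A toy illustration (my own, not part of the recipe):

```python
# np.roll(..., -1, axis=1) shifts each row left by one position (the first element wraps to the end)
toy_batch = np.array([[1, 2, 3, 4, 5]])
print(np.roll(toy_batch, -1, axis=1))  # [[2 3 4 5 1]]
```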
Now we train the model.

```python
# Train model
train_loss = []
iteration_count = 1
for epoch in range(epochs):
    # Shuffle word indices
    random.shuffle(batches)
    # Create targets from shuffled batches
    targets = [np.roll(x, -1, axis=1) for x in batches]
    # Run through one epoch
    print('Starting Epoch #{} of {}.'.format(epoch + 1, epochs))
    # Reset initial LSTM state every epoch
    state = sess.run(lstm_model.initial_state)
    for ix, batch in enumerate(batches):
        training_dict = {lstm_model.x_data: batch, lstm_model.y_output: targets[ix]}
        c, h = lstm_model.initial_state
        training_dict[c] = state.c
        training_dict[h] = state.h

        temp_loss, state, _ = sess.run([lstm_model.cost, lstm_model.final_state, lstm_model.train_op],
                                       feed_dict=training_dict)
        train_loss.append(temp_loss)

        # Print status every 10 gens
        if iteration_count % 10 == 0:
            summary_nums = (iteration_count, epoch + 1, ix + 1, num_batches + 1, temp_loss)
            print('Iteration: {}, Epoch: {}, Batch: {} out of {}, Loss: {:.2f}'.format(*summary_nums))

        # Save the model and the vocab
        if iteration_count % save_every == 0:
            # Save model
            model_file_name = os.path.join(full_model_dir, 'model')
            saver.save(sess, model_file_name, global_step=iteration_count)
            print('Model Saved To: {}'.format(model_file_name))
            # Save vocabulary
            dictionary_file = os.path.join(full_model_dir, 'vocab.pkl')
            with open(dictionary_file, 'wb') as dict_file_conn:
                pickle.dump([vocab2ix, ix2vocab], dict_file_conn)

        if iteration_count % eval_every == 0:
            for sample in prime_texts:
                print(test_lstm_model.sample(sess, ix2vocab, vocab2ix, num=10, prime_text=sample))

        iteration_count += 1

# Plot loss over time
plt.plot(train_loss, 'k-')
plt.title('Sequence to Sequence Loss')
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.show()
```
Output:

```
Starting Epoch #1 of 10.
Iteration: 10, Epoch: 1, Batch: 10 out of 182, Loss: 9.90
```

Resulting plot:

Notes:

*1: __init__()

The most important concepts in object-oriented programming are the class and the instance. A class is an abstract template, for example a Student class, while an instance is a concrete "object" created from that class; every instance has the same methods but may hold different data.

Taking the Student class as an example, in Python a class is defined with the class keyword:

```python
class Student(object):
    pass
```

The class keyword is followed by the class name, Student (class names conventionally start with a capital letter), and then by (object), which indicates the class it inherits from. Inheritance is covered later; if there is no suitable base class, use object, the class from which all classes ultimately inherit.

Once the Student class is defined, instances of Student can be created by calling the class name with parentheses:

```
>>> bart = Student()
>>> bart
<__main__.Student object at 0x10a67a590>
>>> Student
<class '__main__.Student'>
```

As you can see, the variable bart points to an instance of Student; 0x10a67a590 is its memory address (every object has a different address), while Student itself is a class.

You can freely bind attributes to an instance; for example, bind a name attribute to the instance bart:

```
>>> bart.name = 'Bart Simpson'
>>> bart.name
'Bart Simpson'
```

Since a class acts as a template, we can require certain attributes to be filled in when an instance is created. By defining the special __init__ method, attributes such as name and score are bound at creation time:

```python
class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score
```

Note: the special method __init__ has two underscores both before and after the name!

The first parameter of __init__ is always self, which refers to the instance being created; inside __init__ you can therefore bind attributes to self, because self points to the instance itself.

With __init__ defined, an instance can no longer be created with empty arguments; the arguments must match __init__, except for self, which the Python interpreter passes in automatically:

```
>>> bart = Student('Bart Simpson', 59)
>>> bart.name
'Bart Simpson'
>>> bart.score
59
```

Compared with an ordinary function, a function defined inside a class differs in only one respect: its first parameter is always the instance variable self, and it is not passed when calling. Otherwise class methods are no different from ordinary functions, so you can still use default, variable, keyword, and named keyword arguments.

[https://www.liaoxuefeng.com/wiki/0014316089557264a6b348958f449949df42a6d3a2e542c000/001431864715651c99511036d884cf1b399e65ae0d27f7e000]

*2: tf.stop_gradient

tf.stop_gradient(input, name=None)

Stops gradient computation.

When executed in a graph, this op outputs its input tensor as-is.

When building ops to compute gradients, this op prevents the contribution of its inputs from being taken into account. Normally, the gradient generator adds ops to the graph to compute the derivative of a specified 'loss' by recursively finding the inputs that contributed to its computation. If this op is inserted in the graph, its inputs are masked from the gradient generator and are not considered when computing gradients.

This is useful whenever you want to compute a value with TensorFlow but need to pretend that the value is a constant. Some examples include:

- The EM algorithm, where the M-step should not involve backpropagation through the output of the E-step.
- Contrastive divergence training of Boltzmann machines, where differentiating the energy function must not backpropagate through the graph that generated the samples from the model.
- Adversarial training, where no backprop should happen through the adversarial example generation process.

Args:

- input: a Tensor.
- name: a name for the operation (optional).

Returns:

A Tensor of the same type as input.
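A minimal standalone sketch of the effect (my own illustration, assuming the TensorFlow 1.x API used throughout this post): the path wrapped in tf.stop_gradient is treated as a constant when gradients are computed.

```python
import tensorflow as tf

x = tf.constant(3.0)
y_full = x * x                         # dy/dx = 2x = 6
y_blocked = tf.stop_gradient(x) * x    # stop_gradient(x) is treated as a constant, so the gradient is 3
grad_full = tf.gradients(y_full, x)[0]
grad_blocked = tf.gradients(y_blocked, x)[0]

with tf.Session() as demo_sess:
    print(demo_sess.run([grad_full, grad_blocked]))  # roughly [6.0, 3.0]
```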


