参考链接:https://blog.csdn.net/huacha__/article/details/80652384

Keras建立RNN模型进行IMDb情感分析的Python代码

IMDb数据集共有50000项“影评文字”,分为训练数据与测试数据各25000项,每一项“影评文字”都被标记为“正面评价”或“负面评价”。我们希望能建立一个模型,经过大量“影评文字”训练后,此模型可以用于预测“影评文字”是“正面评价”或“负面评价”。

#####数据准备
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.preprocessing.text import Tokenizerimport re
re_tag = re.compile(r'<[^>]+>')def rm_tags(text):return re_tag.sub('', text)import os
def read_files(filetype):path = "data/aclImdb/"file_list=[]positive_path=path + filetype+"/pos/"for f in os.listdir(positive_path):file_list+=[positive_path+f]negative_path=path + filetype+"/neg/"for f in os.listdir(negative_path):file_list+=[negative_path+f]print('read',filetype, 'files:',len(file_list))all_labels = ([1] * 12500 + [0] * 12500) all_texts  = []for fi in file_list:with open(fi,encoding='utf8') as file_input:all_texts += [rm_tags(" ".join(file_input.readlines()))]return all_labels,all_textsy_train,train_text=read_files("train")
y_test,test_text=read_files("test")
token = Tokenizer(num_words=3800)
token.fit_on_texts(train_text)
x_train_seq = token.texts_to_sequences(train_text)
x_test_seq  = token.texts_to_sequences(test_text)
x_train = sequence.pad_sequences(x_train_seq, maxlen=380)
x_test  = sequence.pad_sequences(x_test_seq,  maxlen=380)#####建立模型
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import SimpleRNNmodel = Sequential()model.add(Embedding(output_dim=32,input_dim=3800, input_length=380))
model.add(Dropout(0.35))model.add(SimpleRNN(units=16))model.add(Dense(units=256,activation='relu' ))model.add(Dropout(0.35))model.add(Dense(units=1,activation='sigmoid' ))model.summary()#####训练模型
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])train_history =model.fit(x_train, y_train,batch_size=100, epochs=10,verbose=2,validation_split=0.2)import matplotlib.pyplot as plt
def show_train_history(train_history,train,validation):plt.plot(train_history.history[train])plt.plot(train_history.history[validation])plt.title('Train History')plt.ylabel(train)plt.xlabel('Epoch')plt.legend(['train', 'validation'], loc='upper left')plt.show()show_train_history(train_history,'acc','val_acc')
show_train_history(train_history,'loss','val_loss')#####评估模型的准确率
scores = model.evaluate(x_test, y_test, verbose=1)
scores[1]#####预测概率
probility=model.predict(x_test)
probility[:10]for p in probility[12500:12510]:print(p)#####预测结果
predict=model.predict_classes(x_test)predict[:10]
predict.shape
predict_classes=predict.reshape(25000)
predict_classes#####查看预测结果
SentimentDict={1:'正面的',0:'负面的'}
def display_test_Sentiment(i):print(test_text[i])print('label真实值:',SentimentDict[y_test[i]],'预测结果:',SentimentDict[predict_classes[i]])display_test_Sentiment(2)
'''
注:以下是程序输出(不包括此句)
As a recreational golfer with some knowledge of the sport's history, I was pleased with Disney's sensitivity to the issues of class in golf in the early twentieth century. The movie depicted well the psychological battles that Harry Vardon fought within himself, from his childhood trauma of being evicted to his own inability to break that glass ceiling that prevents him from being accepted as an equal in English golf society. Likewise, the young Ouimet goes through his own class struggles, being a mere caddie in the eyes of the upper crust Americans who scoff at his attempts to rise above his standing. What I loved best, however, is how this theme of class is manifested in the characters of Ouimet's parents. His father is a working-class drone who sees the value of hard work but is intimidated by the upper class; his mother, however, recognizes her son's talent and desire and encourages him to pursue his dream of competing against those who think he is inferior.Finally, the golf scenes are well photographed. Although the course used in the movie was not the actual site of the historical tournament, the little liberties taken by Disney do not detract from the beauty of the film. There's one little Disney moment at the pool table; otherwise, the viewer does not really think Disney. The ending, as in "Miracle," is not some Disney creation, but one that only human history could have written.
label真实值: 正面的 预测结果: 正面的
'''display_test_Sentiment(3)
'''
注:以下是程序输出(不包括此句)
I saw this film in a sneak preview, and it is delightful. The cinematography is unusually creative, the acting is good, and the story is fabulous. If this movie does not do well, it won't be because it doesn't deserve to. Before this film, I didn't realize how charming Shia Lebouf could be. He does a marvelous, self-contained, job as the lead. There's something incredibly sweet about him, and it makes the movie even better. The other actors do a good job as well, and the film contains moments of really high suspense, more than one might expect from a movie about golf. Sports movies are a dime a dozen, but this one stands out. This is one I'd recommend to anyone.
label真实值: 正面的 预测结果: 正面的
'''
predict_classes[12500:12510]
'''
注:以下是程序输出(不包括此句)
array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
'''
display_test_Sentiment(12502)
'''
注:以下是程序输出(不包括此句)
First of all I hate those moronic rappers, who could'nt act if they had a gun pressed against their foreheads. All they do is curse and shoot each other and acting like cliché'e version of gangsters.The movie doesn't take more than five minutes to explain what is going on before we're already at the warehouse There is not a single sympathetic character in this movie, except for the homeless guy, who is also the only one with half a brain.Bill Paxton and William Sadler are both hill billies and Sadlers character is just as much a villain as the gangsters. I did'nt like him right from the start.The movie is filled with pointless violence and Walter Hills specialty: people falling through windows with glass flying everywhere. There is pretty much no plot and it is a big problem when you root for no-one. Everybody dies, except from Paxton and the homeless guy and everybody get what they deserve.The only two black people that can act is the homeless guy and the junkie but they're actors by profession, not annoying ugly brain dead rappers.Stay away from this crap and watch 48 hours 1 and 2 instead. At lest they have characters you care about, a sense of humor and nothing but real actors in the cast.
label真实值: 负面的 预测结果: 负面的
'''#预测新的影评
input_text='''
I can't vote because I have not watched this movie yet. I've been wanting to watch this movie since the time they announced making it which is about 2 years ago (!)
I was planning to go with the family to see the anticipated movie but my nieces had school exams at the opening time so we all decided to wait for the next weekend. I was utterly shocked to learn yesterday that they pulled the movie from the Kuwaiti theaters "temporarily" so that the outrageous censorship system can remove some unwanted scenes.
The controversial gay "moment" according to my online research is barely there, so I can't find any logical reason for all the fuss that's been going on. And it was bad enough when fanatics and haters tried (in vain) to kill the movie with low ratings and negative reviews even before it was in the cinemas and I'm pretty sure most of those trolls never got the chance to watch the movie at that time.
Based on the trailers, I think the movie is very promising and entertaining and you can't simply overlook the tremendous efforts made to bring this beloved tale to life. To knock down hundreds of people's obvious hard work with unprofessional critique and negative reviews just for the sake of hatred is unfathomable. I hope people won't judge a movie before having the experience of watching it in the first place.
Impatiently waiting for the Kuwaiti cinemas to bring back the movie...
'''
input_seq = token.texts_to_sequences([input_text])
pad_input_seq  = sequence.pad_sequences(input_seq , maxlen=380)
predict_result=model.predict_classes(pad_input_seq)
SentimentDict[predict_result[0][0]]
'''
'负面的'
'''def predict_review(input_text):input_seq = token.texts_to_sequences([input_text])pad_input_seq  = sequence.pad_sequences(input_seq , maxlen=380)predict_result=model.predict_classes(pad_input_seq)print(SentimentDict[predict_result[0][0]])predict_review('''
As a fan of the original Disney film (Personally I feel it's their masterpiece) I was taken aback to the fact that a new version was in the making. Still excited I had high hopes for the film. Most of was shattered in the first 10 minutes. Campy acting with badly performed singing starts off a long journey holding hands with some of the worst CGI Hollywood have managed to but to screen in ages.
A film that is over 50% GCI, should focus on making that part believable, unfortunately for this film, it's far from that. It looks like the original film was ripped apart frame by frame and the beautiful hand-painted drawings have been replaced with digital caricatures. Besides CGI that is bad, it's mostly creepy. As the little teacup boy will give me nightmares for several nights to come. Emma Watson plays the same character as she always does, with very little acting effort and very little conviction as Belle. Although I can see why she was cast in the film based on merits, she is far from the right choice for the role. Dan Stevens does alright under as some motion captured dead-eyed Beast, but his performance feels flat as well. Luke Evans makes for a great pompous Gaston, but a character that has little depth doesn't really make for a great viewing experience. Josh Gad is a great comic relief just like the original movie's LeFou. Other than that, none of the cast stands out enough for me to remember them. Human or CHI creature. I was just bored through out the whole experience. And for a project costing $160 000 000, I can see why the PR department is pushing it so hard because they really need to get some cash back on this pile of wet stinky CGI-fur!
All and all, I might be bias from really loving Disney's first adaptation. That for me marks the high-point of all their work, perfectly combining the skills of their animators along with some CGI in a majestic blend. This film however is more like the bucket you wash off your paintbrush in, it has all the same colors, but muddled with water and to thin to make a captivating story from. The film is quite frankly not worth your time, you would be better off watching the original one more time.
''')
'''
注:以下是程序输出(不包括此句)
1/1 [==============================] - 0s
正面的
'''predict_review('''
The original Beauty and the Beast was my favorite cartoon as a kid but it did have major plot holes. Why had no one else ever seen the castle or knew where it was? Didn't anyone miss the people who were cursed? All of that gets an explanation when the enchantress places her curse in the beginning. Why did Belle and her Father move to a small town? Her mother died and the father thought it as best to leave. I love the new songs and added lyrics to the originals. I like the way the cgi beast looks (just the face is CGi). I think Emma Watson is a perfect Belle who is outspoken, fearless, and different. The set design is perfect for the era in France.
I know a lot of people disagree but I found this remake with all its changes to be more enchanting, beautiful, and complete than the original 1991 movie. To each his own but I think everyone should see it for themselves.
''')
'''
注:以下是程序输出(不包括此句)
1/1 [==============================] - 0s
正面的
'''

深度学习之RNN循环神经网络(理论+图解+Python代码部分)相关推荐

  1. July深度学习之RNN循环神经网络

    RNN循环神经网络 一.简介 首先,为什么有BP神经网络和CNN,还要提出RNN? 因为传统的神经网络,包括CNN,它的输入和输出是互相独立的.但有些时候,后续的输出和前面的内容是相关的.比如,我是中 ...

  2. 深度学习之RNN(循环神经网络)

    一 RNN概述 前面我们叙述了BP算法, CNN算法, 那么为什么还会有RNN呢?? 什么是RNN, 它到底有什么不同之处? RNN的主要应用领域有哪些呢?这些都是要讨论的问题. 1) BP算法,CN ...

  3. 深度学习 实验七 循环神经网络

    文章目录 深度学习 实验七 循环神经网络 一.问题描述 二.设计简要描述 三.程序清单 深度学习 实验七 循环神经网络 一.问题描述 之前见过的所以神经网络(比如全连接网络和卷积神经网络)都有一个主要 ...

  4. 《深度学习》之 循环神经网络 原理 超详解

    循环神经网络 一.研究背景 1933年,西班牙神经生物学家Rafael Lorente de Nó发现大脑皮层(cerebral cortex)的解剖结构允许刺激在神经回路中循环传递,并由此提出反响回 ...

  5. 「NLP」 深度学习NLP开篇-循环神经网络(RNN)

    https://www.toutiao.com/a6714260714988503564/ 从这篇文章开始,有三AI-NLP专栏就要进入深度学习了.本文会介绍自然语言处理早期标志性的特征提取工具-循环 ...

  6. 【NLP】 深度学习NLP开篇-循环神经网络(RNN)

    从这篇文章开始,有三AI-NLP专栏就要进入深度学习了.本文会介绍自然语言处理早期标志性的特征提取工具-循环神经网络(RNN).首先,会介绍RNN提出的由来:然后,详细介绍RNN的模型结构,前向传播和 ...

  7. [人工智能-深度学习-48]:循环神经网络 - RNN是循环神经网络还是递归神经网络?

    作者主页(文火冰糖的硅基工坊):文火冰糖(王文兵)的博客_文火冰糖的硅基工坊_CSDN博客 本文网址:https://blog.csdn.net/HiWangWenBing/article/detai ...

  8. 深度学习TensorFlow2,循环神经网络(RNN,LSTM)系列知识

    一:概述 二:时间序列 三:RNN 四:LSTM 一:概述 1.什么叫循环? 循环神经网络是一种不同于ResNet,VGG的网络结构,个人理解最大的特点就是:它通过权值共享,极大的减少了权值的参数量. ...

  9. 《动手学深度学习》task3_3 循环神经网络进阶

    目录 GRU GRU 重置门和更新门 候选隐藏状态 隐藏状态 GRU的实现 载入数据集 初始化参数 GRU模型 训练模型 简洁实现 LSTM 长短期记忆 输入门.遗忘门和输出门 候选记忆细胞 记忆细胞 ...

  10. 深度学习-Tensorflow2.2-RNN循环神经网络{11}-评论分类-25

    什么是RNN? 代码 import tensorflow as tf import matplotlib.pyplot as plt %matplotlib inline import numpy a ...

最新文章

  1. 如何在两个目录中删除其中一个目录中同名文件
  2. python删除中文停用词_python词云 wordcloud+jieba生成中文词云图
  3. centos 6.5安装mysql5.7,centos6.5安装mysql5.7
  4. LINUX内核之内存屏障
  5. rxjs fromEvent的用法
  6. window下遍历并修改文件
  7. linux运维服务常见故障,linux常见故障处理
  8. css怎麽将按钮变长,CSS3 按钮边线环绕渐变缩短和伸长
  9. MFC中控件的大小和位置自定义代码
  10. Android应用程序开发以及背后的设计思想深度剖析(3)
  11. 如何编辑pdf文件内容
  12. 【清华大学陈渝】第二章 启动、中断、异常和系统调用
  13. 软件破解逆向安全(十二)内存特征码
  14. 砸蛋程序php,基于JQuery+PHP编写砸金蛋中奖程序,jquery中奖_PHP教程
  15. atomic头文件编译_c++11 多线程(3)atomic 总结
  16. 智能快递柜 软件架构 linux,13.智能快递柜(对接流程)
  17. JAVA游戏土行孙_《封神榜》土行孙,被誉为国内最知名矮星,现惨淡靠低保度日...
  18. 路由器更换wan口及vlan配置
  19. 职场人士升职加薪必备的工作软件,总有一款适合你
  20. 学生学籍管理系统_学生登陆系统查询与修改信息

热门文章

  1. Atitit. c# 语法新特性 c#2.0 3.0 4.0 4.5 5.0 6.0 attilax总结 1. 版本历史 1 1.1. C# 1.0-纯粹的面向对象 2 1.2. C# 2.0
  2. paip.数组以及集合的操作uapi java php python总结..
  3. paip.提高用户体验----增添开始菜单类似360小助手按钮总结
  4. 提升用户体验---自动邮编提示与验证地址
  5. paip.url参数格式化.txt
  6. 快速开发字段很多的MIS表
  7. XX项目技术架构模板
  8. 英方软件:以“数据复制”为起点来赋能行业
  9. Rust: Rust Language Cheat Sheet,强烈推荐!
  10. 黑石集团(Black Stone)黑岩公司(Black Rock)