Sentiment Analysis of Douban Movie Reviews in Python

Results first (resources on Lanzou Cloud: https://zyjblogs.lanzous.com/iGjjfe2jyaj):

1. Word cloud

2. Pie chart of star ratings

3. Brief summary (positive rate, best review, worst review)

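Best review: Really good! The story is told with flashbacks; if you say you can't follow it, just keep watching! Ren Jialun's acting is impressive: Na Lanyue and Lin Jing are easy to tell apart, and he plays cute without any awkwardness, which is great! Zhang Huiwen is quite cute too~ The plot has a lot of flashbacks and foreshadowing, and as an original script I'm personally very satisfied! Everyone has their own little schemes and secrets that the audience has to work out bit by bit~ There are no absolute villains; I don't even dislike Mingzun much, maybe because the acting is so good that Lin Yuan won me over a little! First, praise for the on-location shooting! It's truly beautiful; it's been a long time since I've seen a wuxia drama shot almost entirely on location!!! Even the scene of poking a hornet's nest was filmed for real. So happy to see such a conscientious drama!! The plot doesn't drag, isn't padded, and adds no filler scenes; the characters are distinct, each has selfish motives yet each shows a pitiable side; many of the romantic threads you see are mutual exploitation, so intense! The male and female leads have a feud, which looks formulaic, but the feud doesn't actually affect their relationship; Lin Ruohan also supports Lin Jing in pursuing true love and doesn't want the previous generation's feud passed down to the next. She is a very good mother.
Worst review: Garbage film, not good at all
Positive rate: 68.4%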
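The "positive rate" above is the average of the per-review sentiment scores, not the share of reviews that are positive. The sketch below shows both figures side by side so the difference is visible. It assumes each review already has a score between 0 and 1 (as SnowNLP's `sentiments` produces); the function name `summarize_scores`, the 0.5 threshold, and the sample data are hypothetical and only for illustration.

```python
def summarize_scores(scored_reviews, threshold=0.5):
    """Summarize (text, score) pairs, where score is a sentiment in [0, 1].

    `threshold` is an assumed cut-off for calling a review positive.
    """
    if not scored_reviews:
        raise ValueError("no reviews to summarize")
    best = max(scored_reviews, key=lambda pair: pair[1])
    worst = min(scored_reviews, key=lambda pair: pair[1])
    scores = [score for _, score in scored_reviews]
    mean_score = sum(scores) / len(scores)
    positive_share = sum(score > threshold for score in scores) / len(scores)
    return {
        "best": best[0],
        "worst": worst[0],
        # What the script reports as the positive rate
        "mean_score_pct": round(mean_score * 100, 1),
        # Share of reviews scoring above the threshold
        "positive_share_pct": round(positive_share * 100, 1),
    }


# Made-up scores, only to exercise the function
sample = [("great film", 0.9), ("it was fine", 0.6), ("terrible", 0.1), ("not good", 0.3)]
summary = summarize_scores(sample)
print(summary)
```

The two percentages can differ noticeably, so it is worth stating which one a report shows.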

II. Code

import csv
import traceback

import jieba
import matplotlib.pyplot as plt
import numpy as np
import requests
import snownlp
from bs4 import BeautifulSoup
from matplotlib.font_manager import FontProperties
from PIL import Image
from wordcloud import WordCloud
# Request headers sent with every review-page request
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36",
    "Cookie": 'bid=HPwx786ji5w; douban-fav-remind=1; viewed="22601258"; gr_user_id=954bdfba-9778-4359-b238-cd539123a160; _vwo_uuid_v2=D7DD1B1011AD0B9B5B3332525CEEF25CF|9b95f719e9255f99462f09e1248197a2; __utmz=223695111.1592467854.1.1.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; ll="118254"; __utmc=30149280; __utmc=223695111; ap_v=0,6.0; __utma=30149280.349582947.1592100671.1592986456.1592989460.7; __utmz=30149280.1592989460.7.5.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; _pk_ref.100001.4cf6=%5B%22%22%2C%22%22%2C1592989707%2C%22https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3D_7ZcG2FOcVjzEmAtnon0r-2-zpQeowzBEOKVYuJSrfmg_SLF6-lCeZXNH6BtW6ig%26wd%3D%26eqid%3Da740e3ec00010995000000065eeb1ef0%22%5D; _pk_ses.100001.4cf6=*; __utma=223695111.1175335955.1592467854.1592986456.1592989707.4; __utmb=223695111.0.10.1592989707; __utmb=30149280.5.10.1592989460; _pk_id.100001.4cf6=263156288f9d7135.1592467854.4.1592992247.1592986526.',
}
# Number of reviews per star level (index 0 = 1 star, index 4 = 5 stars)
rating = [0, 0, 0, 0, 0]
# Douban's title text for each star level, from 1 star to 5 stars
star_List = ['很差', '较差', '还行', '推荐', '力荐']


def getCommentByPage(url, commentList):
    """Fetch one page of short reviews and append [author, stars, text] rows."""
    response = requests.get(url, headers=header)
    if response.status_code != 200:
        return commentList
    bs = BeautifulSoup(response.content, "html5lib")
    try:
        for commentItem in bs.select(".comment-item"):
            comment = commentItem.select_one(".comment")
            commentInfo = comment.select_one(".comment-info")
            # Reviewer name
            auther = commentInfo.select_one("a").text
            # Star rating; reviews without a rating are skipped
            star = commentInfo.select_one(".rating")
            if star is None:
                continue
            title = star.get('title')
            if title in star_List:
                rating[star_List.index(title)] += 1
            commentContent = comment.select_one(".short").text.replace("\n", "")
            commentList.append([auther, title, commentContent])
    except Exception:
        # Print the exception and keep whatever was collected so far
        traceback.print_exc()
    return commentList


def readData():
    """Return the review text (third column) of every row in the CSV file."""
    with open(f"{name}.csv", 'r', newline="", encoding="utf-8") as file:
        return [item[2] for item in csv.reader(file)]


def generateWordCloud():
    commentList = readData()
    if not commentList:
        print("No reviews were collected")
        return
    # Load the stop-word list, one word per line
    with open('cn_stopwords.txt', encoding="utf-8") as file:
        stop_words = {w.strip() for w in file}
    total_score = 0
    best_score, best_comment = -1.0, ""
    worst_score, worst_comment = 2.0, ""
    for comment in commentList:
        # Score each review between 0 (negative) and 1 (positive)
        score = snownlp.SnowNLP(comment).sentiments
        total_score += score
        if score > best_score:
            best_score, best_comment = score, comment
        if score < worst_score:
            worst_score, worst_comment = score, comment
    # The "positive rate" is the mean sentiment score over all reviews
    with open(name + ".txt", "w", encoding="utf-8") as f:
        f.write(name + "\n")
        f.write("Best review: " + best_comment + "\n")
        f.write("Worst review: " + worst_comment + "\n")
        f.write("Positive rate: " + str(round(total_score / len(commentList) * 100, 1)) + "%\n")
    # Segment the text with jieba and drop stop words and whitespace tokens
    words = [w for w in jieba.cut(" ".join(commentList)) if w.strip() and w not in stop_words]
    finalComment = " ".join(words)
    # Use 1.png as the outline (mask) of the word cloud
    image = np.array(Image.open("1.png"))
    # font_path: font file; background_color: background color; mask: outline image
    wordCloud = WordCloud(font_path="YaHeiMonacoHybrid.ttf", background_color="white", mask=image).generate(finalComment)
    # Save the word cloud locally
    wordCloud.to_file(f"{name}.png")


def generatePie():
    total = sum(rating)
    if total == 0:
        print("No star ratings were collected")
        return
    shares = [count / total for count in rating]
    font = FontProperties(fname='YaHeiMonacoHybrid.ttf', size=16)
    plt.pie(x=shares,
            labels=['1', '2', '3', '4', '5'],
            colors=['red', 'pink', 'blue', 'purple', 'orange'],
            startangle=90,
            shadow=True,
            autopct='%1.1f%%')  # the digit 1, not the letter l
    plt.title('Rating analysis', fontproperties=font)
    plt.savefig(name + "_pie.jpg")
    plt.show()


if __name__ == '__main__':
    commentList = []
    url = input('Enter the id of the movie to analyze (e.g. 30425206 in https://movie.douban.com/subject/30425206): ')
    name = ""
    for i in range(10):
        baseUrl = f"https://movie.douban.com/subject/{url}/comments?start={i * 20}"
        # The page title is used as the base name of all output files
        if not name:
            response = requests.get(baseUrl, headers=header)
            if response.status_code == 200:
                bs = BeautifulSoup(response.content, "html5lib")
                name = bs.title.text.strip()
        commentList = getCommentByPage(baseUrl, commentList)
    with open(f"{name}.csv", 'w', newline="", encoding="utf-8") as file:
        csv.writer(file).writerows(commentList)
    generateWordCloud()
    generatePie()
    print("Analysis complete")
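
Two pieces of the script can be tried without any network access: the paging of the review URL (20 reviews per page through the `start` parameter) and the conversion of star labels into pie-chart shares. The sketch below isolates them. The helper names `comment_page_urls` and `rating_shares` are hypothetical, and the sample labels are made up.

```python
from collections import Counter

# Douban's label for each star level, 1 star to 5 stars, as used in the script
STAR_LABELS = ['很差', '较差', '还行', '推荐', '力荐']


def comment_page_urls(movie_id, pages=10, page_size=20):
    """Build the review-page URLs the script requests, one per page."""
    base = f"https://movie.douban.com/subject/{movie_id}/comments"
    return [f"{base}?start={page * page_size}" for page in range(pages)]


def rating_shares(labels):
    """Turn a list of star labels into the fraction of reviews per star level."""
    counts = Counter(label for label in labels if label in STAR_LABELS)
    total = sum(counts.values())
    if total == 0:
        # No rated reviews: avoid dividing by zero
        return [0.0] * len(STAR_LABELS)
    return [counts[label] / total for label in STAR_LABELS]


urls = comment_page_urls("30425206", pages=3)
shares = rating_shares(['力荐', '力荐', '推荐', '很差', None])
print(urls)
print(shares)
```

Returning zeros when nothing was rated keeps a later division from failing, but a caller would still need to skip drawing the pie in that case.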

III. Stop-word dictionary and other resources

Lanzou Cloud: https://zyjblogs.lanzous.com/iGjjfe2jyaj
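
Stop words only have an effect when they are compared against individual words, not against whole reviews. A minimal sketch of that filtering step follows. It assumes the stop-word file holds one word per line; the helper names are hypothetical, and the token list is hard-coded here in place of the output of `jieba.cut` so the example runs without extra libraries.

```python
def load_stop_words(lines):
    """Build a set of stop words from lines of text, one word per line (assumed format)."""
    return {line.strip() for line in lines if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Keep tokens that are neither whitespace nor stop words."""
    return [token for token in tokens if token.strip() and token not in stop_words]


# In the script these lines would come from cn_stopwords.txt
stop_words = load_stop_words(["的\n", "了\n", "\n", "很\n"])
# Stand-in for the tokens a segmenter would produce
tokens = ["剧情", "很", "好看", "的", " ", "演技", "了"]
filtered = remove_stop_words(tokens, stop_words)
print(" ".join(filtered))
```

The space-joined result is the kind of string that can be passed to the word cloud's `generate` method.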
