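I recently watched Terminator: Dark Fate, on top of a few other films before it, and suddenly found myself hooked on movies. I wanted to know which films are coming out soon and whether any of them would interest me. A quick search showed that Douban has a dedicated section listing upcoming releases.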

The page is at https://movie.douban.com/coming.

Incidentally, Douban also publishes a 2018 annual film ranking.

Here is the implementation:

#!/usr/bin/env python
# encoding: utf-8
'''
__Author__: 沂水寒城
Purpose: scrape the list of upcoming films from Douban with Python (Python 2)
'''
from __future__ import division

import sys
import urllib

from lxml import etree

# Python 2 only: make UTF-8 the default codec for implicit str/unicode conversion.
# sys.version_info is a tuple, so compare its major component, not the whole tuple.
if sys.version_info[0] == 2:
    reload(sys)
    sys.setdefaultencoding("utf-8")


def doubanComingMovieSpider():
    '''
    Fetch Douban's upcoming-films page and print the fields of each table row.
    '''
    page_html = urllib.urlopen('https://movie.douban.com/coming').read()
    print 'html_length: ', len(page_html)
    hdoc = etree.HTML(page_html)
    htree = etree.ElementTree(hdoc)
    for i in range(1, 80):
        # XPath indices are 1-based; all five fields share this row prefix.
        row = '//*[@id="content"]/div/div[1]/table/tbody/tr[' + str(i) + ']'
        try:
            times = htree.xpath(row + '/td[1]/text()')     # release date
            name = htree.xpath(row + '/td[2]/a/text()')    # title (link text)
            types = htree.xpath(row + '/td[3]/text()')     # genres
            zone = htree.xpath(row + '/td[4]/text()')      # country or region
            number = htree.xpath(row + '/td[5]/text()')    # head count shown in the last column
            print 'times: ', times
            print 'name: ', name
            print 'types: ', types
            print 'zone: ', zone
            print 'number: ', number
        except Exception:
            # Skip a row whose extraction fails and carry on with the next one.
            pass


if __name__ == '__main__':
    doubanComingMovieSpider()

The implementation is very simple, with nothing complicated in it. The output looks like this:

html_length:  55682
times:  [u'\n                11\u670807\u65e5\n\n            \n            ']
name:  [u'\u8d8a\u57df\u91cd\u751f']
types:  [u'\n                \u52a8\u4f5c / \u72af\u7f6a / \u60ca\u609a\n            ']
zone:  [u'\n                \u7f8e\u56fd\n            ']
number:  [u'\n                565\u4eba\n            ']
times:  [u'\n                11\u670807\u65e5\n\n            \n            ']
name:  [u'\u90a3\u5ea7\u6865']
types:  [u'\n                \u5267\u60c5 / \u5bb6\u5ead\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                123\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u51b3\u6218\u4e2d\u9014\u5c9b']
types:  [u'\n                \u5267\u60c5 / \u5386\u53f2 / \u6218\u4e89\n            ']
zone:  [u'\n                \u7f8e\u56fd\n            ']
number:  [u'\n                19179\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u5ba0\u7269\u8054\u76df']
types:  [u'\n                \u559c\u5267 / \u52a8\u753b\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u5fb7\u56fd\n            ']
number:  [u'\n                13110\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u53d7\u76ca\u4eba']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                10459\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u6211\u7684\u62f3\u738b\u7537\u53cb']
types:  [u'\n                \u7231\u60c5 / \u8fd0\u52a8\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                1517\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u6b66\u6797\u5b64\u513f']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                1282\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u8ba2\u4eb2']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                797\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u9ec4\u82b1\u5858\u5f80\u4e8b']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                563\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u5269\u5973\u89c5\u7231\u8bb0']
types:  [u'\n                \u5267\u60c5 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                528\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u4e00\u4e2a\u4eba\u7684\u57ce\u5e02']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                85\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u7231\u60c5\u56fe\u9274\u4e4b\u6697\u604b']
types:  [u'\n                \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                57\u4eba\n            ']
times:  [u'\n                11\u670808\u65e5\n\n            \n            ']
name:  [u'\u5c0f\u5fc3\u201c\u9677\u9631\u201d']
types:  [u'\n                \u559c\u5267\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                35\u4eba\n            ']
times:  [u'\n                11\u670809\u65e5\n\n            \n            ']
name:  [u'\u81f4\u656c\u82f1\u96c4']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                146\u4eba\n            ']
times:  [u'\n                11\u670810\u65e5\n\n            \n            ']
name:  [u'\u642d\u79cb\u5343\u7684\u4eba']
types:  [u'\n                \u5267\u60c5 / \u5bb6\u5ead\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                222\u4eba\n            ']
times:  [u'\n                11\u670811\u65e5\n\n            \n            ']
name:  [u'\u4ed6\u4eec\u5df2\u4e0d\u518d\u53d8\u8001']
types:  [u'\n                \u5386\u53f2 / \u6218\u4e89 / \u7eaa\u5f55\u7247\n            ']
zone:  [u'\n                \u82f1\u56fd / \u65b0\u897f\u5170\n            ']
number:  [u'\n                34460\u4eba\n            ']
times:  [u'\n                11\u670812\u65e5\n\n            \n            ']
name:  [u'\u5c0f\u8f7f\u8f66']
types:  [u'\n                \u5267\u60c5 / \u513f\u7ae5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                49\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u6d77\u4e0a\u94a2\u7434\u5e08']
types:  [u'\n                \u5267\u60c5 / \u97f3\u4e50\n            ']
zone:  [u'\n                \u610f\u5927\u5229\n            ']
number:  [u'\n                276733\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u76d7\u68a6\u7279\u653b\u961f']
types:  [u'\n                \u72af\u7f6a / \u52a8\u753b\n            ']
zone:  [u'\n                \u5308\u7259\u5229\n            ']
number:  [u'\n                70562\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u9ea6\u5b50\u7684\u76d6\u5934']
types:  [u'\n                \u5267\u60c5 / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                8836\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u9739\u96f3\u5a07\u5a03']
types:  [u'\n                \u52a8\u4f5c / \u5192\u9669\n            ']
zone:  [u'\n                \u7f8e\u56fd\n            ']
number:  [u'\n                8813\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u957f\u5b89\u9053']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                8217\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u5927\u7ea6\u5728\u51ac\u5b63']
types:  [u'\n                \u5267\u60c5 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                4809\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u840c\u5ba0\u7279\u5de5\u961f']
types:  [u'\n                \u52a8\u753b / \u5192\u9669 / \u5bb6\u5ead\n            ']
zone:  [u'\n                \u5fb7\u56fd / \u6bd4\u5229\u65f6\n            ']
number:  [u'\n                1696\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u90a3\u4e00\u591c\uff0c\u6211\u7ed9\u4f60\u5f00\u8fc7\u8f66']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                827\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u7236\u5b50\u62f3\u738b']
types:  [u'\n                \u8fd0\u52a8 / \u5bb6\u5ead\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                220\u4eba\n            ']
times:  [u'\n                11\u670815\u65e5\n\n            \n            ']
name:  [u'\u64bc\u5c71\u7476']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                82\u4eba\n            ']
times:  [u'\n                11\u670816\u65e5\n\n            \n            ']
name:  [u'\u4e00\u8f66\u56db\u4ec6']
types:  [u'\n                \u559c\u5267\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                71\u4eba\n            ']
times:  [u'\n                11\u670818\u65e5\n\n            \n            ']
name:  [u'\u6211\u5728\u539f\u5730\u7b49\u4f60']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                163\u4eba\n            ']
times:  [u'\n                11\u670819\u65e5\n\n            \n            ']
name:  [u'\u706b\u7ea2\u9752\u6625']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                39\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u51b0\u96ea\u5947\u7f182']
types:  [u'\n                \u559c\u5267 / \u52a8\u753b / \u5192\u9669\n            ']
zone:  [u'\n                \u7f8e\u56fd\n            ']
number:  [u'\n                39712\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u522b\u544a\u8bc9\u5979']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267\n            ']
zone:  [u'\n                \u7f8e\u56fd\n            ']
number:  [u'\n                28430\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u4f60\u662f\u51f6\u624b']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                4883\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u96f7\u7c73\u5947\u9047\u8bb0']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u6cd5\u56fd\n            ']
number:  [u'\n                2990\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u8ffd\u51f6\u5341\u4e5d\u5e74']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                2971\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u5f81\u9014']
types:  [u'\n                \u52a8\u4f5c / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                2262\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u72ac\u7231']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                210\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u5a31\u4e50\u8ffd\u51fb']
types:  [u'\n                \u559c\u5267 / \u52a8\u4f5c\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                54\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u9aa8\u74f7']
types:  [u'\n                \u6050\u6016\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                25\u4eba\n            ']
times:  [u'\n                11\u670822\u65e5\n\n            \n            ']
name:  [u'\u7231\xb7\u4e4b\u75d5']
types:  [u'\n                \u5267\u60c5 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                24\u4eba\n            ']
times:  [u'\n                11\u670828\u65e5\n\n            \n            ']
name:  [u'\u5f52\u53bb']
types:  [u'\n                \u5267\u60c5 / \u5bb6\u5ead\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                413\u4eba\n            ']
times:  [u'\n                11\u670829\u65e5\n\n            \n            ']
name:  [u'\u4e24\u53ea\u8001\u864e']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                11856\u4eba\n            ']
times:  [u'\n                11\u670829\u65e5\n\n            \n            ']
name:  [u'\u5e73\u539f\u4e0a\u7684\u590f\u6d1b\u514b']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267 / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                7393\u4eba\n            ']
times:  [u'\n                11\u670829\u65e5\n\n            \n            ']
name:  [u'\u8863\u67dc\u91cc\u7684\u5192\u9669\u738b']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267\n            ']
zone:  [u'\n                \u6cd5\u56fd / \u5370\u5ea6\n            ']
number:  [u'\n                4128\u4eba\n            ']
times:  [u'\n                11\u670829\u65e5\n\n            \n            ']
name:  [u'\u4e00\u751f\u6709\u4f60']
types:  [u'\n                \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                2247\u4eba\n            ']
times:  [u'\n                11\u670829\u65e5\n\n            \n            ']
name:  [u'\u51b0\u5cf0\u66b4']
types:  [u'\n                \u52a8\u4f5c / \u707e\u96be\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                1262\u4eba\n            ']
times:  [u'\n                12\u670806\u65e5\n\n            \n            ']
name:  [u'\u5357\u65b9\u8f66\u7ad9\u7684\u805a\u4f1a']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u6cd5\u56fd\n            ']
number:  [u'\n                88168\u4eba\n            ']
times:  [u'\n                12\u670806\u65e5\n\n            \n            ']
name:  [u'\u5439\u54e8\u4eba']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                16836\u4eba\n            ']
times:  [u'\n                12\u670806\u65e5\n\n            \n            ']
name:  [u'\u7f51\u7edc\u51f6\u94c3']
types:  [u'\n                \u60ca\u609a / \u6050\u6016\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                106\u4eba\n            ']
times:  [u'\n                12\u670806\u65e5\n\n            \n            ']
name:  [u'\u8ff7\u5c40\u4f0f\u9999']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u4e2d\u56fd\u6fb3\u95e8\n            ']
number:  [u'\n                28\u4eba\n            ']
times:  [u'\n                12\u670807\u65e5\n\n            \n            ']
name:  [u'\u5170\u5fc3\u5927\u5267\u9662']
types:  [u'\n                \u5267\u60c5 / \u52a8\u4f5c\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                65738\u4eba\n            ']
times:  [u'\n                12\u670807\u65e5\n\n            \n            ']
name:  [u'\u9f99\u4e4b\u8c37\uff1a\u7834\u6653\u5947\u5175']
types:  [u'\n                \u52a8\u753b / \u5947\u5e7b / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                7690\u4eba\n            ']
times:  [u'\n                12\u670812\u65e5\n\n            \n            ']
name:  [u'\u5929\u706b']
types:  [u'\n                \u52a8\u4f5c / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                578\u4eba\n            ']
times:  [u'\n                12\u670813\u65e5\n\n            \n            ']
name:  [u'\u88ab\u5149\u6293\u8d70\u7684\u4eba']
types:  [u'\n                \u5267\u60c5 / \u7231\u60c5 / \u79d1\u5e7b\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                6113\u4eba\n            ']
times:  [u'\n                12\u670820\u65e5\n\n            \n            ']
name:  [u'\u53f6\u95ee4']
types:  [u'\n                \u4f20\u8bb0 / \u52a8\u4f5c / \u5386\u53f2\n            ']
zone:  [u'\n                \u4e2d\u56fd\u9999\u6e2f\n            ']
number:  [u'\n                8283\u4eba\n            ']
times:  [u'\n                12\u670820\u65e5\n\n            \n            ']
name:  [u'\u53ea\u6709\u82b8\u77e5\u9053']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                4500\u4eba\n            ']
times:  [u'\n                12\u670820\u65e5\n\n            \n            ']
name:  [u'\u8bef\u6740']
types:  [u'\n                \u5267\u60c5 / \u72af\u7f6a\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                1574\u4eba\n            ']
times:  [u'\n                12\u670820\u65e5\n\n            \n            ']
name:  [u'\u767d\u65e5\u8ff7\u96fe']
types:  [u'\n                \u72af\u7f6a / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                456\u4eba\n            ']
times:  [u'\n                12\u670824\u65e5\n\n            \n            ']
name:  [u'\u8fc7\u6e21\u7a7a\u95f4']
types:  [u'\n                \u52a8\u4f5c / \u79d1\u5e7b / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                952\u4eba\n            ']
times:  [u'\n                12\u670828\u65e5\n\n            \n            ']
name:  [u'\u7ad9\u4f4f\uff01\u5c0f\u5077']
types:  [u'\n                \u559c\u5267\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                20521\u4eba\n            ']
times:  [u'\n                12\u670828\u65e5\n\n            \n            ']
name:  [u'\u4e2d\u534e\u718a\u732b']
types:  [u'\n                \u52a8\u753b\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                146\u4eba\n            ']
times:  [u'\n                12\u670831\u65e5\n\n            \n            ']
name:  [u'\u5ba0\u7231']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                4201\u4eba\n            ']
times:  [u'\n                12\u6708\n\n            \n            ']
name:  [u'\u6625\u6c5f\u6c34\u6696']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                8180\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670801\u65e5\n\n            \n            ']
name:  [u'\u8d1d\u80af\u718a2\uff1a\u91d1\u724c\u7279\u5de5']
types:  [u'\n                \u52a8\u753b\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                186\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670801\u65e5\n\n            \n            ']
name:  [u'\u963f\u91cc\u5df4\u5df4\u4e0e\u795e\u706f']
types:  [u'\n                \u52a8\u753b / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                93\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670811\u65e5\n\n            \n            ']
name:  [u'\u5c71\u6d77\u7ecf\u4e4b\u5c0f\u4eba\u56fd']
types:  [u'\n                \u513f\u7ae5 / \u52a8\u753b / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u7f8e\u56fd\n            ']
number:  [u'\n                264\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u5510\u4eba\u8857\u63a2\u68483']
types:  [u'\n                \u559c\u5267 / \u52a8\u4f5c / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                81628\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u59dc\u5b50\u7259']
types:  [u'\n                \u52a8\u753b\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                31699\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u4e2d\u56fd\u5973\u6392']
types:  [u'\n                \u5267\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u4e2d\u56fd\u9999\u6e2f\n            ']
number:  [u'\n                24300\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u7d27\u6025\u6551\u63f4']
types:  [u'\n                \u5267\u60c5 / \u52a8\u4f5c / \u707e\u96be\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                20230\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u56e7\u5988']
types:  [u'\n                \u5267\u60c5 / \u559c\u5267\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                18789\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u6025\u5148\u950b']
types:  [u'\n                \u52a8\u4f5c / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                4625\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u91d1\u7985\u964d\u9b54']
types:  [u'\n                \u52a8\u4f5c / \u5947\u5e7b / \u5192\u9669\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                749\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u5927\u7ea2\u5305']
types:  [u'\n                \u559c\u5267 / \u7231\u60c5\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                511\u4eba\n            ']
times:  [u'\n                2020\u5e7401\u670825\u65e5\n\n            \n            ']
name:  [u'\u5446\u74dc\u5144\u5f1f\u4e4b\u5feb\u4e50\u51ac\u5929']
types:  [u'\n                \u559c\u5267 / \u52a8\u753b\n            ']
zone:  [u'\n                \u6377\u514b / \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                262\u4eba\n            ']
times:  [u'\n                2020\u5e7406\u670821\u65e5\n\n            \n            ']
name:  [u'\u516d\u6708\u7684\u79d8\u5bc6']
types:  [u'\n                \u5267\u60c5 / \u60ac\u7591 / \u97f3\u4e50\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646 / \u7f8e\u56fd\n            ']
number:  [u'\n                1703\u4eba\n            ']
times:  [u'\n                2020\u5e7410\u670801\u65e5\n\n            \n            ']
name:  [u'\u9ed1\u8272\u5047\u9762']
types:  [u'\n                \u5267\u60c5 / \u60ac\u7591\n            ']
zone:  [u'\n                \u4e2d\u56fd\u5927\u9646\n            ']
number:  [u'\n                5038\u4eba\n            ']
[Finished in 1.5s]
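
The script above only runs on Python 2 (`urllib.urlopen`, the `print` statement, `reload(sys)`). For readers on Python 3, a rough equivalent might look like the sketch below. It is not the original method: it swaps the lxml XPath lookups for the standard library's `html.parser` and simply collects the text of every `<td>` in every `<tr>`. The names `TableRowParser`, `parse_rows` and `fetch_coming_movies` are made up for illustration, and both the `User-Agent` header and the UTF-8 decoding are assumptions about what the site expects and serves.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (such as the <a> around the title) lands here too.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse the newlines and indentation that pad each value.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_rows(html_text):
    """Return one list of cell strings per table row found in html_text."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


def fetch_coming_movies(url="https://movie.douban.com/coming"):
    """Download the page and parse its table rows (needs network access)."""
    # Assumption: a browser-like User-Agent is sent in case the default is rejected.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return parse_rows(response.read().decode("utf-8"))
```

Calling `fetch_coming_movies()` would return one list per film in the order date, title, genres, region, head count, provided the page still uses the same five-column table.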

Feel free to take the script and use it. I print the data as-is without any cleaning, so process and analyze it however you need.
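
As a starting point for that cleaning, here is a minimal sketch written for Python 3. It assumes each field arrives the way the script prints it: a list of strings padded with newlines and spaces, genres and regions separated by " / ", and the last column holding digits followed by 人. The helpers `clean_field` and `clean_row` are hypothetical names, and the dictionary keys simply reuse the script's variable names.

```python
import re


def clean_field(values):
    """Join an xpath text() result (a list of strings) and collapse whitespace."""
    return " ".join(" ".join(values).split())


def split_field(values):
    """Split a ' / '-separated field such as genres or regions into a list."""
    return [part.strip() for part in clean_field(values).split("/") if part.strip()]


def clean_row(times, name, types, zone, number):
    """Turn the five raw lists printed by the scraper into one tidy dict."""
    digits = re.search(r"\d+", clean_field(number))
    return {
        "times": clean_field(times),
        "name": clean_field(name),
        "types": split_field(types),
        "zone": split_field(zone),
        # Keep only the leading digits, e.g. 565 from "565人"; None if absent.
        "number": int(digits.group()) if digits else None,
    }


if __name__ == "__main__":
    # First record from the output above, written with the same escapes.
    row = clean_row(
        ["\n                11\u670807\u65e5\n\n            \n            "],
        ["\u8d8a\u57df\u91cd\u751f"],
        ["\n                \u52a8\u4f5c / \u72af\u7f6a / \u60ca\u609a\n            "],
        ["\n                \u7f8e\u56fd\n            "],
        ["\n                565\u4eba\n            "],
    )
    print(row)
```

Dates are left as strings on purpose, because some rows carry only a month (for example 12月) and cannot be parsed into a full date.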
