在前面博客中写过python爬虫爬取各个城市历史天气及数据可视化,感觉还是有点粗糙,本文章是借这个例子抛砖引玉,写出个更NB的python全栈项目。

此项目预览页面全国各城市天气可视化系统

scrapy抓取全国各城市的历史天气

创建项目及准备工作

创建项目

scrapy startproject lishitianqi

创建爬虫脚本文件(考虑到tianqihoubao.com访问速度实在是太慢,现在换用lishi.tianqi.com)

cd lishitianqi
scrapy genspider weath lishi.tianqi.com

测试该url是否可以直接爬取 

scrapy crawl weath
...
2019-09-09 13:22:21 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 http://lishi.tianqi.com/>: HTTP status code is not handled or not allowed
...

在测试是否是user-agent导致

In [15]: import requestsIn [16]: requests.get("https://lishi.tianqi.com/")
Out[16]: <Response [403]>In [17]: requests.get("https://lishi.tianqi.com/",headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl...: eWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'})
Out[17]: <Response [200]>

在setings.py中设置user-agent的值

USER_AGENT='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'

编写爬虫文件获取天气数据

请求http://lishi.tianqi.com/后将返回全国所有城市页面,通过css选择器选择所有城市url,并添加到爬虫中

def parse(self, response):#选择中国所有城市名及对应urlcitys = response.css(".bcity a[target='_blank']::text").getall()cityurls = response.css(".bcity a[target='_blank']::attr(href)").getall()#将url添加到爬虫中,for city,url in zip(citys,cityurls):yield scrapy.Request(response.urljoin(url), meta={'city':city}, callback=self.city)def city(self, response):pass

请求http://lishi.tianqi.com/beijing/index.html,返回北京2011年1月-至今的所有月份页面,通过css选择器选择该城市所有月份url,并添加到爬虫中

def city(self, response):#选择城市所有年月份及对应urlmonths = response.css(".tqtongji1 a::text").getall()monthurls = response.css(".tqtongji1 a::attr(href)").getall()#将url添加到爬虫中for url in monthurls:yield scrapy.Request(response.urljoin(url), meta={'city':response.meta['city']}, callback=self.month)def month(self, response):pass

请求http://lishi.tianqi.com/beijing/201101.html,返回北京2011年1月天气页面,通过css选择器选择该月的每天的天气

    def month(self, response):#选择该月份中每天的天气数据weaths =  response.css(".tqtongji2 ul[class!='t1'] li::text").getall()#日期date = response.css(".tqtongji2 ul[class!='t1'] a::text").getall()#最大温度maxt = weaths[0::5]#最小温度mint = weaths[1::5]#天气weath = weaths[2::5]#风向windd = weaths[3::5]#风力windp = weaths[4::5]#打印该数据for date, maxt, mint, weath, windd, windp in zip(date, maxt, mint, weath, windd, windp):print(response.meta['city'],date, maxt, mint, weath, windd, windp)

scrapy数据存储到MariaDB

创建数据库和表

MariaDB [(none)]> create database lishitianqi;
Query OK, 1 row affected (0.209 sec)MariaDB [(none)]> use lishitianqi;
Database changed
MariaDB [lishitianqi]> CREATE TABLE `lishitianqi` (`id` int(11) NOT NULL AUTO_INCREMENT,`city` varchar(16) NOT NULL,`date` date NOT NULL,`maxt` int(11) NOT NULL,`mint` int(11) NOT NULL,`weath` varchar(20) NOT NULL,`windd` varchar(20) NOT NULL,`windp` varchar(40) DEFAULT NULL,PRIMARY KEY (`id`),UNIQUE KEY `city` (`city`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=30673 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (2.030 sec)

编写模型文件items.py

class LishitianqiItem(scrapy.Item):# define the fields for your item here like:city = scrapy.Field()date = scrapy.Field()maxt = scrapy.Field()mint = scrapy.Field()weath = scrapy.Field()windd = scrapy.Field()windp = scrapy.Field()

修改爬虫文件weath.py

 1 # -*- coding: utf-8 -*-
 2 import scrapy
 3 from lishitianqi.items import LishitianqiItem
 4
 5 class WeathSpider(scrapy.Spider):
 6     name = 'weath'
 7     allowed_domains = ['lishi.tianqi.com']
 8     start_urls = ['http://lishi.tianqi.com/']
 9
10     def parse(self, response):
11         #选择中国所有城市名及对应url
12         citys = response.css(".bcity a[target='_blank']::text").getall()
13         cityurls = response.css(".bcity a[target='_blank']::attr(href)").getall()
14
15         #将url添加到爬虫中,
16         for city,url in zip(citys[:1],cityurls[:1]):
17             yield scrapy.Request(response.urljoin(url), meta={'city':city}, callback=self.city)
18
19     def city(self, response):
20         #选择城市所有年月份及对应url
21         months = response.css(".tqtongji1 a::text").getall()
22         monthurls = response.css(".tqtongji1 a::attr(href)").getall()
23
24         #将url添加到爬虫中
25         for url in monthurls:
26             yield scrapy.Request(response.urljoin(url), meta={'city':response.meta['city']}, callback=self.month)
27
28     def month(self, response):
29         #选择该月份中每天
30         for ul in response.css(".tqtongji2 ul[class!='t1']"):
31             #创建item,将数据传递个pipeline
32             item = LishitianqiItem()
33             #获取城市名
34             item['city'] = response.meta['city']
35
36             a,b,c,d,e,f = ul.css('li')
37             #获取日期
38             item['date'] = a.css("::text").get()
39             #获取日期
40             item['maxt'] = b.css("::text").get()
41             #获取日期
42             item['mint'] = c.css("::text").get()
43             #获取日期
44             item['weath'] = d.css("::text").get()
45             #获取日期
46             item['windd'] = e.css("::text").get()
47             #获取日期
48             item['windp'] = f.css("::text").get()
49
50             yield item

View Code

编写项目管道中间件pipeline.py

import pymysql.cursorsclass LishitianqiPipeline(object):def __init__(self):#连接数据库self.connect = pymysql.connect(host='localhost',port=3306,db='lishitianqi',user='root',passwd='123',charset='utf8',use_unicode=True)#通过cursor执行增删该查self.cursor = self.connect.cursor()def process_item(self, item, spider):#插入天气数据
        self.cursor.execute("""insert into lishitianqi(city, date, maxt, mint, weath, windd, windp)values (%s, %s, %s, %s, %s, %s, %s)""",(item['city'],item['date'],item['maxt'],item['mint'],item['weath'],item['windd'],item['windp']))#提交sql语句
        self.connect.commit()#必须实现返回return item

在settings.py中启用mysqlPipeline

ITEM_PIPELINES = {'lishitianqi.pipelines.LishitianqiPipeline': 300,#格式为:'项目名.文件名.类名':优先级(越小越大)
}

爬虫性能优化

首先看一下目标数据有多大,估计要多长时间能抓取完

查看有多少个城市

#scrapy shell https://lishi.tianqi.com/
In [1]: citys = response.css(".bcity a[target='_blank']::text").getall()In [2]: len(citys)
Out[2]: 3147

将weath.py的16行代码修改为for city,url in zip(citys[:1],cityurls[:1]):,运行爬虫,验证爬取一个城市数据耗时

2019-09-09 17:37:52 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: lishitianqi)...
2019-09-09 17:41:54 [scrapy.core.engine] INFO: Spider closed (finished)

一个城市耗时4分2s,共需要8.7天才能抓取到所有数据。在我前面写MySQL日志详解时,将二进制日志打开了,这个对性能影响比较大

关掉二进制日志,在重启服务

[mysqld]
#log_bin=bin_log_file
#binlog_format=row
#sync_binlog=1
#server_id=7  

查看是否生效

MariaDB [(none)]> show variables like "log_bin";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | OFF   |
+---------------+-------+
1 row in set (0.001 sec)

运行爬虫,验证关闭了二进制日志后抓取一个城市数据耗时

2019-09-09 17:50:14 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: lishitianqi)
...
2019-09-09 17:53:01 [scrapy.core.engine] INFO: Spider closed (finished)

耗时2分47秒,这个数据还是有点大,验证不插入数据库纯爬取网页的耗时,将weath.py第50行代码注释掉

          #yield item

运行爬虫,验证不写入数据库,抓取一个城市数据耗时  

2019-09-09 18:40:07 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: lishitianqi)
2019-09-09 18:40:12 [scrapy.core.engine] INFO: Spider closed (finished)

不写入数据库只需要5s,看来还是要在插入数据库效率上找原因。

在上面的代码实现,是以天为单位调用插入接口,可以按月批量插入来提上效率

修改后的weath.py

 1 # -*- coding: utf-8 -*-
 2 import scrapy
 3 from lishitianqi.items import LishitianqiItem
 4
 5 class WeathSpider(scrapy.Spider):
 6     name = 'weath'
 7     allowed_domains = ['lishi.tianqi.com']
 8     start_urls = ['http://lishi.tianqi.com/']
 9
10     def parse(self, response):
11         #选择中国所有城市名及对应url
12         citys = response.css(".bcity a[target='_blank']::text").getall()
13         cityurls = response.css(".bcity a[target='_blank']::attr(href)").getall()
14
15         #将url添加到爬虫中,
16         for city,url in zip(citys[:1],cityurls[:1]):
17             yield scrapy.Request(response.urljoin(url), meta={'city':city}, callback=self.city)
18
19     def city(self, response):
20         #选择城市所有年月份及对应url
21         months = response.css(".tqtongji1 a::text").getall()
22         monthurls = response.css(".tqtongji1 a::attr(href)").getall()
23
24         #将url添加到爬虫中
25         for url in monthurls:
26             yield scrapy.Request(response.urljoin(url), meta={'city':response.meta['city']}, callback=self.month)
27
28     def month(self, response):
29         data = []
30         #选择该月份中每天
31         for ul in response.css(".tqtongji2 ul[class!='t1']"):
32             data.append(tuple([response.meta['city']] + [li.css("::text").get() for li in ul.css('li')]))
33
34         #创建item,将数据传递个pipeline
35         item = LishitianqiItem()
36         item['data'] = data
37         yield item

View Code

修改后的items.py

 1 # -*- coding: utf-8 -*-
 2
 3 # Define here the models for your scraped items
 4 #
 5 # See documentation in:
 6 # https://docs.scrapy.org/en/latest/topics/items.html
 7
 8 import scrapy
 9
10 class LishitianqiItem(scrapy.Item):
11     # define the fields for your item here like:
12     city = scrapy.Field()
13     date = scrapy.Field()
14     maxt = scrapy.Field()
15     mint = scrapy.Field()
16     weath = scrapy.Field()
17     windd = scrapy.Field()
18     windp = scrapy.Field()
19     data = scrapy.Field()

View Code

修改后的pipelines.py

 1 # -*- coding: utf-8 -*-
 2
 3 # Define your item pipelines here
 4 #
 5 # Don't forget to add your pipeline to the ITEM_PIPELINES setting
 6 # See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
 7
 8 import pymysql.cursors
 9
10 class LishitianqiPipeline(object):
11
12     def __init__(self):
13         #连接数据库
14         self.connect = pymysql.connect(
15             host='localhost',
16             port=3306,
17             db='lishitianqi',
18             user='root',
19             passwd='123',
20             charset='utf8',
21             use_unicode=True)
22
23         #通过cursor执行增删该查
24         self.cursor = self.connect.cursor()
25
26     def process_item(self, item, spider):
27
28         #插入天气数据
29         self.cursor.executemany(
30             """insert ignore into lishitianqi(city, date, maxt, mint, weath, windd, windp)
31             values (%s, %s, %s, %s, %s, %s, %s)""", item['data'])
32
33         #提交sql语句
34         self.connect.commit()
35
36         #必须实现返回
37         return item

View Code

抓取一个城市数据耗时

2019-09-09 19:30:45 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: lishitianqi)
2019-09-09 19:30:56 [scrapy.core.engine] INFO: Spider closed (finished)

耗时11s,抓取所有数据共计9.6小时,这个时间可以接受了,睡个觉第二天就好了

数据可靠性优化及修复

从网站抓取来的数据并非可靠,在前面代码中或许你们发现了,在抓取每天天气数据只使用了一次css选择,然后切片分出各种类型的天气,

但是这种方式严格依赖数据格式,如果少了一个数据就会导致数据整体偏移,切片出来的结果混乱,后面我多次css选择解决了这个问题。

除了上面描述的一天中某种类型天气数据缺失,还可能有某天数据缺失,比如:

MariaDB [lishitianqi]> select * from lishitianqi where city="苏右旗" and date like "2011-01%";
+-------+--------+------------+------+------+-----------+---------------+-------------+
| id    | city   | date       | maxt | mint | weath     | windd         | windp       |
+-------+--------+------------+------+------+-----------+---------------+-------------+
| 20905 | 苏右旗 | 2011-01-01 |  -13 |  -23 | 晴        | 东北风~东风   | 4-5级       |
| 20906 | 苏右旗 | 2011-01-02 |  -12 |  -23 | 多云      | 东风~西南风   | 5-6级~4-5级 |
| 20907 | 苏右旗 | 2011-01-03 |  -13 |  -24 | 多云      | 西南风~西北风 | 4-5级~5-6级 |
| 20908 | 苏右旗 | 2011-01-04 |  -15 |  -25 | 小雪~多云 | 西北风        | 5-6级~4-5级 |
| 20909 | 苏右旗 | 2011-01-05 |  -17 |  -24 | 多云      | 西北风        | 5-6级~4-5级 |
| 20910 | 苏右旗 | 2011-01-06 |  -15 |  -22 | 晴        | 西北风~西风   | 5-6级~4-5级 |
| 20911 | 苏右旗 | 2011-01-07 |  -16 |  -24 | 晴        | 西北风        | 5-6级~4-5级 |
| 20912 | 苏右旗 | 2011-01-08 |  -18 |  -27 | 晴        | 西风          | 5-6级~4-5级 |
| 20913 | 苏右旗 | 2011-01-09 |  -12 |  -23 | 多云      | 西风          | 5-6级~4-5级 |
| 20914 | 苏右旗 | 2011-01-10 |  -16 |  -23 | 晴        | 西风          | 5-6级~4-5级 |
| 20915 | 苏右旗 | 2011-01-11 |  -16 |  -28 | 晴~多云   | 西风          | 4-5级       |
| 20916 | 苏右旗 | 2011-01-12 |  -17 |  -27 | 小雪~多云 | 西北风        | 5-6级~4-5级 |
| 20917 | 苏右旗 | 2011-01-13 |  -18 |  -26 | 多云      | 西北风        | 4-5级~5-6级 |
| 20918 | 苏右旗 | 2011-01-14 |  -14 |  -28 | 晴        | 北风~东北风   | 4-5级       |
| 20919 | 苏右旗 | 2011-01-15 |  -15 |  -25 | 多云~晴   | 西北风~北风   | 5-6级~4-5级 |
| 20920 | 苏右旗 | 2011-01-16 |  -18 |  -30 | 小雪~多云 | 西北风~西风   | 5-6级~4-5级 |
| 20921 | 苏右旗 | 2011-01-26 |  -12 |  -26 | 晴        | 西北风        | 5-6级       |
| 20922 | 苏右旗 | 2011-01-29 |  -14 |  -25 | 多云~晴   | 西北风        | 4-5级~5-6级 |
| 20923 | 苏右旗 | 2011-01-30 |  -10 |  -19 | 晴        | 西北风        | 4-5级~5-6级 |
| 20924 | 苏右旗 | 2011-01-31 |   -6 |  -19 | 晴        | 西风          | 4-5级~5-6级 |
+-------+--------+------------+------+------+-----------+---------------+-------------+
20 rows in set (0.002 sec)

其中缺失了17日-25日数据,我的解决方案是:若有缺失数据,尝试从备用网站获取,将数据补全,若备用网站没有该城市或同样缺失该数据的按缺失区间取平均值补全。

备用网站tianqihoubao.com,创建一个数据修复蜘蛛

scrapy genspider recovery tianqihoubao.com

将https://lishi.tianqi.com/支持查询的所有城市列表存入citys_primary.txt

#scrapy shell https://lishi.tianqi.com/
...
In [1]: citys = response.css(".bcity a[target='_blank']::text").getall()In [2]: f = open("citys_primary.txt","+w",encoding="utf-8")In [3]: f.write(str(citys))
Out[3]: 19326In [4]: f.close()

将http://www.tianqihoubao.com/lishi/支持查询的所有城市列表存入citys_backup.txt

scrapy shell http://www.tianqihoubao.com/lishi/
...
In [9]: text = response.body.decode("GB2312")In [10]: from scrapy.selector import SelectorIn [11]: sel = Selector(text=text)In [12]: citys = sel.css(".citychk dd a::text").getall()

In [13]: citys = [city.strip(' ') for city in citys]In [14]: f = open("citys_backup.txt","+w",encoding="utf-8")In [15]: f.write(str(citys))
Out[15]: 2601In [16]: f.close()

爬虫文件recovery.py

 1 # -*- coding: utf-8 -*-
 2 import scrapy
 3 import re
 4 from lishitianqi.items import LishitianqiItem
 5
 6 class RecoverySpider(scrapy.Spider):
 7     name = 'recovery'
 8     allowed_domains = ['tianqihoubao.com']
 9
10     def parse(self, response):
11         tableTitle = response.css(".wdetail h1::text").get()
12         #获取城市名,为了防止出现空列表导致KeyError故用join,''.join([]) == ''
13         city = ''.join(re.findall("([\u4E00-\u9FA5]+)历史天气预报",tableTitle))
14
15         data = []
16         for tr in response.css(".wdetail table tr")[1:]:
17             tds = tr.css("td")
18             #获取日期
19             date = tds[0].css("a::text").get()
20             dl = re.findall("(\d{4})年(\d{2})月(\d{2})日",date)
21             date = '-'.join(dl[0])  if dl!= [] else '1990-00-00'
22
23             #获取天气
24             weath = tds[1].css("::text").get()
25             wl = re.findall("[\u4E00-\u9FA5]+", weath)
26             weath = '~'.join(wl) if len(wl) == 2 else '未知~未知'
27
28             #获取最低、最高气温
29             tem = tds[2].css("::text").get()
30             tl = re.findall("(\d+|-\d+)℃", tem)
31             maxt,mint = tl if len(tl) == 2 else [0,0]
32
33             #获取风向,风速度
34             win = tds[3].css("::text").get()
35             wil = re.findall("[\u4E00-\u9FA50-9-\u2265\u2264]+", win)
36             wil = [re.sub('[≤]','小于等于', w) for w in wil]
37             wil = [re.sub('[≥]','大于等于', w) for w in wil]
38             windd,windp = ('~'.join(wil[0::2]),'~'.join(wil[1::2])) if len(wil) == 4 else ("未知","未知")
39
40             data.append([city, date, maxt, mint, weath, windd, windp])
41
42         #创建item,将数据传递个pipeline
43         item = LishitianqiItem()
44         item['data'] = data
45         yield item

View Code

修改pipelines.py文件

 1 # -*- coding: utf-8 -*-
 2
 3 # Define your item pipelines here
 4 #
 5 # Don't forget to add your pipeline to the ITEM_PIPELINES setting
 6 # See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
 7
 8 import pymysql.cursors
 9
10 #用于获取某年某月有多少天,
11 import calendar
12
13 #用于获取汉字的拼音, pip install xpinyin
14 from xpinyin import Pinyin
15
16
17 class LishitianqiPipeline(object):
18
19     def __init__(self):
20         #连接数据库
21         self.connect = pymysql.connect(
22             host='localhost',
23             port=3306,
24             db='lishitianqi',
25             user='root',
26             passwd='123',
27             charset='utf8',
28             use_unicode=True)
29
30         #通过cursor执行增删该查
31         self.cursor = self.connect.cursor()
32
33     def process_item(self, item, spider):
34
35         #插入天气数据
36         self.cursor.executemany(
37             """insert ignore into lishitianqi(city, date, maxt, mint, weath, windd, windp)
38             values (%s, %s, %s, %s, %s, %s, %s)""", item['data'])
39
40         #提交sql语句
41         self.connect.commit()
42
43         #必须实现返回
44         return item
45
46     #在爬虫运行前执行这个函数
47     def open_spider(self, spider):
48
49         if spider.name == 'recovery' and True:
50             #读取https://lishi.tianqi.com/和http://www.tianqihoubao.com/lishi/支持查询的所有城市列表存, eval将字符串变为python语句, 转换为set用于取交集
51             citys_primary = set(eval(open("citys_primary.txt","r",encoding="utf-8").read()))
52             citys_backup = set(eval(open("citys_backup.txt","r",encoding="utf-8").read()))
53             startyear = 2011
54             endyear = 2019
55             endmonth = 9
56             pin = Pinyin()
57
58             def check_or_appendurl(city,year,month):
59                 #计算year年month月有多少天
60                 stmr = calendar.monthrange(year, month)[1]
61                 #通过sql获取city城市year年month月有多少天
62                 sql = "select count(city) from lishitianqi where city='%s' and date like '%d-%02d%%'" % (city,year,month)
63                 self.cursor.execute(sql)
64                 apmr = self.cursor.fetchall()[0][0]
65
66                 #若数据库中某城市某月的天数不等于该月的实际天数,说明这个城市这个月的数据有丢失,需要构造url到备用网站获取天气数据
67                 if stmr != apmr:
68                     url = "http://www.tianqihoubao.com/lishi/%s/month/%d%02d.html" % (pin.get_pinyin(city, ''),year,month)
69                     print(stmr,apmr,'append url: %s' % url)
70                     spider.start_urls.append(url)
71
72             #citys_primary为在https://lishi.tianqi.com/支持查询的所有城市列表存,citys_backup为http://www.tianqihoubao.com/lishi/支持查询的所有城市列表
73             #理想情况下爬虫https://lishi.tianqi.com/就能获取到所有需要的数据,但是从网站抓取来的数据并非可靠,有可能某个月的某几天数据丢失,
74             #所有需要到备用网站将数据库缺失数据补齐
75             #备用网站数据集与主用网站数据集不重合,所以在做数据修复的时候要取交集
76             for city in citys_primary & citys_backup:
77                 for year in range(startyear,endyear):
78                     for month in range(1,13):
79                         check_or_appendurl(city,year,month)
80
81                 for month in range(1,endmonth):
82                     check_or_appendurl(city,endyear,month)
83
84
85     def close_spider(self, spider):
86         pass

View Code

数据实时更新

每个月都会产生新的数据,如果重新运行前面较本过于耗时,需要创建一个更新数据的蜘蛛

scrapy genspider update lishi.tianqi.com

...  

关于乱码的终极解决方案

如果发现某个页面出现乱码,用scrapy shell登入该页面,respons.text为已经解码为字符串,由于解码格式不对会显示为乱码,response.body为未解码的字节流。

python库chardet可以根据字节流预测该字节流的编码格式,给定字节流越长,判断越准确

pip install chardet  

预测response.body编码格式

In [53]: import chardet

In [54]: chardet.detect(response.body)
Out[54]: {'encoding': 'GB2312', 'confidence': 0.99, 'language': 'Chinese'}

GB2312编码的概率为99%,将response.body用GB2312格式解码,就可以看到正确的结果了。

In [63]: text = response.body.decode("GB2312")In [64]: from scrapy.selector import SelectorIn [65]: sel = Selector(text=text)In [66]: sel.css("h1::text").get()
Out[66]: '全国历史天气查询'

  

django实现单页应用

需求分析

网站需要支持如下可视化功能(city代表城市,citys代表城市集,year代表年份,years代表年份集,month代表月份。这些字段都为输入项):

  • citys-years月平均气温(曲线图)
  • city-year各天气比率(饼图)
  • citys-years-month日平均气温(曲线图)
  • city-year-month各天气比率(饼图)

效果预览:pengliangyuan.club

创建django工程和app

创建django项目

django-admin startproject webapps

创建webapp  

cd webapps
python .\manage.py startapp history_weath

django使用mysql并引入现有数据库

在前面蜘蛛已经将databaselishitianqi,tablelishitianqi插入了数据,现在将这些数据引入到django中(根据数据库表生成django模型)

修改settings.py文件

DATABASES = {'default': {'ENGINE': 'django.db.backends.mysql','NAME': 'lishitianqi',   #数据库的名字'USER': 'root','PASSWORD':'123','HOST':'localhost','PORT':'3306',}
} 

修改__init__.py

import pymysql
pymysql.install_as_MySQLdb()

执行python manage.py inspectdb命令,根据mysql数据表生成django模型

#python manage.py inspectdb
...
class Lishitianqi(models.Model):city = models.CharField(max_length=16)date = models.DateField()maxt = models.IntegerField()mint = models.IntegerField()weath = models.CharField(max_length=20)windd = models.CharField(max_length=20)windp = models.CharField(max_length=40, blank=True, null=True)class Meta:managed = Falsedb_table = 'lishitianqi'unique_together = (('city', 'date'),)

将上述模型copy到app的模型文件中,运行python manage.py makemigrations创建这些更改的迁移

#python manage.py makemigrations history_weath
Migrations for 'history_weath':history_weath\migrations\0001_initial.py- Create model Lishitianqi

运行python manage.py migrate,将这些更改应用到数据库

#python manage.py migrate
...
Operations to perform:Apply all migrations: admin, auth, contenttypes, history_weath, sessions
Running migrations:Applying history_weath.0001_initial... OK

创建城市数据库表,并标记热门城市

创建数据表citys,用于表示该webapp支持的所有城市,以及热门城市

CREATE TABLE `citys` (`city` varchar(16) NOT NULL,`hot` tinyint(1) DEFAULT 0,PRIMARY KEY (`city`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

lishitianqi中查找大所有城市,插入到citys表中,字段distinct用于去重

insert into citys(city) select distinct city from lishitianqi;

添加热门城市

update citys set hot=1 where city in ("三亚","上海","乌鲁木齐","北京","南京","厦门","合肥","天津","广州","成都","拉萨","昆明","杭州","武汉","济南","深圳","西安","贵阳","郑州"," 重庆","长沙","青岛");

生成django模型,粘贴到history_weath/models.py,并执行迁移命令(操作方式同一小节)

路由

history_weath作为一个django app,访问该app的url格式是这样的 DomainName/history_weath/xxx,

编辑webapps/urls.py

from django.contrib import admin
from django.urls import path,include
from history_weath.views import index urlpatterns = [path('', index)                           #将DomainName重定向到DomainName/history_weath,见history_weath/views.py中的index函数(后面小节会附上)path('history_weath/', include('history_weath.urls')),   #引用history_weath/urls.py,访问url = 引用url + 被引用url path('admin/', admin.site.urls),,
]

新建history_weath/urls.py

from django.urls import path
from . import viewsapp_name = 'history_weath'
urlpatterns = [path('', views.home, name='home'),               #history_weath这个app的主页path('temperature_year', views.temperature_year),      #url: DomainName/history_weath/temperature_year, 用于ajax(POST)请求citys,years的月平均气温path('temperature_month', views.temperature_month),     #url: DomainName/history_weath/temprature_month, 用于ajax(POST)请求citys,years,month日平均气温path('weath_year_ratio', views.weath_year_ratio),      #url: DomainName/history_weath/weath_year_ratio, 用于ajax(POST)请求city,year天气比率path('weath_month_ratio', views.weath_month_ratio),     #url: DomainName/history_weath/weath_month_ration, 用于ajax(POST)请求city,year,month天气比率
]

视图

history_weath为单页应用,index用于网站根url到app主页的重定向,home为单页应用的主页,其他的视图个为只专门处理ajax(POST)请求的。

history_weath/views.py

from django import views
from django.http import HttpResponse
from django.shortcuts import render, redirect,reverse
from .models import Lishitianqi, Citys
import django.utils.timezone as timezone
from xpinyin import Pinyin
import string
import json
from django.http import Http404
from django.db.models import Avg#将DomainName重定向到DomainName/history_weath
def index(request):return redirect(reverse("history_weath:home"))#history_weath这个app的主页
def home(request):hot_citys =    Citys.objects.filter(hot=1)#汉字转换拼音pin = Pinyin()#生成{'A':[], 'B':[], 'C':[] ...}, 为什么用[[] for i in range(26)]而不用[] * 26, 因为前者是浅拷贝,后者是引用citys = dict(zip(list(string.ascii_uppercase),[[] for i in range(26)])) #将城市名转换为拼音,提取首字母,按不同首字母插入到citys中for item in Citys.objects.all():city = item.cityfirst_letter = pin.get_pinyin(city)[:1].upper()citys[first_letter].append(city)#生成2011至今的年份列表years = [year for year in range(2011,timezone.now().year + 1)]context = {'hot_citys':hot_citys,'citys':citys,'years':years} #将这些数据放入到django模板渲染后显示给用户return render(request, 'history_weath/home.html', {'context':context})#url: DomainName/history_weath/temperature_year, 用于ajax(POST)请求citys,years的月平均气温
def temperature_year(request):post_data = json.loads(request.body)citys = post_data.get('citys',[])years = post_data.get('years',[])print("REQUEST:",citys,years)if citys == [] or years == []:raise Http404("Question does not exist")else:result = []for city in citys:for year in years:city_data = {'name':'%s(%s年)'%(city,year), 'data':[]}for month in range(1,13):#分别计算一个月的每天最高温度平均温度,一个月的每天最低温度平均温度month_avg = Lishitianqi.objects.filter(city=city,date__year=year,date__month=month).aggregate(Avg('maxt'),Avg('mint'))#在将一个月的每天最高温度平均温度,一个月的每天最低温度平均温度求平均mavg = round((month_avg['maxt__avg'] + month_avg['mint__avg'])/2) if month_avg['maxt__avg'] is not None else 0city_data['data'].append(mavg)result.append(city_data)#该曲线图参考https://jshare.com.cn/demos/hhhhxu,其中series.data即为ajax请求返回的数据print("RESPONSE:",result)return HttpResponse(json.dumps(result))#url: DomainName/history_weath/temprature_month, 用于ajax(POST)请求citys,years,month日平均气温
def temperature_month(request):post_data = json.loads(request.body)citys = post_data.get('citys',[])years = post_data.get('years',[])month = post_data.get('month',"")print("REQUEST",citys, years, month)if citys == [] or years == [] or month == "":raise Http404("Question does not exist")else:result = []for city in citys:for year in years:month_obj = Lishitianqi.objects.filter(city=city,date__year=year,date__month=month)#在这里,日平均温度 = 日最高温度 + 日最低温度 / 2month_data = [round((day.maxt + day.mint)/2,1) for day in month_obj]city_data = {'name':'%s(%s年%s月)'%(city,year,month), 'data':month_data}result.append(city_data)#该曲线图参考https://jshare.com.cn/demos/hhhhxu,其中series.data即为ajax请求返回的数据print("RESPONSE",result)return HttpResponse(json.dumps(result))#url: DomainName/history_weath/weath_year_ratio, 用于ajax(POST)请求city,year天气比率
def weath_year_ratio(request):post_data = json.loads(request.body)city = post_data.get('city',"")year = post_data.get('year',"")print("REQUEST",city, year)if city == "" or year == "":raise Http404("Question does not exist")else:weath_year = Lishitianqi.objects.filter(city=city,date__year=year)weath_cnts = [d.weath for d in weath_year]weath_mods = set(weath_cnts)data=[]for wd in weath_mods:cnt = weath_cnts.count(wd)data.append({'name':wd,                        #天气类型,如晴,多云,多云转晴,小雨'y': cnt / len(weath_cnts)        #该类天气在一年天气中所占比例
                })#按天气比例排序data.sort(key=lambda x:x['y'])#将比例最高天气从饼中切出data[-1]['sliced'] = 1#选择比例最高的天气data[-1]['selected'] = 1#该饼图参考https://jshare.com.cn/demos/hhhhDX,其中series.data即为ajax请求返回的数据result = {'name':'%s(%s年)'%(city,year), 'data':data}print("RESPONSE",result)return HttpResponse(json.dumps(result))#url: DomainName/history_weath/weath_month_ration, 用于ajax(POST)请求city,year,month天气比率
def weath_month_ratio(request):post_data = json.loads(request.body)city = post_data.get('city',"")year = post_data.get('year',"")month = post_data.get('month',"")print("REQUEST",city, year,month)if city == "" or year == "" or month == "":raise Http404("Question does not exist")else:weath_year = Lishitianqi.objects.filter(city=city,date__year=year,date__month=month)weath_cnts = [d.weath for d in weath_year]weath_mods = set(weath_cnts)data=[]for wd in weath_mods:cnt = weath_cnts.count(wd)data.append({'name':wd,                    #天气类型,如晴,多云,多云转晴,小雨'y': cnt / len(weath_cnts)    #该类天气在一个月天气中所占比例
                })#按天气比例排序data.sort(key=lambda x:x['y'])#将比例最高天气从饼中切出data[-1]['sliced'] = 1#选择比例最高的天气data[-1]['selected'] = 1#该饼图参考https://jshare.com.cn/demos/hhhhDX,其中series.data即为ajax请求返回的数据result = {'name':'%s(%s年%s月)'%(city,year,month), 'data':data}print("RESPONSE",result)return HttpResponse(json.dumps(result))

View Code

模板

创建目录

history_weath\templates\history_weath  

由于是单页应用,只有home.html。

home.html

 load staticfiles <!doctype html>
<html lang="en"><head><script src=" static 'history_weath/jquery.min.js' "></script><link href=" static 'history_weath/bootstrap.min.css' " rel="stylesheet"><link href=" static 'history_weath/home.css' " rel="stylesheet"></head><body><nav class="navbar navbar-dark fixed-top bg-dark flex-md-nowrap p-0 shadow"><a class="navbar-brand col-sm-3 col-md-2 mr-0" href=" url 'history_weath:home' ">全国各城市天气可视化系统</a></nav><div class="container-fluid"><div class="row"><nav class="col-md-2 d-none d-md-block bg-light sidebar"><div class="sidebar-sticky"><ul class="nav flex-column"><li class="nav-item "><a class="active nav-link" id="page1_index" href="#"><span data-feather="bar-chart-2"></span>数据可视化</a></li><li class="nav-item "><a class="nav-link" href="#" id="page2_index"><span data-feather="layers"></span>数据分析</a></li></ul></div></nav><main role="main" class="col-md-9 ml-sm-auto col-lg-10 px-4" style="padding-top:70px"><div id="page1"><div clsss="container"><div class="row"><div class="col-md-3"><div class="input-group mb-3"><div class="input-group-prepend" id="page1_iter"><button type="button" class="btn btn-outline-secondary" id="page2_city">城市</button><button type="button" class="btn btn-outline-secondary dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="sr-only">Toggle Dropdown</span></button><div class="dropdown-menu"> for city in context.hot_citys <button class="dropdown-item page1_add_city">{{ city.city }}</button> endfor  for k,v in context.citys.items <div role="separator" class="dropdown-divider"></div><p class="text-center">{{ k }}</p> for city in v <button class="dropdown-item page1_add_city">{{ city }}</button> endfor  endfor </div></div><input type="text" class="form-control" aria-label="Text input with segmented dropdown button"></div><div class="input-group mb-3"><div class="input-group-prepend" id="page1_iter"><button type="button" class="btn btn-outline-secondary" id="page1_year">年份</button><button type="button" class="btn btn-outline-secondary dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="sr-only">Toggle Dropdown</span></button><div class="dropdown-menu"> for year in context.years <button class="dropdown-item page1_add_year">{{ year }}</button> endfor </div></div></div></div><div class="col-md-1 "><div class="card " style='height: 100%; background-color: #e5e5e5'><span class="badge badge-info page1_city_list" ></span><div id="page1_city_label"></div></div></div><div class="col-md-1"><div class="card " style='height: 100%; background-color: #e5e5e5'><span class="badge badge-success page1_year_list" ></span><div id="page1_year_label"></div></div></div></div></div><button type="button" class="btn btn-secondary btn-lg btn-block" style="margin-top: 10px" id="page1_send">开始</button><div id="page1_year_content" style="display: none"><div id="page1_chart1" style="min-width:400px;height:400px;width: 100%"></div><div class="input-group mb-3"><div class="input-group-prepend" id="page1_iter"><button type="button" class="btn btn-outline-secondary" id="page1_year_weath">气候</button><button type="button" class="btn btn-outline-secondary dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="sr-only">Toggle Dropdown</span></button><div class="dropdown-menu"><button class="dropdown-item page1_add_year_weath">关闭</button><div id="page1_year_weath_label"></div></div></div></div><div id="page1_chart3" style="min-width:400px;height:400px;width: 100%; display: none;"></div><div class="btn-group mr-2" role="group" aria-label="First group" style="width: 100%"><button type="button" class="btn btn-secondary page1_month" month=1>一月</button><button type="button" class="btn btn-secondary page1_month" month=2>二月</button><button type="button" class="btn btn-secondary page1_month" month=3>三月</button><button type="button" class="btn btn-secondary page1_month" month=4>四月</button><button type="button" class="btn btn-secondary page1_month" month=5>五月</button><button type="button" class="btn btn-secondary page1_month" month=6>六月</button><button type="button" class="btn btn-secondary page1_month" month=7>七月</button><button type="button" class="btn btn-secondary page1_month" month=8>八月</button><button type="button" class="btn btn-secondary page1_month" month=9>九月</button><button type="button" class="btn btn-secondary page1_month" month=10>十月</button><button type="button" class="btn btn-secondary page1_month" month=11>十一月</button><button type="button" class="btn btn-secondary page1_month" month=12>十二月</button></div><div id="page1_month_content" style="display: none"><div id="page1_chart2" style="min-width:400px;height:400px;width: 100%"></div><div class="input-group mb-3"><div class="input-group-prepend" id="page1_iter"><button type="button" class="btn btn-outline-secondary" id="page1_month_weath">气候</button><button type="button" class="btn btn-outline-secondary dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="sr-only">Toggle Dropdown</span></button><div class="dropdown-menu"><button class="dropdown-item page1_add_month_weath">关闭</button><div id="page1_month_weath_label"></div></div></div></div><div id="page1_chart4" style="min-width:400px;height:400px;width: 100%; display: none;"></div></div></div></div><div id="page2" style="display: none"></div></main></div>
</div><script src=" static 'history_weath/highcharts.js' "></script><script src=" static 'history_weath/exporting.js' "></script><script src=" static 'history_weath/oldie.js' "></script><script src=" static 'history_weath/highcharts-zh_CN.js' "></script><script src=" static 'history_weath/popper.min.js' "></script><script src=" static 'history_weath/bootstrap.min.js' " ></script><script src=" static 'history_weath/feather.min.js' "></script><script src=" static 'history_weath/home.js' "></script></body>
</html>

View Code

静态文件

创建目录

history_weath\static\history_weath

home.css

body {font-size: .875rem;
}.feather {width: 16px;height: 16px;vertical-align: text-bottom;
}/** Sidebar*/.sidebar {position: fixed;top: 0;bottom: 0;left: 0;z-index: 100; /* Behind the navbar */padding: 48px 0 0; /* Height of navbar */box-shadow: inset -1px 0 0 rgba(0, 0, 0, .1);
}.sidebar-sticky {position: relative;top: 0;height: calc(100vh - 48px);padding-top: .5rem;overflow-x: hidden;overflow-y: auto; /* Scrollable contents if viewport is shorter than content. */
}@supports ((position: -webkit-sticky) or (position: sticky)) {.sidebar-sticky {position: -webkit-sticky;position: sticky;}
}.sidebar .nav-link {font-weight: 500;color: #333;
}.sidebar .nav-link .feather {margin-right: 4px;color: #999;
}.sidebar .nav-link.active {color: #007bff;
}.sidebar .nav-link:hover .feather,
.sidebar .nav-link.active .feather {color: inherit;
}.sidebar-heading {font-size: .75rem;text-transform: uppercase;
}/** Content*/[role="main"] {padding-top: 133px; /* Space for fixed navbar */
}@media (min-width: 768px) {[role="main"] {padding-top: 48px; /* Space for fixed navbar */}
}/** Navbar*/.navbar-brand {padding-top: .75rem;padding-bottom: .75rem;font-size: 1rem;background-color: rgba(0, 0, 0, .25);box-shadow: inset -1px 0 0 rgba(0, 0, 0, .25);
}.navbar .form-control {padding: .75rem 1rem;border-width: 0;border-radius: 0;
}.form-control-dark {color: #fff;background-color: rgba(255, 255, 255, .1);border-color: rgba(255, 255, 255, .1);
}.form-control-dark:focus {border-color: transparent;box-shadow: 0 0 0 3px rgba(255, 255, 255, .25);
}

View Code

home.js

(function () {'use strict'feather.replace()var page1_chart1 = Highcharts.chart('page1_chart1', {chart: {type: 'line'},title: {text: ''},subtitle: {text: ''},xAxis: {categories: ['一月', '二月', '三月', '四月', '五月', '六月', '七月', '八月', '九月', '十月', '十一月', '十二月']},yAxis: {title: {text: '气温 (°C)'}},plotOptions: {line: {dataLabels: {// 开启数据标签enabled: true          },// 关闭鼠标跟踪,对应的提示框、点击事件会失效enableMouseTracking: false}},series: []
});var page1_chart2 = Highcharts.chart('page1_chart2', {chart: {type: 'line'},title: {text: ''},subtitle: {text: ''},xAxis: {categories: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]},yAxis: {title: {text: '气温 (°C)'}},plotOptions: {line: {dataLabels: {// 开启数据标签enabled: true          },// 关闭鼠标跟踪,对应的提示框、点击事件会失效enableMouseTracking: false}},series: []
});var page1_chart3 = Highcharts.chart('page1_chart3', {chart: {plotBackgroundColor: null,plotBorderWidth: null,plotShadow: false,type: 'pie'},title: {text: ''},tooltip: {pointFormat: '{series.name}: <b>{point.percentage:.1f}%</b>'},plotOptions: {pie: {allowPointSelect: true,cursor: 'pointer',dataLabels: {enabled: true,format: '<b>{point.name}</b>: {point.percentage:.1f} %',style: {color: (Highcharts.theme && Highcharts.theme.contrastTextColor) || 'black'}}}},series: []
});var page1_chart4 = Highcharts.chart('page1_chart4', {chart: {plotBackgroundColor: null,plotBorderWidth: null,plotShadow: false,type: 'pie'},title: {text: ''},tooltip: {pointFormat: '{series.name}: <b>{point.percentage:.1f}%</b>'},plotOptions: {pie: {allowPointSelect: true,cursor: 'pointer',dataLabels: {enabled: true,format: '<b>{point.name}</b>: {point.percentage:.1f} %',style: {color: (Highcharts.theme && Highcharts.theme.contrastTextColor) || 'black'}}}},series: []
});$("#page1_index").click(function(){$("#page1").show()$("#page2").hide()$("#page2_index").removeClass("active")$("#page1_index").addClass("active")
});$("#page2_index").click(function(){$("#page2").show()$("#page1").hide()$("#page1_index").removeClass("active")$("#page2_index").addClass("active")
});$(".page1_add_city").click(function(){if($(".page1_city_list").text().indexOf($(this).text()) == -1){var insert = $(".page1_city_list").first().clone().text($(this).text())insert.bind("click", function(){$(this).remove()})$("#page1_city_label").before(insert)}
});$(".page1_add_year").click(function(){if($(".page1_year_list").text().indexOf($(this).text()) == -1){var insert = $(".page1_year_list").first().clone().text($(this).text())insert.bind("click", function(){$(this).remove()})$("#page1_year_label").before(insert)}
});$(".page1_add_year_weath").first().click(function(){$("#page1_chart3").hide()$("#page1_year_weath").text("气候")
});$(".page1_add_month_weath").first().click(function(){$("#page1_chart4").hide()$("#page1_month_weath").text("气候")
});$("#page1_send").click(function(){var city_list = $(".page1_city_list")var citys = []for(var i = 1,p = city_list.first().next(); i < city_list.length; i++, p=p.next())citys[i-1] = p.text()var year_list = $(".page1_year_list")var years = []for(var i = 1,p = year_list.first().next(); i < year_list.length; i++, p=p.next())years[i-1] = p.text()if(citys.length > 0 && years.length > 0){//删除for(var z = $(".page1_add_year_weath").first().next(); z.text() != ""; z = $(".page1_add_year_weath").first().next()){z.remove()}for(var i = 0; i < citys.length; i++){for(var j = 0; j < years.length; j++){var city = citys[i];var year = years[j];var text = city + year + "年各种天气所占比例"//添加var insert = $(".page1_add_year_weath").first().clone().text(text)insert.bind("click", function(){$("#page1_chart3").show()$("#page1_year_weath").text($(this).text())$.ajax({type:"POST",url:"/history_weath/weath_year_ratio",data:JSON.stringify({city:city, year:year}),success:function(result){//删除数据var len = page1_chart3.series.length;for(var i = len - 1; i >= 0; i--){page1_chart3.series[i].remove();}//新增数据var data = JSON.parse(result)page1_chart3.addSeries(data);var title = city + "-" + year +"年各种天气所占比例饼图";page1_chart3.setTitle({text:title,useHTML:true})}});})$("#page1_year_weath_label").before(insert)}}$.ajax({type:"POST",url:"/history_weath/temperature_year",data:JSON.stringify({citys:citys, years:years}),success:function(result){//删除数据var len = page1_chart1.series.length;for(var i = len - 1; i >= 0; i--){page1_chart1.series[i].remove();}//新增数据var data = JSON.parse(result)for(var i = 0; i < data.length; i++){page1_chart1.addSeries(data[i]);}var title = citys.toString() + "-" + years.toString() +"年月平均气温曲线图";page1_chart1.setTitle({text:title,useHTML:true})$("#page1_year_content").show()    }});}
});$(".page1_month").click(function(){var city_list = $(".page1_city_list")var citys = []for(var i = 1,p = city_list.first().next(); i < city_list.length; i++, p=p.next())citys[i-1] = p.text()var year_list = $(".page1_year_list")var years = []for(var i = 1,p = year_list.first().next(); i < year_list.length; i++, p=p.next())years[i-1] = p.text()var month = $(this).attr('month')var month_name = $(this).text()if(citys.length > 0 && years.length > 0){//删除for(var z = $(".page1_add_month_weath").first().next(); z.text() != "";z = $(".page1_add_month_weath").first().next()){z.remove()}for(var i = 0; i < citys.length; i++){for(var j = 0; j < years.length; j++){var city = citys[i];var year = years[j];var text = city + year + "年" + month + "月各种天气所占比例"//添加var insert = $(".page1_add_month_weath").first().clone().text(text)insert.bind("click", function(){$("#page1_chart4").show()$("#page1_month_weath").text($(this).text())$.ajax({type:"POST",url:"/history_weath/weath_month_ratio",data:JSON.stringify({city:city, year:year, month:month}),success:function(result){//删除数据var len = page1_chart4.series.length;for(var i = len - 1; i >= 0; i--){page1_chart4.series[i].remove();}//新增数据var data = JSON.parse(result)page1_chart4.addSeries(data);var title = city + "-" + year +"年" + month + "月" + "各种天气所占比例饼图";page1_chart4.setTitle({text:title,useHTML:true})}});})$("#page1_month_weath_label").before(insert)}}$.ajax({type:"POST",url:"/history_weath/temperature_month",data: JSON.stringify({citys:citys, years:years, month:month}),dataType:'json',success:function(result){//删除数据var len = page1_chart2.series.length;for(var i = len - 1; i >= 0; i--){page1_chart2.series[i].remove();}//新增数据var data = resultfor(var i = 0; i < data.length; i++){page1_chart2.addSeries(data[i]);}var title = citys.toString() + "-" + years.toString() +"年" + month  + "月日平均气温曲线图";page1_chart2.setTitle({text:title,useHTML:true})$("#page1_month_content").show()    }});}});}())

View Code

jquery.min.js

popper.min.js

bootstrap.min.css

bootstrap.min.js

feather.min.js

highcharts.js

highcharts-zh_CN.js

oldie.js

exporting.js

关于django使用mysql报错的问题

错误:django.core.exceptions.ImproperlyConfigured: mysqlclient 1.3.13 or newer is required; you have 0.9.3.

解决办法:修改python库路径下/Lib/site-packages/django/db/backends/mysql的base.py文件,注释掉版本低于1.3.13抛出错误这两行代码

#if version < (1, 3, 13):
#    raise ImproperlyConfigured('mysqlclient 1.3.13 or newer is required; you have %s.' % Database.__version__)

错误:AttributeError: 'str' object has no attribute 'decode'

解决办法:修改python库路径下/Lib/site-packages/django/db/backends/mysql的operations.py文件,判断query为bytes类型时才执行decode

        if query is not None:if isinstance(query,bytes):query = query.decode(errors='replace')return query 

django,uWSGI,nginx部署

前面的部分都是在windows上完成的开发,现在需要部署到个人云服务器

mysql数据库备份与还原

现在要将前面爬虫花了9小时爬取的数据从个人电脑迁移到云服务器。

数据库备份

mysqldump -u root -p lishitianqi > d:\backup.sql

将backup.sql拷贝的云服务器,数据库还原

MariaDB [(none)]> create database lishitianqi;
...
[root@VM_0_12_centos mysql]# mysql -u root -p --default-character-set=utf8 lishitianqi <  backup.sql

django部署清单

检查settings.py

 python manage.py check --deploy
System check identified some issues:WARNINGS:
?: (security.W003) You don't appear to be using Django's built-in cross-site request forgery protection via the middleware ('django.middleware.csrf.CsrfViewMiddleware' is not in your MIDDLEWARE). Enabling the middleware is the safest approach to ensure you don't leave any holes.
?: (security.W004) You have not set a value for the SECURE_HSTS_SECONDS setting. If your entire site is served only over SSL, you may want to consider setting a value and enabling HTTP Strict Transport Security. Be sure to read the documentation first; enabling HSTS carelessly can cause serious, irreversible problems.
?: (security.W006) Your SECURE_CONTENT_TYPE_NOSNIFF setting is not set to True, so your pages will not be served with an 'x-content-type-options: nosniff' header. You should consider enabling this header to prevent the browser from identifying content types incorrectly.
?: (security.W007) Your SECURE_BROWSER_XSS_FILTER setting is not set to True, so your pages will not be served with an 'x-xss-protection: 1; mode=block' header. You should consider enabling this header to activate the browser's XSS filtering and help prevent XSS attacks.
?: (security.W008) Your SECURE_SSL_REDIRECT setting is not set to True. Unless your site should be available over both SSL and non-SSL connections, you may want to either set this setting True or configure a load balancer or reverse-proxy server to redirect all connections to HTTPS.
?: (security.W012) SESSION_COOKIE_SECURE is not set to True. Using a secure-only session cookie makes it more difficult for network traffic sniffers to hijack user sessions.
?: (security.W018) You should not have DEBUG set to True in deployment.
?: (security.W019) You have 'django.middleware.clickjacking.XFrameOptionsMiddleware' in your MIDDLEWARE, but X_FRAME_OPTIONS is not set to 'DENY'. The default is 'SAMEORIGIN', but unless there is a good reason for your site to serve other parts of itself in a frame, you should change it to 'DENY'.System check identified 8 issues (0 silenced).

按照上述描述的问题一一解决

python虚拟环境

pip install virtualenv
pip install virtualenvwrapper#创建目录用来存放虚拟环境
mkdir $HOME/.virtualenvs#在~/.bashrc中添加行
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/python3/bin/virtualenv
source /usr/local/python3/bin/virtualenvwrapper.sh#让配置文件生效
source ~/.bashrc#现在mkvirtualenv, workon, deactive,rmvirtualenv命令可以用了#创建虚拟环境
mkvirtualenv -p python3 django_env_py3#进入虚拟环境
workon django_env_py3#退出虚拟环境
deactivate

#删除虚拟环境revirtualenv django_env_py3

工程迁移

进入django虚拟环境

workon django_env_py3

将此工程拷贝到云服务器,安装依赖库,修改连接数据库参数:主机(localhost需要改为127.0.0.1),端口号,用户名,密码,让工程能正常运行。关于连接mysql Error问题上文提到过。

spider/lishitianqi/pipelines.py

                #连接数据库self.connect = pymysql.connect(host='127.0.0.1',port=8809,db='lishitianqi',user='root',passwd='123',charset='utf8',use_unicode=True)

webapps/webapps/settings.py

DATABASES = {'default': {'ENGINE': 'django.db.backends.mysql','NAME': 'lishitianqi','USER': 'root','PASSWORD':'123','HOST':'127.0.0.1','PORT':'8809',}
}

生成requirements.txt

pip freeze > requirements.txt

asn1crypto==0.24.0
attrs==19.1.0
Automat==0.7.0
cffi==1.12.3
constantly==15.1.0
cryptography==2.7
cssselect==1.1.0
Django==2.2.5
hyperlink==19.0.0
idna==2.8
incremental==17.5.0
lxml==4.4.1
parsel==1.5.2
pyasn1==0.4.7
pyasn1-modules==0.2.6
pycparser==2.19
PyDispatcher==2.0.5
PyHamcrest==1.9.0
PyMySQL==0.9.3
pyOpenSSL==19.0.0
pytz==2019.2
queuelib==1.5.0
Scrapy==1.7.3
service-identity==18.1.0
six==1.12.0
sqlparse==0.3.0
Twisted==19.7.0
uWSGI==2.0.18
w3lib==1.21.0
xpinyin==0.5.6
zope.interface==4.6.0

通过requirements.txt安装python库

pip install -r requirements.txt 

uWSIG服务器部署

安装uwsgi

pip install uwsgi

在django项目的setting.py文件的同级目录下,增加一个配置文件uwsgi.ini

[uwsgi]
#使用nginx连接时使用
socket=127.0.0.1:9090
#直接做web服务器使用
#http=127.0.0.1:9090
#项目目录
chdir=/root/project/fullstack_history_weath/webapps
#项目中wsgi.py文件的目录,相对于项目目录
wsgi-file=webapps/wsgi.py
# 指定启动的工作进程数
processes=4
# 指定工作进程中的线程数
threads=2
master=True
# 保存启动之后主进程的pid
pidfile=uwsgi.pid
# 设置uwsgi后台运行,用uwsgi.log保存日志信息
daemonize=uwsgi.log
# 设置虚拟环境的路径
virtualenv=/root/.virtualenvs/django_env_py3

uwsgi相关命令

#uWSGI 通过 xxx.ini 启动后会在相同目录下生成一个 xxx.pid 的文件,里面只有一行内容是 uWSGI 的主进程的进程号。#uWSGI 启动:
uwsgi --ini xxx.ini#uwsgi 重启:
uwsgi --reload xxx.pid#uwsgi 停止:
uwsgi --stop xxx.pid

#强制关闭pkill uwsgi

nginx服务器部署

安装nginx

wget http://nginx.org/download/nginx-1.8.0.tar.gz
tar -xvf nginx-1.8.0.tar.gz
cd nginx-1.8.0
./configure
makemake install  ln -s /usr/local/nginx/sbin/nginx /usr/bin/nginx

运行nginx

/usr/local/nginx/sbin/nginx

打开浏览器访问此机器的 IP,如果浏览器出现 Welcome to nginx! 则表示 Nginx 已经安装并运行成功。

nginx相关命令:

#重启:
nginx –s reload#停止:
nginx –s stop#测试配置文件是否正常:
nginx –t#强制关闭:
pkill nginx

配置nginx

        location / {include  uwsgi_params;uwsgi_pass  127.0.0.1:9090;}location /static/ {autoindex on;alias  /root/project/fullstack_history_weath/webapps/static/;}

  

部署出现的问题 

问题:uwsgi日志报错“ModuleNotFoundError: No module named 'encodings'”

解决办法:uwsgi配置文件中虚拟环境路径设置错误,可以通过

In [1]: import sysIn [2]: sys.path
Out[2]:
['/root/.virtualenvs/django_env_py3/lib/python3.6/site-packages','/usr/local/python3/bin','/usr/local/python3/lib/python36.zip','/usr/local/python3/lib/python3.6','/usr/local/python3/lib/python3.6/lib-dynload','','/usr/local/python3/lib/python3.6/site-packages','/usr/local/python3/lib/python3.6/site-packages/IPython/extensions','/root/.ipython']

因为我的虚拟环境名为django_env_py3,所以

# 设置虚拟环境的路径
virtualenv=/root/.virtualenvs/django_env_py3

问题:uwsgi日志报错.“--- no python application found, check your startup logs for errors ---”

问题:uwsgi日志报错“chdir(): No such file or directory [core/uwsgi.c line 2623]”

解决办法: 上面两个问题第一个是没有设置chdir,第二个是chdir路径设置错误导致,正确写法如下

#django项目根目录
chdir=/root/project/fullstack_history_weath/webapps

  

问题:nginx能加载html,加载静态文件显示 404(Not Found)

可能的原因:

1. alias  /root/project/fullstack_history_weath/webapps/static末尾没加/会导致

2. python setting.py中的STATIC_ROOT和nginx配置文件中静态文件规则location /static/上的路径不一致

3. python没有执行命令python manage.py collectstatic收集所有静态文件到STATIC_ROOT路径下

问题:nginx能加载html,加载静态文件显示403(Forbidden)

解决方法:静态文件以及他们的所有父目录权限不足,例如有静态文件路径/root/project/fullstack_history_weath/webapps/static/,若目录project的权限为644,

static中所有的文件权限为644,该路径中的其余文件夹(root, fullstack_history_weath,webapps)权限为655,则需要做如下修改

chmod 655 /root/project
chmod -R 655 /root/project/fullstack_history_weath/webapps/static

项目代码获取

至此,整个项目已经全部完成,完整代码获取见下面连接。

nginx文件下载:http://pengliangyuan.club/files/Chinese-historical-weather-visualization-system/

github连接:https://github.com/pengliangyuan/Chinese-historical-weather-visualization-system

番外篇-nginx目录美化

  1. 给nginx添加fanyindex插件,下载源码ngx-fancyindex
  2. 将源码解压后放在nginx源码目录下
  3. 在编译选项中添加这个插件./configure --add-module=ngx-fancyindex-0.4.3(插件解压后的目录名)
  4. makemake installln -s /usr/local/nginx/sbin/nginx /usr/bin/nginx
  5. 在目录/usr/local/nginx/html下,下载fancyindex主题git clone https://github.com/TheInsomniac/Nginx-Fancyindex-Theme.git
  6. 修改/usr/local/nginx/conf/nginx.conf,配置如下,nginx -s reload
        location /files {include /usr/local/nginx/html/Nginx-Fancyindex-Theme/fancyindex.conf; alias  /root/files/;charset utf-8,gbk; }location /Nginx-Fancyindex-Theme/ {alias /usr/local/nginx/html/Nginx-Fancyindex-Theme/;}

  

转载于:https://www.cnblogs.com/plyonfly/p/11493115.html

python全栈(中国历史天气可视化系统)相关推荐

  1. 如何快速成长python全栈工程师

    人工智能时代,想要从事编程行业,最佳的学习语言是什么?当然是Python!Python入门简单.功能强大,已成为各大企业首选开发语言,也吸引了无数有志学子投身学习.下面是小编为您整理的关于python ...

  2. 2017最新整理python全栈工程师系统培训之路精品课程(全套)

    百度搜索"一起自学吧"做大数据专家,不做第一,只做唯一. 课程介绍: 本课程由一起自学吧论坛打造,目的是给想成为Python全栈工程师的同学提供一套完整,全面而系统的培训课程,课程 ...

  3. 视频教程-Python全栈9期(第八部分):CRM开发之curd组件和权限系统-Python

    Python全栈9期(第八部分):CRM开发之curd组件和权限系统 TriAquae开源运维软件创始人,混迹IT运维领域多年,曾就职于松下.国政通.飞信.中金.NOKIA等公司,维护过少至几十台,多 ...

  4. Python全栈9期(第八部分):CRM开发之curd组件和权限系统-李杰-专题视频课程

    Python全栈9期(第八部分):CRM开发之curd组件和权限系统-45人已学习 课程介绍         学到的不仅仅只是Python,还有培养编程思想! 2018年全套视频<Python全 ...

  5. PythonStock(37)股票系统:Python股票系统发布V2.0版本,改个名字吧,叫Python全栈股票系统2.0,可以实现数据的抓取(akshare),统计分析,数据报表展示。

    目录 前言 1,关于Python全栈股票系统V2.0 2,在CSDN上居然有人给代码打包收费下载!! 2,更新docker镜像 3,总结 前言 使用Python开发一个web股票项目. [github ...

  6. python 全栈开发,Day112(内容回顾,单例模式,路由系统,stark组件)

    python 全栈开发,Day112(内容回顾,单例模式,路由系统,stark组件) 一.内容回顾 类可否作为字典的key 初级 举例: class Foo(object):pass_registry ...

  7. PythonStock(38)全栈股票系统:Python全栈股票系统,使用vue进行前端界面开发,修改接口可以展示股票数据,持续开发中。

    目录 前言 0,关于Python全栈股票系统2.0,github,5.8K星星 1,Python全栈股票系统,更改使用elementUI开发 2,使用vue-element-admin,进行开发 3, ...

  8. 视频教程-Python全栈9期(第十部分):CRM系统-Python

    Python全栈9期(第十部分):CRM系统 TriAquae开源运维软件创始人,混迹IT运维领域多年,曾就职于松下.国政通.飞信.中金.NOKIA等公司,维护过少至几十台,多至数万台设备的IT系统, ...

  9. Python全栈9期(第十部分):CRM系统-李杰-专题视频课程

    Python全栈9期(第十部分):CRM系统-143人已学习 课程介绍         学到的不仅仅只是Python,还有培养编程思想! 2018年全套视频<Python全栈9期>.内容涵 ...

最新文章

  1. [简单题]Counting Duplicates( Python 实现)
  2. Python之递归函数
  3. NGINX 进程通信机制
  4. Agile in a Flash:重新定义纪律
  5. Docker小白到实战之Docker Compose在手,一键足矣
  6. 服气!蹭热点无敌,麦当劳发布“5G”新品:真香
  7. 程序员除了转行做管理,还可以做什么?
  8. lxc设置网络为公网IP
  9. 计算机二级C语言程序设计 第一章 程序设计基本概念
  10. Yann LeCun、Yoshua Bengio和Geoffrey Hinton的联合综述《Deep Learning》翻译
  11. 2020年开发人员的机器学习基础
  12. python调用R语言
  13. idea 断点debug没有勾勾,导致调试无效
  14. linux tty字体,ArchLinux TTY 中文字体渲染
  15. C语言函数-形参和实参
  16. 座舱人机交互「暗潮汹涌」,语音「下」,多模态「上」
  17. java四则运算思路_java四则运算
  18. pytorch中nn.Embedding和nn.LSTM和nn.Linear
  19. OpenCV 学习笔记03 凸包convexHull、道格拉斯-普克算法Douglas-Peucker algorithm、approxPloyDP 函数...
  20. Contest3412 - 2022中石油大中小学生联合训练第七场

热门文章

  1. python添加高斯噪声_使用Python-OpenCV向图片添加噪声(高斯噪声、椒盐噪声)
  2. 如何让你的代码变得更优雅?这些代码规范和技巧必须知道(进阶必备,建议收藏)
  3. Centos下jira、confluence利用nginx 代理配置为统一入口
  4. NeuralProphet之六:多元时间序列预测
  5. 3.写出一个程序,接受一个十六进制的数值字符串,输出该数值的十进制字符串。(多组同时输入 )
  6. TreeSet的比较器简单的分析和使用(TreeMap同样适用)
  7. macOS和Linux下source和export命令
  8. 水滴IPO,揭秘商业模式背后的故事
  9. 关于三星设备 Activity.onDestroy() 被调用。显示“开发者选项”
  10. 安卓原生应用开发!一起刷完了这份1307页的安卓面试宝典吧,值得收藏!