Python crawler: scraping historical Double Color Ball (双色球) winning numbers and writing them to a database

import re
from datetime import datetime

import pymysql
import requests
from bs4 import BeautifulSoup
# Source URL from an earlier experiment (unused)
# url = 'http://repo1.maven.org/maven2/HTTPClient/'
# Local output directory (not used by this script)
pathFinal = "C:\\Users\\gao\\Desktop\\mysqlConnectionDown"
# Alternative query with limit=100 (not used below)
url = "https://datachart.500.com/ssq/history/newinc/history.php?limit=100&sort=0"
# Query actually fetched: start=1001 through end=19019
url2 = "http://datachart.500.com/ssq/history/newinc/history.php?start=1001&end=19019"
response = requests.get(url2, timeout=500)
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.text, 'html.parser')
# The draw rows sit inside the element with id="tdata"
tdata = soup.find(id='tdata')
trs = tdata.find_all('tr')
# Connection settings for a local test database
conn = pymysql.connect(host='localhost', user='root', password='123456', database='aaa', charset='utf8')
cursor = conn.cursor()
# Compile the patterns once, outside the loop
idre = re.compile('<tr class="t_tr1"><!--<td>2</td>--><td>(.*?)</td>')
hongre = re.compile('<td class="t_cfont2">(.*?)</td>')
lanre = re.compile('<td class="t_cfont4">(.*?)</td>')
riqire = re.compile('</td><td>([^<]*?)</td></tr>')
# Parameterized statement: the driver quotes and escapes each value
sql = ("INSERT INTO caipiao(qi,hong1,hong2,hong3,hong4,hong5,hong6,lan,riqi) "
       "VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)")
for tr in trs:
    row = str(tr)
    ids = idre.findall(row)       # issue number
    hongs = hongre.findall(row)   # six red balls
    lan = lanre.findall(row)      # blue ball
    riqi = riqire.findall(row)    # draw date, last cell of the row
    # Skip separator rows such as <tr class="tdbck"><td colspan="51"></td></tr>
    if not ids or len(hongs) < 6 or not lan or not riqi:
        continue
    qi = ids[0]  # renamed from id to avoid shadowing the built-in
    draw_date = datetime.strptime(riqi[0], "%Y-%m-%d")
    print(qi, hongs[:6], lan[0], riqi[0])
    print('-----------')
    cursor.execute(sql, (qi, *hongs[:6], lan[0], draw_date))
# Commit the transaction once, after all rows have been inserted
conn.commit()
cursor.close()
conn.close()
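
The row-parsing step can be checked on its own, without network or database access. The following is a minimal sketch, assuming a table row shaped like the markup that the script's regular expressions target. The helper name parse_row and the sample row (including its numbers and date) are hypothetical and invented for illustration; only the four patterns come from the script.

```python
import re
from datetime import datetime

# Patterns copied from the script
ID_RE = re.compile('<tr class="t_tr1"><!--<td>2</td>--><td>(.*?)</td>')
HONG_RE = re.compile('<td class="t_cfont2">(.*?)</td>')
LAN_RE = re.compile('<td class="t_cfont4">(.*?)</td>')
RIQI_RE = re.compile('</td><td>([^<]*?)</td></tr>')


def parse_row(row_html):
    """Return (qi, hong1..hong6, lan, riqi), or None if this is not a draw row."""
    ids = ID_RE.findall(row_html)
    hongs = HONG_RE.findall(row_html)
    lan = LAN_RE.findall(row_html)
    riqi = RIQI_RE.findall(row_html)
    if not ids or len(hongs) < 6 or not lan or not riqi:
        return None
    draw_date = datetime.strptime(riqi[0], "%Y-%m-%d")
    return (ids[0], *hongs[:6], lan[0], draw_date)


# Hypothetical sample row; the real page may carry more cells
SAMPLE_ROW = (
    '<tr class="t_tr1"><!--<td>2</td>--><td>19019</td>'
    '<td class="t_cfont2">01</td><td class="t_cfont2">05</td>'
    '<td class="t_cfont2">12</td><td class="t_cfont2">18</td>'
    '<td class="t_cfont2">23</td><td class="t_cfont2">30</td>'
    '<td class="t_cfont4">07</td>'
    '<td>123</td><td>2019-02-19</td></tr>'
)
# Separator row quoted in the script's commented-out check
SEPARATOR_ROW = '<tr class="tdbck"><td colspan="51"></td></tr>'

parsed = parse_row(SAMPLE_ROW)
skipped = parse_row(SEPARATOR_ROW)
```

The tuple returned by parse_row has the same order as the columns in the INSERT statement, so it could be passed directly as the second argument of cursor.execute.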
