Python crawler: using the Baidu Map API to fetch data and save it to a MySQL database

First, I have a txt file listing the relevant cities and the number of parks in each:

分析-02.png
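A minimal sketch of reading such a file, assuming each line holds a city name and a park count separated by a tab (the layout implied by the `split('\t')` call in the code further down). The function name `parse_cities` and the sample rows are hypothetical and only serve as illustration.

```python
def parse_cities(text):
    """Return a list of (city, park_count) pairs from tab-separated text."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split('\t')
        city = parts[0]
        # The count column is treated as optional in this sketch:
        # None when it is missing or not a plain number.
        has_count = len(parts) > 1 and parts[1].strip().isdigit()
        count = int(parts[1]) if has_count else None
        pairs.append((city, count))
    return pairs


# Hypothetical sample content, not the author's real data.
sample = "北京\t10\n\n上海\t8\n广州\n"
print(parse_cities(sample))
```

Only the city name is needed for the API requests, so the count can be ignored or used later to sanity-check how many records came back.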

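Next, we use the interfaces provided by the Baidu Map API to fetch information about the parks in each city.

Two API endpoints are used:

1. http://api.map.baidu.com/place/v2/search?q=公园&region=北京&output=json&ak=<your access key>

2. http://api.map.baidu.com/place/v2/detail?uid=xxxxx&output=json&scope=2&ak=<your access key>

The first endpoint returns general information about the parks in a city.

The second endpoint returns detailed information about a single park.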

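Parameters:

q: the search keyword

region: the region to search (city level or above)

page_size: the number of records per page

page_num: the page index

output: the output format, json or xml

ak: the user's access key, which can be requested on the Baidu Map API platform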
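As a rough illustration of how these parameters combine into a request URL, here is a sketch using only the standard library. The helper name `build_search_url` and the placeholder key `YOUR_AK` are assumptions for this example; the endpoint and parameter names are the ones listed above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'http://api.map.baidu.com/place/v2/search'


def build_search_url(region, page_num=0, page_size=20, ak='YOUR_AK'):
    """Build the search URL for parks in one region (hypothetical helper)."""
    params = {
        'q': '公园',
        'region': region,
        'page_size': page_size,
        'page_num': page_num,
        'output': 'json',
        'ak': ak,  # placeholder, replace with your own access key
    }
    # urlencode percent-encodes the Chinese characters as UTF-8
    return BASE_URL + '?' + urlencode(params)


url = build_search_url('北京', page_num=1)
print(url)
# Decoding the query string gives the original values back
print(parse_qs(urlsplit(url).query)['region'])
```

The code later in this article passes the same dictionary to `requests.get(..., params=params)`, which performs the equivalent encoding.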

I. Fetching data with the first API and storing it in MySQL

Below is the result returned by a request to the first API endpoint:

分析-03.png

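Since all of the final results go into a MySQL database, for convenience I used the graphical management tool MySQL-Front to create a database named baidumap with two tables in it: table 1, city, stores the results of the first API, and table 2, park, stores the results of the second API. The structure of table 1 is as follows: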

分析-04.png
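The table structure is only visible in the screenshot, so the following is a sketch under assumptions rather than the author's actual definition. The column names are taken from the INSERT statement used in this article, and the `id` column from the `WHERE id>0` query in part two; the column types are guesses. SQLite (via the standard `sqlite3` module) stands in for MySQL here so the example runs without a database server.

```python
import sqlite3
from datetime import datetime

# Assumed schema: names follow the article's INSERT statement, types are guesses.
DDL = """
CREATE TABLE city (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    city TEXT,
    park TEXT,
    location_lat REAL,
    location_lng REAL,
    address TEXT,
    street_id TEXT,
    uid TEXT,
    time TEXT
)
"""

conn = sqlite3.connect(':memory:')
conn.execute(DDL)
# Note: sqlite3 uses ? placeholders, whereas MySQLdb uses %s.
conn.execute(
    "INSERT INTO city (city, park, location_lat, location_lng, "
    "address, street_id, uid, time) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ('北京', 'example park', 39.9, 116.4, None, None, 'example-uid',
     datetime.now().isoformat()),
)
conn.commit()
rows = conn.execute("SELECT id, city, uid FROM city").fetchall()
print(rows)
conn.close()
```

In MySQL the equivalent table would typically use `INT AUTO_INCREMENT` for `id`, `VARCHAR` for the text columns and `DATETIME` for `time`, but those choices are likewise assumptions.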

Next comes the code that requests the data and stores the results in the city table:

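import requests
import json
import MySQLdb
from datetime import datetime

# Read the cities from the txt file into a list
city_list = []
with open('cities.txt', 'r', encoding='utf-8') as f:
    for eachline in f:
        # Skip blank lines; the city name is the first tab-separated field
        if eachline.strip():
            city = eachline.split('\t')[0].strip()
            city_list.append(city)
# No explicit f.close() is needed: the with block closes the file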

# getjson requests one page of search results and parses the returned JSON
def getjson(place, page_num=0):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
    }
    url = 'http://api.map.baidu.com/place/v2/search'
    params = {
        'q': '公园',
        'region': place,
        'scope': '2',
        'page_size': '20',
        'page_num': page_num,
        'output': 'json',
        'ak': 'XM53LMurtNQaAPFuKVy1WzSyZCNmNA9H',  # replace with your own access key
    }
    response = requests.get(url=url, params=params, headers=headers)
    html = response.text
    decodejson = json.loads(html)
    return decodejson

# Connect to the database, get a cursor, then fetch the data and insert it.
# Reading fields with get() is the safer choice: a missing field yields None
# instead of raising an exception and stopping the program.
conn = MySQLdb.connect(host='localhost', user='root', password='root', db='baidumap', charset='utf8')
cur = conn.cursor()
for city in city_list:
    not_last_page = True
    page_num = 0
    while not_last_page:
        decodejson = getjson(city, page_num)
        print(city, page_num)
        if decodejson.get('results'):
            for result in decodejson.get('results'):
                park = result.get('name')
                # location may be absent, so fall back to an empty dict
                location = result.get('location') or {}
                lat = location.get('lat')
                lng = location.get('lng')
                address = result.get('address')
                street_id = result.get('street_id')
                uid = result.get('uid')
                sql = """INSERT INTO baidumap.city
                (city,park,location_lat,location_lng,address,street_id,uid,time)
                VALUES (%s,%s,%s,%s,%s,%s,%s,%s);"""
                cur.execute(sql, (city, park, lat, lng, address, street_id, uid, datetime.now()))
                conn.commit()
            page_num = page_num + 1
        else:
            # An empty page means there is nothing more for this city
            not_last_page = False
cur.close()
conn.close()

The data exported from MySQL:

分析-05.png
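The paging logic used in this part (request page 0, 1, 2, ... until a page comes back without results) can be isolated from the HTTP and database code. The sketch below does that with a generator; `iter_results`, `to_row`, the `max_pages` safety cap and the fake pages are all hypothetical names and data introduced for illustration, not part of the original script.

```python
def iter_results(fetch_page, max_pages=100):
    """Yield every record from successive pages until an empty page is seen."""
    for page_num in range(max_pages):
        payload = fetch_page(page_num) or {}
        results = payload.get('results')
        if not results:
            break  # empty or missing 'results' marks the last page
        for result in results:
            yield result


def to_row(result):
    """Pick out a few fields; get() keeps missing keys from raising."""
    location = result.get('location') or {}
    return (result.get('name'), location.get('lat'),
            location.get('lng'), result.get('uid'))


# Stand-in for the real HTTP call: three made-up pages, the last one empty.
fake_pages = [
    {'results': [{'name': 'A', 'location': {'lat': 1.0, 'lng': 2.0}, 'uid': 'u1'}]},
    {'results': [{'name': 'B', 'uid': 'u2'}]},  # record without a location
    {'results': []},
]


def fake_fetch(page_num):
    return fake_pages[page_num] if page_num < len(fake_pages) else {}


rows = [to_row(r) for r in iter_results(fake_fetch)]
print(rows)
```

With the real API, `fetch_page` could be something like `lambda n: getjson(city, n)`. The cap on pages is a defensive choice so that an unexpected response cannot cause an endless loop.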

II. Fetching data with the second API

The second API endpoint: http://api.map.baidu.com/place/v2/detail?uid=xxxxx&output=json&scope=2&ak=<your access key>

It takes a uid parameter, which we read from the city table saved earlier. A request to this API returns:

分析-06.png
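Before reading fields out of such a response, it can help to check that the request actually succeeded. The sketch below assumes the response carries a top-level `status` field where 0 means success, with the record under `result`; compare this against your own response before relying on it. The helper name `extract_result` and the sample payloads are hypothetical.

```python
def extract_result(payload):
    """Return the 'result' dict of a successful detail response, else None."""
    if not isinstance(payload, dict):
        return None
    if payload.get('status') != 0:
        return None  # assumed convention: non-zero status signals an error
    result = payload.get('result')
    # Treat an empty or non-dict result as "nothing found"
    return result if isinstance(result, dict) and result else None


# Made-up payloads for illustration
ok = {'status': 0, 'result': {'name': 'example park'}}
failed = {'status': 2, 'message': 'example error'}
print(extract_result(ok))
print(extract_result(failed))
```

Returning None for every failure mode lets the calling loop use a single `if data:` check.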

The park table is created with the following structure:

分析-07.png

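First read the uid values from the city table, then request the second API endpoint with each one and store the returned data in the park table. The code is as follows: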

from datetime import datetime
import requests
import json
import MySQLdb

# Read the uid values from the city table
conn = MySQLdb.connect(host='localhost', user='root', password='root', db='baidumap', charset='utf8')
cur = conn.cursor()
sql = "SELECT uid FROM baidumap.city WHERE id>0;"
cur.execute(sql)
# A SELECT needs no commit; fetchall() returns a sequence of 1-tuples: (uid,)
uids = cur.fetchall()

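# getjson requests the detail record for one uid and parses the returned JSON
def getjson(uid):
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
        }
        params = {
            'uid': uid,
            'scope': '2',
            'output': 'json',
            'ak': 'XM53LMurtNQaAPFuKVy1WzSyZCNmNA9H',  # replace with your own access key
        }
        url = 'http://api.map.baidu.com/place/v2/detail'
        response = requests.get(url=url, headers=headers, params=params)
        html = response.text
        decodejson = json.loads(html)
        return decodejson
    except (requests.RequestException, ValueError):
        # Network failure or invalid JSON: return an empty dict so that
        # callers can still use get() on the return value
        return {}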

# Fetch the data and store it
for uid in uids:
    uid = uid[0]
    print(uid)
    decodejson = getjson(uid) or {}
    data = decodejson.get('result')
    if data:
        # location and detail_info may be missing, so fall back to empty dicts
        location = data.get('location') or {}
        info = data.get('detail_info') or {}
        park = data.get('name')
        location_lat = location.get('lat')
        location_lng = location.get('lng')
        address = data.get('address')
        street_id = data.get('street_id')
        telephone = data.get('telephone')
        detail = data.get('detail')
        uid = data.get('uid')
        tag = info.get('tag')
        detail_url = info.get('detail_url')
        type_ = info.get('type')  # trailing underscore avoids shadowing the built-in type
        overall_rating = info.get('overall_rating')
        image_num = info.get('image_num')
        comment_num = info.get('comment_num')
        shop_hours = info.get('shop_hours')
        alias = info.get('alias')
        scope_type = info.get('scope_type')
        scope_grade = info.get('scope_grade')
        description = info.get('description')
        sql = """INSERT INTO baidumap.park(park,location_lat,location_lng,address,street_id,telephone,
        detail,uid,tag,detail_url,type,overall_rating,image_num,comment_num,shop_hours,alias,scope_type,scope_grade,
        description,time) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);"""
        cur.execute(sql, (park, location_lat, location_lng, address, street_id, telephone,
                          detail, uid, tag, detail_url, type_, overall_rating, image_num, comment_num,
                          shop_hours, alias, scope_type, scope_grade, description, datetime.now()))
        conn.commit()
cur.close()
conn.close()

The data exported from MySQL:

分析-08.png
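The long run of `get()` calls in the detail script can also be written as a small function that flattens one record into a dictionary keyed by column name. This is a sketch under the assumption that the nested fields live under `location` and `detail_info`, as in the script above; the names `DETAIL_KEYS` and `flatten_detail` and the sample record are hypothetical.

```python
# Fields read from the nested detail_info object in the script above
DETAIL_KEYS = (
    'tag', 'detail_url', 'type', 'overall_rating', 'image_num', 'comment_num',
    'shop_hours', 'alias', 'scope_type', 'scope_grade', 'description',
)


def flatten_detail(data):
    """Flatten one detail record into a dict keyed by park-table column name."""
    location = data.get('location') or {}
    info = data.get('detail_info') or {}
    row = {
        'park': data.get('name'),
        'location_lat': location.get('lat'),
        'location_lng': location.get('lng'),
        'address': data.get('address'),
        'street_id': data.get('street_id'),
        'telephone': data.get('telephone'),
        'detail': data.get('detail'),
        'uid': data.get('uid'),
    }
    for key in DETAIL_KEYS:
        row[key] = info.get(key)  # None when the field is absent
    return row


# Made-up record with neither location nor detail_info
sparse = {'name': 'example park', 'uid': 'example-uid'}
row = flatten_detail(sparse)
print(len(row), row['park'], row['tag'])
```

Because every lookup goes through `get()` on a dictionary that is guaranteed to exist, a record that lacks `detail_info` still produces a complete row with None in the missing columns instead of raising an AttributeError.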
