{

"cells": [

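{

"cell_type": "markdown",

"metadata": {},

"source": [

"In this tutorial you will learn how to fetch administrative district boundaries with the Amap (高德地图) API.\n",

"\n",

"Base data provided:\n",

"\n",

"None. All of the data is fetched from scratch."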

]

},

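{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Observing the network requests"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"We fetch the data from Amap. First, observe which network requests the Amap website makes when you search for one of Shenzhen's administrative districts.\n",

" \n",

"In Chrome, right-click and choose Inspect, or open Developer Tools from the settings menu, then select the Network tab to see the requests (other browsers have a similar feature; you may need to look for it).\n",

"\n",

"This tutorial uses the scraping approach referred to as 'crawler 2.0'.\n",

"\n",

"Each network request has:\n",

"\n",

" Response Headers\n",

" Request Headers\n",

" Query String Parameters\n",

" \n",

"Of these, the request headers and the query parameters are the ones we care about."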

]

},

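{

"cell_type": "markdown",

"metadata": {},

"source": [

"# The JSON data format"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"In these requests the server returns its data to us as JSON. So what is JSON?"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"First, some background. Review Python's\n",

"\n",

"list\n",

"\n",

"dict\n",

"\n",

"tuple\n",

"\n",

"Nesting dicts and lists freely, with strings, numbers, booleans and null as the leaf values, gives you the structure of JSON. JSON itself has no tuple type: json.loads returns lists, and json.dumps writes a tuple as an array.\n",

"\n",

"JSON examples"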

]

},

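{

"cell_type": "markdown",

"metadata": {},

"source": [

"Scraping the Amap website directly is difficult because it has anti-scraping measures.\n",

"\n",

"However, Amap provides a dedicated data API for developers:\n",

"\n",

"Amap Open Platform (高德地图开放平台)\n",

"\n",

"You need to register as an Amap developer and apply for a key."

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"On the platform, Amap provides a developer-facing administrative district query service together with its documentation: District Query (行政区查询)\n",

"\n",

"Pick a district to query there, then look at the network requests it makes."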

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Fetching the administrative district"

]

},

{

"cell_type": "code",

"execution_count": 1,

"metadata": {

"ExecuteTime": {

"end_time": "2020-01-19T03:44:12.354768Z",

"start_time": "2020-01-19T03:44:11.803160Z"

}

},

"outputs": [],

"source": [

"# Import the packages needed to send the HTTP request\n",

"import urllib\n",

"from urllib import parse\n",

"from urllib import request\n",

"\n",

"import pandas as pd\n",

"# Import json to parse the JSON response later\n",

"import json"

]

},

{

"cell_type": "code",

"execution_count": 2,

"metadata": {

"ExecuteTime": {

"end_time": "2020-01-19T03:44:13.091717Z",

"start_time": "2020-01-19T03:44:13.086729Z"

}

},

"outputs": [],

"source": [

"mykey = 'enter your key here'\n",

"# Your developer key. It tells Amap who is making the request, and each key has a daily quota. Register as a developer to get your own key."

]

},

{

"cell_type": "code",

"execution_count": 3,

"metadata": {

"ExecuteTime": {

"end_time": "2020-01-19T03:44:16.032917Z",

"start_time": "2020-01-19T03:44:14.094053Z"

}

},

"outputs": [],

"source": [

"keywords = '罗湖区'\n",

"\n",

"# Endpoint of the district query API\n",

"url = 'https://restapi.amap.com/v3/config/district?'\n",

"\n",

"# Query parameters\n",

"dict1 = {\n",

"    'subdistrict': '3',\n",

"    'showbiz': 'false',\n",

"    'extensions': 'all',\n",

"    'key': mykey,  # your developer key, defined in the previous cell\n",

"    's': 'rsv3',\n",

"    'output': 'json',\n",

"    'level': 'district',\n",

"    'keywords': keywords,\n",

"    'platform': 'JS',\n",

"    'logversion': '2.0',\n",

"    'sdkversion': '1.4.10'\n",

"}\n",

"\n",

"# Encode the parameters and append them to the endpoint to form the full URL\n",

"url_data = parse.urlencode(dict1)\n",

"url = url + url_data\n",

"\n",

"# Build the request object (named req so it does not shadow the imported request module)\n",

"req = urllib.request.Request(url)\n",

"\n",

"# Send the request\n",

"response = urllib.request.urlopen(req)\n",

"\n",

"# Read the response body\n",

"webpage = response.read()\n",

"\n",

"# Decode the body and parse it as JSON\n",

"result = json.loads(webpage.decode('utf8', 'ignore'))"

]

},

{

"cell_type": "code",

"execution_count": 4,

"metadata": {

"ExecuteTime": {

"end_time": "2020-01-19T03:44:17.628606Z",

"start_time": "2020-01-19T03:44:17.611648Z"

},

"scrolled": false

},

"outputs": [

{

"data": {

"text/plain": [

"{'status': '1',\n",

" 'info': 'OK',\n",

" 'infocode': '10000',\n",

" 'count': '1',\n",

" 'suggestion': {'keywords': [], 'cities': []},\n",

" 'districts': [{'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '罗湖区',\n",

" 'polyline': '114.105177,22.531626;114.104808,22.532512;114.104774,22.535038;114.104757,22.535105;114.104772,22.5352;114.104764,22.535834;114.104699,22.540773;114.104687,22.541316;114.104589,22.546031;114.104519,22.547975;114.104464,22.548114;114.104502,22.548445;114.104486,22.548663;114.10449,22.548786;114.104477,22.549163;114.104506,22.549251;114.104505,22.549327;114.10447,22.549363;114.104341,22.552936;114.104281,22.555434;114.104472,22.555779;114.104487,22.555809;114.104508,22.555845;114.104557,22.555933;114.104576,22.556013;114.104576,22.556037;114.104574,22.556168;114.10456,22.556475;114.10456,22.556561;114.104559,22.556903;114.104552,22.557291;114.104551,22.557399;114.104547,22.557726;114.104541,22.557852;114.104542,22.558166;114.104536,22.558579;114.104534,22.558701;114.104529,22.559019;114.104521,22.559395;114.104523,22.559815;114.104503,22.560243;114.104498,22.560353;114.104502,22.560685;114.104495,22.561075;114.104487,22.561174;114.104496,22.561506;114.104496,22.561921;114.104496,22.562368;114.104502,22.562489;114.104504,22.562812;114.104506,22.563216;114.104504,22.563617;114.104508,22.563748;114.104512,22.564046;114.104508,22.56422;114.104498,22.564475;114.104502,22.564899;114.104511,22.565285;114.104509,22.565474;114.104508,22.565722;114.104517,22.56614;114.104521,22.566593;114.104523,22.567017;114.104518,22.567455;114.104524,22.567872;114.104514,22.56789;114.104473,22.567887;114.104281,22.56785;114.104165,22.567827;114.104041,22.567776;114.103984,22.567714;114.103221,22.566996;114.101604,22.566261;114.100625,22.565896;114.099474,22.565566;114.098216,22.565256;114.096829,22.56491;114.095665,22.564683;114.094653,22.564538;114.093662,22.564387;114.09319,22.564326;114.092677,22.564341;114.092628,22.564336;114.092616,22.564338;114.092547,22.564334;114.092482,22.564333;114.092448,22.564344;114.092346,22.564394;114.092238,22.564462;114.092232,22.564471;114.092198,22.564515;114.092159,22.564576;114.092053,22.564749;114.091951,22.564934;114.091793,22.565191;11
4.091604,22.565503;114.091404,22.565793;114.09114,22.566201;114.091064,22.566319;114.090868,22.566609;114.090934,22.56668;114.090822,22.566674;114.090558,22.567061;114.090367,22.567349;114.090287,22.567475;114.090217,22.567571;114.090162,22.567638;114.08992,22.56787;114.089529,22.568249;114.089399,22.56837;114.089068,22.568689;114.088819,22.568935;114.088709,22.569037;114.088646,22.569078;114.088535,22.569171;114.088247,22.569416;114.088167,22.569495;114.088081,22.569532;114.08804,22.569578;114.087999,22.569556;114.087781,22.569619;114.087106,22.56982;114.086649,22.569956;114.08641,22.570031;114.086242,22.570088;114.086163,22.570103;114.085801,22.570148;114.085672,22.570161;114.085463,22.570188;114.085116,22.570236;114.084791,22.570277;114.084619,22.570303;114.084456,22.570309;114.084331,22.570304;114.084081,22.570304;114.083769,22.570298;114.083665,22.570292;114.083473,22.570284;114.083275,22.570264;114.083075,22.570242;114.082817,22.57021;114.082622,22.570188;114.082508,22.570159;114.08228,22.570096;114.082174,22.570069;114.081817,22.569965;114.081591,22.569903;114.081421,22.569841;114.081242,22.569769;114.080933,22.569619;114.080908,22.569568;114.08085,22.569578;114.080686,22.569497;114.080538,22.569445;114.080433,22.569393;114.080271,22.569293;114.079923,22.569056;114.079662,22.568882;114.079393,22.568705;114.079153,22.568538;114.079117,22.568517;114.079096,22.568519;114.079028,22.568567;114.078887,22.568673;114.078607,22.56885;114.078495,22.568924;114.078397,22.569016;114.078186,22.569219;114.078015,22.56938;114.077965,22.569438;114.077919,22.569561;114.077803,22.569836;114.077751,22.569966;114.077635,22.570266;114.077564,22.57047;114.077481,22.570689;114.077396,22.570895;114.077328,22.571063;114.077286,22.571168;114.077251,22.571219;114.077139,22.571326;114.076993,22.571499;114.076867,22.571625;114.076762,22.571693;114.076688,22.571739;114.07653,22.571842;114.076423,22.571923;114.076276,22.572064;114.076119,22.57222;114.076027,22.572324;114.075995,22.572373;11
4.075944,22.572466;114.075442,22.573406;114.074444,22.57527;114.074429,22.575304;114.07442,22.575387;114.074413,22.57552;114.074422,22.575687;114.074425,22.575768;114.074436,22.575915;114.074424,22.576026;114.074396,22.576131;114.072941,22.578101;114.07282,22.578252;114.072721,22.578413;114.07264,22.578544;114.07255,22.578743;114.072494,22.578864;114.072333,22.579066;114.072248,22.579179;114.072132,22.579286;114.071922,22.57948;114.071765,22.579608;114.071509,22.579729;114.071414,22.579777;114.071316,22.579822;114.071183,22.579844;114.070814,22.579893;114.070644,22.579942;114.070522,22.579985;114.070441,22.580055;114.069304,22.580983;114.069009,22.581218;114.068689,22.582049;114.067464,22.583932;114.066593,22.585036;114.066307,22.585671;114.066242,22.586446;114.066427,22.587767;114.06676,22.587997;114.067792,22.58822;114.068388,22.589547;114.069677,22.590484;114.070446,22.591161;114.070783,22.591517;114.071592,22.592746;114.073605,22.592416;114.076291,22.594081;114.076481,22.594279;114.07644,22.594435;114.076425,22.59446;114.076405,22.594494;114.076384,22.594524;114.076365,22.59456;114.076341,22.594601;114.076325,22.594627;114.075951,22.595047;114.075364,22.596361;114.07531,22.596547;114.075214,22.597194;114.075216,22.597212;114.075242,22.597386;114.075331,22.598095;114.075334,22.598101;114.075369,22.59811;114.075417,22.598111;114.075471,22.598105;114.075523,22.598098;114.076329,22.597851;114.077292,22.598071;114.078856,22.599039;114.079411,22.599596;114.081164,22.600857;114.082876,22.601197;114.083804,22.601219;114.085266,22.600967;114.086078,22.600389;114.089056,22.599067;114.089668,22.598512;114.089843,22.598586;114.090674,22.599398;114.09147,22.599906;114.09222,22.600078;114.093411,22.600147;114.095172,22.598901;114.096739,22.597389;114.097775,22.596069;114.099539,22.594177;114.099735,22.593308;114.099939,22.593058;114.100366,22.592807;114.100931,22.5927;114.102247,22.592214;114.10306,22.591685;114.103222,22.591579;114.103542,22.591372;114.103553,22.591365;114.1
03704,22.591266;114.105077,22.590145;114.105741,22.589396;114.105881,22.588793;114.105994,22.588302;114.106724,22.588183;114.107486,22.588193;114.108022,22.589015;114.110447,22.589194;114.111385,22.589313;114.111392,22.589314;114.111561,22.589353;114.111573,22.589357;114.112218,22.589464;114.112519,22.589649;114.112562,22.589674;114.112689,22.589744;114.112708,22.589755;114.112738,22.589772;114.112766,22.589787;114.112811,22.589811;114.112879,22.589847;114.112937,22.589878;114.114131,22.590493;114.114778,22.590676;114.114877,22.590688;114.11497,22.5907;114.114983,22.590702;114.115069,22.590714;114.115508,22.590837;114.115691,22.590902;114.117447,22.591251;114.119557,22.589983;114.119849,22.590075;114.119948,22.590107;114.119978,22.590116;114.121215,22.590506;114.121722,22.590666;114.121915,22.589584;114.122051,22.589317;114.122192,22.589041;114.122209,22.589007;114.122662,22.588988;114.122856,22.588973;114.123263,22.588939;114.123467,22.588929;114.12343,22.588161;114.123462,22.587963;114.123564,22.587968;114.123719,22.58773;114.124542,22.588289;114.124796,22.588661;114.12493,22.588682;114.12602,22.588761;114.128704,22.588944;114.129348,22.588701;114.129869,22.588308;114.130275,22.587676;114.13049,22.587318;114.130822,22.586694;114.131852,22.587407;114.133097,22.588299;114.133204,22.588735;114.13388,22.589161;114.134236,22.588814;114.13528,22.588108;114.135606,22.588144;114.136787,22.588331;114.138444,22.58874;114.139069,22.588748;114.139319,22.589018;114.13979,22.590095;114.139994,22.590331;114.141456,22.591084;114.141934,22.591243;114.143284,22.591405;114.144299,22.591268;114.145053,22.591772;114.145693,22.592358;114.145957,22.592811;114.145959,22.593802;114.145958,22.593844;114.145945,22.594116;114.145873,22.594481;114.145823,22.594669;114.145768,22.594911;114.146004,22.595483;114.14601,22.595499;114.146049,22.595533;114.146147,22.595613;114.146251,22.595698;114.146363,22.595791;114.146399,22.595822;114.146478,22.59589;114.146597,22.595994;114.146624,22.596017;114
.150262,22.599342;114.151339,22.600002;114.152247,22.600358;114.152438,22.600394;114.153528,22.600911;114.15425,22.601428;114.154631,22.601454;114.154747,22.60145;114.154824,22.601445;114.157929,22.601334;114.159134,22.601195;114.159579,22.601232;114.159944,22.601389;114.160279,22.601884;114.160886,22.603346;114.161572,22.604487;114.161638,22.605271;114.163465,22.606384;114.163541,22.60644;114.16362,22.606499;114.163696,22.606555;114.163752,22.606596;114.163782,22.606619;114.163815,22.606642;114.163908,22.606709;114.163972,22.606755;114.164414,22.607122;114.164561,22.607248;114.165302,22.607886;114.165449,22.608013;114.166038,22.608519;114.166184,22.608646;114.16693,22.609288;114.167471,22.609519;114.17063,22.610244;114.172673,22.610358;114.173934,22.610991;114.175158,22.61219;114.175914,22.613807;114.176014,22.614284;114.177174,22.615588;114.17745,22.615527;114.181628,22.61599;114.181786,22.615914;114.182252,22.615341;114.183195,22.614909;114.184766,22.615289;114.186902,22.61519;114.187,22.615187;114.187096,22.615183;114.187185,22.61518;114.187194,22.61518;114.188777,22.615241;114.19169,22.616084;114.192145,22.616449;114.192406,22.61619;114.196861,22.613296;114.199387,22.609757;114.199502,22.6096;114.199765,22.609247;114.199861,22.609124;114.199882,22.609093;114.200355,22.60774;114.200348,22.607492;114.200202,22.607034;114.199242,22.606371;114.198901,22.605669;114.198702,22.604842;114.198788,22.603977;114.198742,22.60325;114.198867,22.60304;114.200081,22.602383;114.201067,22.602186;114.202343,22.601505;114.203151,22.601184;114.204079,22.601417;114.205452,22.601953;114.206973,22.601653;114.207583,22.601157;114.208871,22.599357;114.209512,22.598772;114.210086,22.598889;114.211343,22.599338;114.212641,22.599632;114.213228,22.599614;114.213842,22.599288;114.214031,22.598876;114.214085,22.59829;114.213704,22.596583;114.213677,22.59607;114.215775,22.593767;114.216158,22.593137;114.216721,22.59246;114.217485,22.591033;114.217635,22.590298;114.217754,22.590079;114.218556,2
2.589512;114.219021,22.588915;114.219105,22.588596;114.219037,22.587972;114.219352,22.585877;114.219593,22.585003;114.219527,22.58458;114.218891,22.583076;114.218763,22.581919;114.218845,22.581043;114.218854,22.580966;114.218893,22.580599;114.218907,22.580476;114.218921,22.580341;114.218924,22.580327;114.218929,22.580289;114.21895,22.580217;114.219004,22.580125;114.219105,22.579931;114.219143,22.579845;114.219192,22.579787;114.219312,22.579625;114.219433,22.579473;114.219506,22.579339;114.219574,22.579246;114.219615,22.579142;114.219677,22.578972;114.219685,22.578909;114.21977,22.578649;114.219906,22.57831;114.21997,22.578174;114.220004,22.577996;114.220011,22.577863;114.220006,22.577782;114.219971,22.577686;114.219919,22.577606;114.219824,22.577488;114.219743,22.577401;114.219606,22.577311;114.219482,22.577216;114.219275,22.577084;114.219048,22.576943;114.21892,22.576857;114.218743,22.576737;114.218563,22.576617;114.2185,22.576548;114.218434,22.57645;114.21815,22.57608;114.217947,22.575801;114.217677,22.575486;114.217657,22.575451;114.217579,22.575333;114.217519,22.575269;114.217434,22.575164;114.217332,22.575083;114.217263,22.575032;114.217172,22.575006;114.215717,22.57436;114.215487,22.57425;114.215114,22.57412;114.214703,22.574005;114.214006,22.573861;114.213927,22.57386;114.213691,22.573858;114.213578,22.573868;114.213208,22.5739;114.213044,22.57392;114.212851,22.573924;114.212531,22.573916;114.21242,22.573912;114.212315,22.57389;114.21218,22.573881;114.212025,22.573863;114.211861,22.573843;114.211728,22.57382;114.211613,22.573798;114.211474,22.573757;114.21134,22.573722;114.21115,22.573678;114.210974,22.57363;114.210726,22.573565;114.210551,22.573517;114.210444,22.573486;114.210394,22.573447;114.210304,22.573396;114.208112,22.572756;114.208053,22.572771;114.208019,22.572761;114.20785,22.572709;114.207699,22.57265;114.207515,22.572575;114.207469,22.572541;114.207406,22.572495;114.20733,22.572442;114.207236,22.572363;114.207125,22.572287;114.206886,22.572112;114
.20677,22.572024;114.206701,22.57196;114.206544,22.571811;114.206268,22.571552;114.206081,22.571364;114.205976,22.571265;114.205833,22.571132;114.205621,22.570991;114.205471,22.570898;114.205326,22.570794;114.205056,22.570621;114.204811,22.570497;114.204567,22.570372;114.204187,22.570192;114.203832,22.570001;114.203761,22.569949;114.203461,22.569306;114.203444,22.569249;114.203442,22.569179;114.203445,22.569021;114.203476,22.568839;114.203532,22.568614;114.203606,22.568371;114.203673,22.568165;114.203785,22.567846;114.203871,22.567678;114.20405,22.567334;114.204124,22.567195;114.204306,22.566849;114.204382,22.56671;114.204441,22.566591;114.204536,22.566448;114.204662,22.566261;114.204751,22.566134;114.204816,22.566015;114.204968,22.565799;114.205097,22.565611;114.205259,22.565368;114.205332,22.565266;114.205414,22.565138;114.205459,22.565042;114.205512,22.564925;114.205581,22.564764;114.205693,22.564543;114.205744,22.564435;114.205817,22.564258;114.205967,22.563971;114.206111,22.563694;114.206221,22.563493;114.206356,22.563258;114.206472,22.563041;114.206632,22.562762;114.206716,22.562605;114.20691,22.562277;114.206999,22.56212;114.207064,22.562024;114.207089,22.561944;114.20718,22.561643;114.207256,22.561409;114.20728,22.561283;114.207301,22.561166;114.207337,22.560946;114.207352,22.560862;114.207344,22.56079;114.207331,22.560682;114.207298,22.560464;114.207269,22.560333;114.207225,22.560206;114.207172,22.560063;114.207141,22.559953;114.207116,22.559871;114.207051,22.559732;114.207001,22.559628;114.206942,22.559507;114.206865,22.559404;114.20676,22.559265;114.206632,22.559091;114.206466,22.558856;114.206313,22.558653;114.206225,22.558534;114.206185,22.558482;114.206066,22.558385;114.205817,22.558181;114.20558,22.557978;114.205377,22.557815;114.205312,22.557753;114.205149,22.557572;114.204992,22.557407;114.204927,22.557347;114.204789,22.557201;114.204688,22.55708;114.20452,22.556892;114.204358,22.556698;114.204279,22.556606;114.204226,22.556558;114.204161,22.556523;
114.203951,22.556407;114.203665,22.55627;114.203537,22.5562;114.203441,22.556149;114.203402,22.556119;114.203367,22.556053;114.20329,22.55595;114.203231,22.555863;114.203207,22.555813;114.203183,22.555711;114.203145,22.555571;114.203136,22.555519;114.203147,22.555413;114.203177,22.55531;114.202883,22.5539;114.202754,22.553899;114.202521,22.55398;114.202209,22.554141;114.201941,22.554291;114.201733,22.55435;114.201586,22.554361;114.201456,22.554324;114.201352,22.554255;114.201248,22.554108;114.200893,22.553498;114.200711,22.553288;114.200521,22.553139;114.200295,22.553028;114.20007,22.55294;114.199386,22.552822;114.198875,22.552796;114.198494,22.552816;114.197897,22.552928;114.197447,22.55299;114.19723,22.553007;114.197048,22.552973;114.196823,22.552884;114.196277,22.552449;114.195706,22.551958;114.195507,22.551813;114.195212,22.551704;114.194745,22.551606;114.194537,22.551586;114.194329,22.551604;114.194139,22.551643;114.194009,22.551722;114.193767,22.551926;114.193368,22.552337;114.193109,22.552618;114.19297,22.552681;114.192814,22.552705;114.192641,22.552692;114.192468,22.552625;114.192252,22.5524;114.191966,22.552108;114.191836,22.552016;114.191715,22.551971;114.191585,22.551952;114.191438,22.551946;114.191152,22.551985;114.189139,22.552416;114.188649,22.552449;114.187848,22.552506;114.186797,22.552674;114.186595,22.552588;114.186408,22.552367;114.186202,22.552013;114.186112,22.551754;114.18613,22.551447;114.185995,22.551331;114.185918,22.551318;114.185666,22.551454;114.185301,22.551711;114.184492,22.552141;114.18271,22.552801;114.182626,22.552918;114.182754,22.553378;114.182578,22.554804;114.182573,22.555331;114.182426,22.556155;114.18232,22.556777;114.182161,22.556946;114.180945,22.557268;114.180688,22.557212;114.180412,22.557245;114.179739,22.557245;114.179168,22.557044;114.178236,22.556842;114.17726,22.556905;114.175509,22.556898;114.175138,22.557122;114.174783,22.557436;114.174439,22.55805;114.174222,22.558222;114.173964,22.558376;114.173687,22.558451;114.17
3305,22.558488;114.1728,22.558514;114.172317,22.558465;114.171984,22.55826;114.17158,22.557871;114.170993,22.557148;114.170581,22.556718;114.170035,22.556441;114.169522,22.55634;114.168954,22.556373;114.168534,22.556422;114.16819,22.556575;114.168033,22.557009;114.167872,22.557163;114.167412,22.55827;114.167094,22.558775;114.166675,22.559093;114.166436,22.559157;114.166078,22.559117;114.165592,22.558891;114.164436,22.557572;114.164003,22.556773;114.16342,22.556178;114.162237,22.554049;114.161077,22.552626;114.160765,22.552309;114.160592,22.552203;114.160445,22.552134;114.160176,22.552056;114.159925,22.552031;114.158349,22.552083;114.157674,22.552074;114.15738,22.552016;114.157155,22.551944;114.157008,22.551819;114.156877,22.551621;114.156774,22.551233;114.156739,22.550866;114.156782,22.54982;114.156774,22.549423;114.156722,22.548961;114.156653,22.548735;114.156583,22.548597;114.156401,22.548399;114.156185,22.548241;114.15602,22.548146;114.155787,22.548121;114.155466,22.548121;114.155293,22.548108;114.155154,22.548056;114.155076,22.548007;114.154999,22.547857;114.154895,22.547545;114.154817,22.5471;114.154747,22.54662;114.154652,22.546297;114.154618,22.546112;114.154609,22.545928;114.154643,22.545405;114.154704,22.544741;114.154764,22.544567;114.154834,22.544469;114.154972,22.544406;114.155345,22.544295;114.155829,22.544214;114.156046,22.544248;114.156098,22.544294;114.156089,22.544391;114.155951,22.544676;114.15583,22.544969;114.155856,22.545138;114.155959,22.545275;114.156167,22.545326;114.156418,22.545298;114.156617,22.545182;114.156721,22.544985;114.156773,22.544477;114.156721,22.543976;114.156661,22.543813;114.156505,22.543694;114.156029,22.543601;114.155535,22.543479;114.155301,22.543331;114.154712,22.542658;114.153751,22.541618;114.153474,22.541305;114.15331,22.540932;114.153223,22.540692;114.153206,22.540501;114.153257,22.540199;114.153483,22.539684;114.153543,22.539549;114.153526,22.539406;114.153491,22.539294;114.153387,22.539217;114.153136,22.539114;114.15
2089,22.538875;114.151093,22.538672;114.150495,22.538521;114.150131,22.538483;114.149889,22.538487;114.149733,22.538509;114.149647,22.538571;114.149595,22.538668;114.149482,22.538983;114.149387,22.539159;114.149257,22.539268;114.149162,22.539301;114.148893,22.53929;114.148642,22.539273;114.148452,22.539284;114.148183,22.539354;114.148044,22.539385;114.147897,22.539402;114.147768,22.539403;114.147612,22.539386;114.147421,22.539338;114.147274,22.539308;114.147118,22.539319;114.147023,22.539367;114.146936,22.539439;114.14685,22.539543;114.146711,22.539734;114.146676,22.539784;114.146529,22.539941;114.146399,22.540034;114.146252,22.540084;114.146036,22.5401;114.145741,22.540098;114.145455,22.540079;114.145169,22.540116;114.14491,22.540171;114.144373,22.540423;114.143888,22.540628;114.143489,22.540729;114.143022,22.540777;114.142511,22.540736;114.14206,22.540594;114.141627,22.540341;114.141012,22.539884;114.140241,22.53924;114.139808,22.538958;114.139401,22.538809;114.138908,22.538751;114.138353,22.538763;114.138084,22.538828;114.137461,22.538998;114.136759,22.539158;114.136335,22.539189;114.135893,22.539159;114.135503,22.539071;114.135104,22.538847;114.134698,22.538373;114.134221,22.537748;114.13377,22.537351;114.13338,22.537102;114.133025,22.536975;114.132661,22.536895;114.132326,22.53691;114.13222,22.536914;114.132064,22.536921;114.131492,22.537042;114.131059,22.537111;114.13079,22.537081;114.130478,22.537002;114.128736,22.536122;114.128095,22.535754;114.127567,22.535368;114.127124,22.534913;114.126587,22.534249;114.126206,22.533718;114.125616,22.532776;114.125357,22.532438;114.125062,22.532211;114.124698,22.532072;114.124169,22.531944;114.122549,22.531614;114.122072,22.531494;114.121725,22.531306;114.1215,22.531067;114.121326,22.530651;114.121257,22.530145;114.121214,22.529679;114.121006,22.529141;114.120962,22.529054;114.120694,22.528637;114.120494,22.528418;114.120226,22.528232;114.119567,22.527885;114.118752,22.527477;114.11818,22.52722;114.117608,22.526977;114.11
7274,22.526936;114.116713,22.527048;114.115635,22.52732;114.115032,22.52772;114.114737,22.528107;114.114591,22.528608;114.114601,22.529181;114.114649,22.530064;114.114768,22.531119;114.114823,22.532189;114.114842,22.532803;114.11463,22.533483;114.114268,22.533793;114.113769,22.533842;114.113372,22.53358;114.113054,22.532564;114.112716,22.532019;114.112211,22.531685;114.111554,22.531592;114.110922,22.531717;114.109584,22.532263;114.108715,22.532395;114.107912,22.532324;114.107577,22.532253;114.106988,22.532069;114.105943,22.531799;114.105177,22.531626',\n",

" 'center': '114.123885,22.555341',\n",

" 'level': 'district',\n",

" 'districts': [{'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '笋岗街道',\n",

" 'center': '114.104,22.5621',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '东门街道',\n",

" 'center': '114.116,22.5428',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '黄贝街道',\n",

" 'center': '114.157,22.5662',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '桂园街道',\n",

" 'center': '114.109,22.556',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '清水河街道',\n",

" 'center': '114.082,22.5699',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '南湖街道',\n",

" 'center': '114.113,22.5389',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '东晓街道',\n",

" 'center': '114.123,22.589',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '翠竹街道',\n",

" 'center': '114.128,22.574',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '东湖街道',\n",

" 'center': '114.204,22.5699',\n",

" 'level': 'street',\n",

" 'districts': []},\n",

" {'citycode': '0755',\n",

" 'adcode': '440303',\n",

" 'name': '莲塘街道',\n",

" 'center': '114.205,22.565',\n",

" 'level': 'street',\n",

" 'districts': []}]}]}"

]

},

"execution_count": 4,

"metadata": {},

"output_type": "execute_result"

}

],

"source": [

"# The response stores the district boundary as a polyline string\n",

"result"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Extracting the district boundary polyline"

]

},

{

"cell_type": "code",

"execution_count": 14,

"metadata": {

"ExecuteTime": {

"end_time": "2019-09-11T10:07:06.479605Z",

"start_time": "2019-09-11T10:07:06.355734Z"

},

"scrolled": true

},

"outputs": [

{

"data": {

"text/html": [

],

"text/plain": [

" lon_lst lat_lst\n",

"0 114.105177 22.531626\n",

"1 114.104808 22.532512\n",

"2 114.104774 22.535038\n",

"3 114.104757 22.535105\n",

"4 114.104772 22.535200"

]

},

"execution_count": 14,

"metadata": {},

"output_type": "execute_result"

}

],

"source": [

"# Extract the boundary polyline from result\n",

"polyline = result['districts'][0]['polyline']\n",

"\n",

"############################### Write your code below ##############################\n",

"# Turn polyline into a DataFrame with two columns: lon first, lat second\n",

"# (the sample output names them lon_lst and lat_lst)\n",

"# Hint: you can loop over the points with a for loop,\n",

"# or split on several delimiters at once with the re package, reshape with numpy, then build the DataFrame\n",

"\n",

"\n",

"###################################################################################\n",

"\n",

"polyline.head(5)"

]

},

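{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Converting from Mars coordinates (GCJ02)"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"Coordinates fetched from Amap are in GCJ02, commonly known as the Mars coordinate system, whereas we normally work in WGS84. A coordinate conversion is therefore needed, and an open-source implementation of it is available:\n",

"\n"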

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"[geo_convert](https://github.com/gaussic/geo_convert)"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"Now apply this conversion to every row of the DataFrame. The pandas apply function is a convenient way to do it."

]

},

{

"cell_type": "code",

"execution_count": 18,

"metadata": {

"ExecuteTime": {

"end_time": "2019-09-11T10:11:06.230683Z",

"start_time": "2019-09-11T10:11:06.179076Z"

},

"scrolled": true

},

"outputs": [

{

"data": {

"text/html": [

],

"text/plain": [

" lon_lst lat_lst\n",

"0 114.100060 22.534309\n",

"1 114.099691 22.535195\n",

"2 114.099657 22.537721\n",

"3 114.099640 22.537788\n",

"4 114.099655 22.537883"

]

},

"execution_count": 18,

"metadata": {},

"output_type": "execute_result"

}

],

"source": [

"############################### Write your code below ##############################\n",

"# Convert the coordinates from GCJ02 to WGS84\n",

"\n",

"###################################################################################\n",

"polyline.head(5)"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Building a GeoDataFrame from the polyline"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"This was covered in an earlier tutorial. Write it out yourself to reinforce it."

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"What you need to do here is turn polyline into a shapely Polygon and then convert that into a GeoDataFrame."

]

},

{

"cell_type": "code",

"execution_count": 19,

"metadata": {

"ExecuteTime": {

"end_time": "2019-09-11T10:12:00.167002Z",

"start_time": "2019-09-11T10:11:58.202594Z"

}

},

"outputs": [

{

"data": {

"text/plain": [

""

]

},

"execution_count": 19,

"metadata": {},

"output_type": "execute_result"

},

{

"data": {

"image/png": "\n",

"text/plain": [

""

]

},

"metadata": {

"needs_background": "light"

},

"output_type": "display_data"

}

],

"source": [

"import pandas as pd\n",

"import numpy as np\n",

"import matplotlib as mpl\n",

"import matplotlib.pyplot as plt\n",

"import geopandas\n",

"from shapely.geometry import Point,Polygon,shape\n",

"\n",

"\n",

"############################### Write your code below ##############################\n",

"# Build a GeoDataFrame named dataline from the polyline\n",

"# Hint: Polygon(polyline.values)\n",

"###################################################################################\n",

"dataline.plot()"

]

},

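{

"cell_type": "markdown",

"metadata": {},

"source": [

"# Homework"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"## Fetch all of Shenzhen's administrative districts"

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"You can wrap the steps above in a function that takes a district name as input and returns the coordinate points, a shapely Polygon, or a GeoDataFrame.\n",

"\n",

"A simple loop over the district names then produces the boundaries for all of Shenzhen."

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"Finally, do not forget to save the GeoDataFrame as a shapefile (.shp)."

]

},

{

"cell_type": "markdown",

"metadata": {},

"source": [

"Important: when a district is made up of several polygons, their coordinate strings are separated by '|'."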

]

}

],

"metadata": {

"kernelspec": {

"display_name": "Python 3",

"language": "python",

"name": "python3"

},

"language_info": {

"codemirror_mode": {

"name": "ipython",

"version": 3

},

"file_extension": ".py",

"mimetype": "text/x-python",

"name": "python",

"nbconvert_exporter": "python",

"pygments_lexer": "ipython3",

"version": "3.6.5"

},

"toc": {

"base_numbering": 1,

"nav_menu": {},

"number_sections": true,

"sideBar": true,

"skip_h1_title": false,

"title_cell": "Table of Contents",

"title_sidebar": "Contents",

"toc_cell": false,

"toc_position": {

"height": "409.091px",

"left": "141px",

"top": "214.322px",

"width": "179px"

},

"toc_section_display": true,

"toc_window_display": true

}

},

"nbformat": 4,

"nbformat_minor": 2

}
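The notebook's closing note says that a district made of several polygons has its coordinate strings separated by '|', while the sample response shows points written as `lon,lat` and joined by ';'. Below is a minimal sketch of splitting such a string into rings of float pairs, assuming exactly that format. The names `parse_polyline` and `sample` are hypothetical and are not part of the notebook, and the second ring in `sample` is made-up illustration data rather than a real boundary.

```python
import re


def parse_polyline(polyline):
    """Split a polyline string into rings of (lon, lat) float tuples.

    Assumed format (as described in the notebook): points are "lon,lat"
    pairs joined by ';', and separate polygons are joined by '|'.
    """
    rings = []
    for part in polyline.split('|'):
        part = part.strip()
        if not part:
            continue  # skip empty pieces, e.g. after a trailing '|'
        # Split on ',' and ';' in one pass, as the exercise hint suggests.
        numbers = [float(tok) for tok in re.split(r'[;,]', part) if tok.strip()]
        if len(numbers) % 2 != 0:
            raise ValueError('odd number of coordinate values in one polygon')
        # Even positions hold lon, odd positions hold lat.
        rings.append(list(zip(numbers[0::2], numbers[1::2])))
    return rings


# First ring: the first three points of the sample response.
# Second ring: invented values, only to show the '|' separator.
sample = ('114.105177,22.531626;114.104808,22.532512;114.104774,22.535038'
          '|114.2,22.6;114.3,22.6;114.3,22.7')
rings = parse_polyline(sample)
print(len(rings))
print(rings[0][0])
```

Each ring is a plain list of `(lon, lat)` tuples, so it could be handed to `pandas.DataFrame(ring, columns=['lon', 'lat'])` for the coordinate conversion step or to `shapely.geometry.Polygon(ring)` when building the geometry. Keeping the parser free of those libraries makes it easy to check on its own.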
