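This is an original blog post, intended for technical study only. Do not copy it and upload it to Baidu Wenku or similar platforms without the author's permission.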

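Contents

  • JSON data
  • Parsing JSON data in a web crawler
    • Analyzing the target data and building the model
    • The main method
    • Online JSON validator
    • Two parsing approaches
  • Program output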

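JSON data

JSON is a syntax for storing and exchanging text data, similar in purpose to XML. It is typically more compact than XML and simpler to parse. JSON is a text format that is completely language independent, yet it uses conventions familiar to programmers of the C family of languages (including C, C++, C#, Java, JavaScript, Perl, Python and others). These properties make JSON an ideal data-interchange language: it is easy for people to read and write and easy for machines to parse and generate, and its compactness helps keep network payloads small.
JSON data is written as name/value pairs.
For example: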

{
  "employees": [
    { "firstName": "Bill", "lastName": "Gates" },
    { "firstName": "George", "lastName": "Bush" },
    { "firstName": "Thomas", "lastName": "Carter" }
  ]
}
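
To make the name/value structure concrete, here is a minimal sketch that builds the same employees document by hand using only the JDK. The class and method names (EmployeesJsonSketch, quote, toJsonObject, toJsonArray) are made up for this illustration, and the escaping only handles backslashes and double quotes, so treat it as a teaching aid rather than a real serializer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; a real project would use a JSON library.
final class EmployeesJsonSketch {

    /** Wraps a string in double quotes, escaping only backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders one object: comma-separated "name":"value" pairs inside braces. */
    static String toJsonObject(Map<String, String> pairs) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            parts.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return "{" + String.join(",", parts) + "}";
    }

    /** Renders an array: comma-separated objects inside square brackets. */
    static String toJsonArray(List<Map<String, String>> objects) {
        List<String> parts = new ArrayList<>();
        for (Map<String, String> o : objects) {
            parts.add(toJsonObject(o));
        }
        return "[" + String.join(",", parts) + "]";
    }

    static Map<String, String> employee(String firstName, String lastName) {
        Map<String, String> m = new LinkedHashMap<>(); // keeps insertion order
        m.put("firstName", firstName);
        m.put("lastName", lastName);
        return m;
    }

    /** Builds the employees document from the example above, without whitespace. */
    static String employeesJson() {
        List<Map<String, String>> employees = new ArrayList<>();
        employees.add(employee("Bill", "Gates"));
        employees.add(employee("George", "Bush"));
        employees.add(employee("Thomas", "Carter"));
        return "{" + quote("employees") + ":" + toJsonArray(employees) + "}";
    }

    public static void main(String[] args) {
        System.out.println(employeesJson());
    }
}
```

An object is a set of name/value pairs in braces, an array is an ordered list in square brackets, and the two nest freely; the video data parsed later in this post has the same shape, one level deeper.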

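Parsing JSON data in a web crawler

Below I use a simple crawler to show how to parse JSON data embedded in a page. The crawler here is deliberately minimal; for real work I recommend following the crawler framework from my earlier posts. The code below exists mainly to explain JSON parsing.

Here is a sample program that crawls Mtime (时光网):

Analyzing the target data and building the model

1. First, the model class from the framework, which holds the data to be crawled.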

package model;

/* Hefei University of Technology, School of Management, qianyang 1563178220@qq.com */
public class MtimeModel {
    private String prmovieId; // trailer id, prefixed with "mtime"
    private String url;       // trailer page URL
    private String movieId;   // movie id, prefixed with "mtime"
    private String title;     // short title of the trailer

    public String getPrmovieId() {
        return prmovieId;
    }

    public void setPrmovieId(String prmovieId) {
        this.prmovieId = prmovieId;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getMovieId() {
        return movieId;
    }

    public void setMovieId(String movieId) {
        this.movieId = movieId;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }
}
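
The parser later in this post also imports a model.JsonModel class whose source is not listed. The following is only a guess at its shape, inferred from the getters and setters the parser calls (getVideoID, getMovieID, getShortTitle, setPrmovieId, setUrl, setMovieID, setShortTitle). The field types are assumptions: everything is declared as String because the parser stores "mtime"-prefixed values back into MovieID, and this relies on the JSON library converting the numeric VideoID and MovieID values to strings, which is worth checking against the fastjson version in use.

```java
// Hypothetical sketch of model.JsonModel; the real class is not shown in this post.
// In the project it would begin with "package model;" like MtimeModel.
final class JsonModel {
    // These three mirror the JSON keys VideoID, MovieID and ShortTitle.
    private String videoID;
    private String movieID;
    private String shortTitle;
    // These two are derived by the parser and do not come from the JSON.
    private String prmovieId;
    private String url;

    public String getVideoID() { return videoID; }
    public void setVideoID(String videoID) { this.videoID = videoID; }

    public String getMovieID() { return movieID; }
    public void setMovieID(String movieID) { this.movieID = movieID; }

    public String getShortTitle() { return shortTitle; }
    public void setShortTitle(String shortTitle) { this.shortTitle = shortTitle; }

    public String getPrmovieId() { return prmovieId; }
    public void setPrmovieId(String prmovieId) { this.prmovieId = prmovieId; }

    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }

    public static void main(String[] args) {
        JsonModel m = new JsonModel();
        m.setVideoID("51655");
        m.setMovieID("212471");
        m.setUrl("http://video.mtime.com/" + m.getVideoID() + "/?mid=" + m.getMovieID());
        System.out.println(m.getUrl());
    }
}
```

Only the keys that have a matching property are kept; the remaining keys in each video object (Title, Length, PlayCount and so on) have no counterpart in this sketch.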

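The main method

2. The page we want to crawl is http://movie.mtime.com/212471/trailer.html; the program below collects information about the trailers listed there. Here is the main method: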

package main;

import java.io.IOException;
import java.util.List;

import model.MtimeModel;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import parse.MtimeParse;

/* Hefei University of Technology, School of Management, qianyang 1563178220@qq.com */
public class Mtime {
    static final Log logger = LogFactory.getLog(Mtime.class);

    public static void main(String[] args) throws IOException {
        // Test program
        String startUrl = "http://movie.mtime.com/212471/trailer.html";
        Document doc = Jsoup.connect(startUrl).userAgent("bbb").timeout(120000).get();
        // Print the fetched page source
        System.out.println(doc);
        // Parse the trailer entries out of the page
        List<MtimeModel> movieData = MtimeParse.getData(doc);
        for (MtimeModel mt : movieData) {
            System.out.println("prmovieId:" + mt.getPrmovieId()
                    + "  movieId:" + mt.getMovieId()
                    + "  Title:" + mt.getTitle()
                    + "   url:" + mt.getUrl());
        }
    }
}

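Below is the page source that was fetched. The goal is to parse the JSON embedded in it: search for var videos = and you will see that everything after it is JSON data.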

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head> <title>澳门风云2 视频 预告片 – Mtime时光网</title> <meta name="Keywords" content="澳门风云2,The Man From Macao II,预告片视频,在线观看王晶,周润发,张家辉"> <meta name="Description" content="澳门风云2 视频 预告片"><meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <link type="image/x-icon" href="http://static1.mtime.cn/favicon.ico" rel="icon"> <link type="image/x-icon" href="http://static1.mtime.cn/favicon.ico" rel="shortcut icon"> <link type="image/x-icon" href="http://static1.mtime.cn/favicon.ico" rel="bookmark"> <link type="application/opensearchdescription+xml" href="http://feed.mtime.com/opensearch.xml" title="Mtime影视搜索" rel="search"> <link rel="alternate" type="application/rss+xml" title="影评" href="http://feed.mtime.com/comment.rss"> <link rel="alternate" type="application/rss+xml" title="日志" href="http://feed.mtime.com/blog.rss"> <link rel="alternate" type="application/rss+xml" title="资讯" href="http://feed.mtime.com/news.rss"> <link rel="alternate" type="application/rss+xml" title="话题" href="http://feed.mtime.com/topic.rss"> <link rel="alternate" type="application/rss+xml" title="周刊" href="http://feed.mtime.com/weekly.rss"> <script type="text/javascript">var server = "http://static1.mtime.cn/";var subServer = "http://static1.mtime.cn/library/";var version = "20160720105244";var subVersion = "20160623154218";var jsServer = server + version;var cssServer = server + version;var subJsServer = subServer + subVersion;var subCssServer = subServer + subVersion;var debug = false;var mtimeCookieDomain = "mtime.com";var siteLogUrl = "http://log.mtime.cn";var siteServiceUrl = "http://service.mtime.com";var siteLibraryServiceUrl = "http://service.library.mtime.com";var crossDomainUpload="http://upload3.mtime.com/Upload.ashx";</script> <script type="text/javascript">document.write(unescape("%3Clink href='" + cssServer + "/css/2014/publicpack.css' rel='stylesheet' media='all' type='text/css'%3E%3C/link%3E"));</script> <script 
type="text/javascript">document.write(unescape("%3Clink href='" + subCssServer + "/css/database.css' rel='stylesheet' media='all' type='text/css'%3E%3C/link%3E"));</script> </head><body> <script type="text/javascript">var navigationBarType = 1;document.writeln( "<div id=\"topbar\"></div><div id=\"managerHistoryRegion\"></div>");var debug = false;var mtimeCookieDomain="mtime.com";var siteLogUrl="http://log.mtime.cn";var siteUrl="http://www.mtime.com";var siteMcUrl="http://my.mtime.com";var siteApiUrl="http://api.mtime.com";var siteBlogUrl="http://i.mtime.com";var siteGroupUrl="http://group.mtime.com";var siteMovieUrl="http://movie.mtime.com";var sitePeopleUrl="http://people.mtime.com";var siteNewsUrl="http://news.mtime.com";var siteServiceUrl="http://service.mtime.com";var siteSearchUrl="http://search.mtime.com";var siteGoodsListUrl="http://list.mall.mtime.com";var theaterService="http://service.theater.mtime.com";var siteLibraryServiceUrl="http://service.library.mtime.com";var siteCommunityServiceUrl="http://service.community.mtime.com";var siteChannelServiceUrl="http://service.channel.mtime.com";var siteGoodsServiceUrl="http://service.mall.mtime.com";var siteTradeServiceUrl="http://trade.mtime.com";var siteFunUrl="";var sitePassportUrl="http://passport.mtime.com";var crossDomainUpload="http://upload3.mtime.com/Upload.ashx";var topMenuValues={"mainNavType":"Detail","footer":"<dt class=\"clearfix\"><span class=\"fr\">第179期</span><strong>时光周刊</strong></dt>\n                <dd><a href=\"http://www.mtime.com/weekly/\" target=\"_blank\" title=\"时光周刊\"><img src=\"http://img31.mtime.cn/mg/2016/08/19/103904.79768090.jpg\" width=\"170\" alt=\"时光周刊\"></a></dd>"};</script> <div id="db_sechead"> <div id="onlineTicketMovieRegion" class="db_ticket none"> </div> <div class="db_head"> <div class="clearfix"> <h1 property="v:itemreviewed"><a href="http://movie.mtime.com/212471/">澳门风云2</a></h1> <p class="db_year">(<a href="http://movie.mtime.com/movie/search/section/?year=2015" 
target="_blank">2015</a>)</p> <p class="db_enname"><a href="http://movie.mtime.com/212471/">The Man From Macao II</a></p> </div> </div> </div> <div class="db_nav db_secnav"> <dl id="movieNavigationRegion" class="clearfix"> <dd token="Generalize"><a href="http://movie.mtime.com/212471/"><span>&nbsp;</span>影片首页</a><i>&nbsp;</i></dd> <dd token="Video" _videocount="14"><a href="http://movie.mtime.com/212471/trailer.html"><span>14</span> 个视频</a><i>&nbsp;</i></dd> <dd token="Image" _imagecount="219"><a href="http://movie.mtime.com/212471/posters_and_images/"><span>219</span> 张图片</a><i>&nbsp;</i></dd> <dd token="Person"><a href="http://movie.mtime.com/212471/fullcredits.html"><span>38</span> 位演职员</a><i>&nbsp;</i></dd> <dd token="Review"><a href="http://movie.mtime.com/212471/comment.html"><span property="v:count" content="11190">999+</span> 条影评</a><i>&nbsp;</i></dd> <dd token="RelatedNews"><a href="http://movie.mtime.com/212471/news.html"><span>50</span> 条新闻</a><i>&nbsp;</i></dd> <dt class="more" id="detailMenuRegion"><a href="###">更多<em id="detailMenuRegionLabel">&nbsp;</em></a><i>&nbsp;</i> <dl class="db_nav_sel" id="detailSubMenuRegion" style="display:none"> <dt>&nbsp;</dt> <dd token="Synopsis"><a href="http://movie.mtime.com/212471/plots.html">剧情</a></dd> <dd token="Role" class="false"><a href="###">角色介绍</a></dd> <dd token="Trivia" class="false"><a href="###">幕后揭秘</a></dd> <dd token="Awards" class="false"><a href="###">获奖记录</a></dd> <dd token="Details"><a href="http://movie.mtime.com/212471/details.html">更多资料</a></dd> </dl> </dt> </dl> </div> <div class="db_videocont" id="allvideos"></div><div id="M13_B_DB_Movie_FooterTopTG"></div><script type="text/javascript">var videos = {"预告片":[{"VideoID":51655,"MovieID":212471,"Title":"澳门风云2 
先行版预告片","ShortTitle":"先行版预告片","TitleSamll":"先行版预告片","Description":"","Length":"02:23","HD":1,"ImagePath":"http://img31.mtime.cn/mg/2014/11/27/184214.14086815_235X132X4.jpg","PlayCount":391607,"VideoType":0,"VideoTypeName":"预告片","Url":"http://video.mtime.com/51655/?mid=212471"},{"VideoID":52533,"MovieID":212471,"Title":"澳门风云2 剧场版预告片","ShortTitle":"剧情预告片“娱众不同”","TitleSamll":"剧情预告片“娱..","Description":"","Length":"02:12","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/01/21/173317.88939215_235X132X4.jpg","PlayCount":37601,"VideoType":0,"VideoTypeName":"预告片","Url":"http://video.mtime.com/52533/?mid=212471"},{"VideoID":52715,"MovieID":212471,"Title":"澳门风云2 剧场版预告片2","ShortTitle":"剧场版预告片2","TitleSamll":"剧场版预告片2","Description":"","Length":"01:29","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/04/102313.99825206_235X132X4.jpg","PlayCount":13877,"VideoType":0,"VideoTypeName":"预告片","Url":"http://video.mtime.com/52715/?mid=212471"},{"VideoID":53100,"MovieID":212471,"Title":"澳门风云 制作特辑之机器人PK海陆空","ShortTitle":"制作特辑之机器人PK海陆空","TitleSamll":"制作特辑之机器..","Description":"","Length":"01:39","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/03/06/111143.14920228_235X132X4.jpg","PlayCount":3372,"VideoType":0,"VideoTypeName":"预告片","Url":"http://video.mtime.com/53100/?mid=212471"}],"拍摄花絮":[{"VideoID":52769,"MovieID":212471,"Title":"澳门风云2 制作特辑之“世纪阵容大联欢”","ShortTitle":"制作特辑之“世纪阵容大联欢”","TitleSamll":"制作特辑之“世..","Description":"","Length":"02:40","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/06/163405.84085824_235X132X4.jpg","PlayCount":3075,"VideoType":2,"VideoTypeName":"拍摄花絮","Url":"http://video.mtime.com/52769/?mid=212471"},{"VideoID":52918,"MovieID":212471,"Title":"澳门风云2 
“五代同堂合家欢”特辑","ShortTitle":"“五代同堂合家欢”特辑","TitleSamll":"“五代同堂合家..","Description":"","Length":"01:50","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/16/111822.61107798_235X132X4.jpg","PlayCount":529,"VideoType":2,"VideoTypeName":"拍摄花絮","Url":"http://video.mtime.com/52918/?mid=212471"},{"VideoID":52926,"MovieID":212471,"Title":"澳门风云2 制作特辑之七招过大年","ShortTitle":"制作特辑之七招过大年","TitleSamll":"制作特辑之七招..","Description":"","Length":"03:34","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/16/195152.85132238_235X132X4.jpg","PlayCount":2822,"VideoType":2,"VideoTypeName":"拍摄花絮","Url":"http://video.mtime.com/52926/?mid=212471"},{"VideoID":52929,"MovieID":212471,"Title":"澳门风云 制作特辑之羊年春节七天乐","ShortTitle":"制作特辑之羊年春节七天乐","TitleSamll":"制作特辑之羊年..","Description":"","Length":"03:33","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/17/093038.24212654_235X132X4.jpg","PlayCount":727,"VideoType":2,"VideoTypeName":"拍摄花絮","Url":"http://video.mtime.com/52929/?mid=212471"}],"更多":[{"VideoID":51667,"MovieID":212471,"Title":"澳门风云2 北京发布会","ShortTitle":"北京发布会","TitleSamll":"北京发布会","Description":"","Length":"02:04","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2014/11/27/233841.25950987_235X132X4.jpg","PlayCount":5278,"VideoType":4,"VideoTypeName":"更多","Url":"http://video.mtime.com/51667/?mid=212471"},{"VideoID":52540,"MovieID":212471,"Title":"澳门风云2 北京发布会","ShortTitle":"北京发布会","TitleSamll":"北京发布会","Description":"","Length":"01:35","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/01/21/225742.13087836_235X132X4.jpg","PlayCount":838,"VideoType":4,"VideoTypeName":"更多","Url":"http://video.mtime.com/52540/?mid=212471"},{"VideoID":52782,"MovieID":212471,"Title":"澳门风云2 北京首映式","ShortTitle":"北京首映式","TitleSamll":"北京首映式","Description":"","Length":"02:23","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/09/122417.54628077_235X132X4.jpg","PlayCount":1127,"VideoType":4,"VideoTypeName":"更多","Url":"http://video.mtime.com/52782/?mid=212471"},{"VideoID":52788,"MovieID":212471,"Title":"澳门风云2 
片尾曲MV《财神到》","ShortTitle":"片尾曲MV《财神到》","TitleSamll":"片尾曲MV《财神..","Description":"","Length":"03:02","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/10/084357.52985713_235X132X4.jpg","PlayCount":919,"VideoType":5,"VideoTypeName":"更多","Url":"http://video.mtime.com/52788/?mid=212471"},{"VideoID":52794,"MovieID":212471,"Title":"澳门风云2 片尾曲“财神到”MV","ShortTitle":"片尾曲“财神到”MV","TitleSamll":"片尾曲“财神到..","Description":"","Length":"03:02","HD":0,"ImagePath":"http://img31.mtime.cn/mg/2015/02/10/110509.18710174_235X132X4.jpg","PlayCount":1698,"VideoType":5,"VideoTypeName":"更多","Url":"http://video.mtime.com/52794/?mid=212471"},{"VideoID":52894,"MovieID":212471,"Title":"澳门风云2 主题曲MV《停格》(演唱:蔡健雅)","ShortTitle":"主题曲MV《停格》(演唱:蔡健雅)","TitleSamll":"主题曲MV《停格..","Description":"","Length":"03:50","HD":1,"ImagePath":"http://img31.mtime.cn/mg/2015/02/14/103516.50781151_235X132X4.jpg","PlayCount":3090,"VideoType":5,"VideoTypeName":"更多","Url":"http://video.mtime.com/52894/?mid=212471"}]};</script><script type="text/javascript">if ( typeof(mtimeStufs) == "undefined" ) {mtimeStufs = [];}
mtimeStufs.push( {id:"M13_B_DB_Movie_FooterTopTG",type:"mtime",content:"<div class=\"tc pb15 pt15\" style=\"background:#fff;\">\n<iframe  src=\"http://static1.mtime.cn/tg/2011/2014_movieinfo_footer_banner_1000x90.html\" width=\"1000\" height=\"90\" frameborder=\"0\" border=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" allowtransparency=\"true\"></iframe>\n</div>\n"} );
mtimeStufs.push( {id:"M13_B_DB_Movie_ImageDetailPage_CommentRightTG",type:"mtime",content:"<div>\n<iframe  src=\"http://static1.mtime.cn/tg/2011/2014_movieinfo_picture_right_300x250.html\" width=\"300\" height=\"250\" frameborder=\"0\" border=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" allowtransparency=\"true\"></iframe>\n</div>\n"} );
mtimeStufs.push( {id:"M13_B_DB_Movie_OverviewHotMovieCommentRightTG1",type:"mtime",content:"<div class=\"mb12\">\n<iframe  src=\"http://static1.mtime.cn/tg/2011/2014_movieinfo_usercommentright_square_300x250.html\" width=\"300\" height=\"250\" frameborder=\"0\" border=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" allowtransparency=\"true\"></iframe>\n</div>\n\n"} );</script> <style type="text/css">#M13_B_DB_Movie_OverviewHotMovieCommentRightTG2{background:#fff;margin-bottom:-4000px;padding-bottom:4000px;}
.db_cont .db_inews { width:290px; float:left; display:inline; padding-right:26px; border-right:1px solid #ccc; }
.db_cont .db_inews dt { border-bottom:1px dotted #dcdcdc; font-size:12px; color:#666; line-height:1.6em; padding-bottom:10px; margin-bottom:3px; margin-top:10px; }
.db_cont .db_inews .imgbox { position:relative; zoom:1; overflow:hidden; }
.db_cont .db_inews .imgbox div { position:absolute; left:0; top:95px; overflow:hidden; padding:7px 12px; margin-right:30px; }
.db_cont .db_inews .imgbox div .bg { background:#fff; position:absolute; left:0; top:0; width:100%; height:100px; opacity:.7; filter:alpha(opacity=70); }
.db_cont .db_inews .imgbox h3 { position:relative; font-size:18px; line-height:1.4em; }
.db_cont .db_inews .imgbox a { color:#000; text-decoration:none; }
.db_cont .db_inews dd { font-size:12px; color:#666; padding-top:6px; }
#externalVideo{ display:none;}
.storeboxer li{height:215px;}
.db_headnews a, .db_headnews{z-index:10;position:relative;opacity:0;}
.db_headnews a{font-size:0; line-height:0;}</style><div id="bottom"></div> <script type="text/javascript">document.write(unescape("%3Cscript src='" + jsServer + "/js/systemall2014.js' type='text/javascript'%3E%3C/script%3E"));</script> <script type="text/javascript">document.write(unescape("%3Cscript src='" + subJsServer + "/js/moviepagepack.js' type='text/javascript'%3E%3C/script%3E"));</script> <script type="text/javascript">//页尾 导航//  静态文件初始化类new StaticManager({});</script> <div style="display: none"> <script type="text/javascript">var tracker = new Tracker();
tracker.trackPageView();</script> </div> <script type="text/javascript">window.moviePageBaseClient = new MoviePageBaseClient({ id: 212471, initializeMovieNavigationToken: "Video" });</script> <script type="text/javascript">$loadSubJs("/movie/MovieVideosPage.js", function () {            new MovieVideosPage();        });</script> <!--Generated at 2016-2-19 10:33:06 by Mtime Staticize Service.--></body>
</html>
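
Locating the JSON that follows var videos = in the source above can also be done programmatically. The sketch below is one possible way, not the method used in this post's parser: it finds the assignment with a regular expression and then walks forward counting braces, skipping over string literals, until the opening brace is closed. The class name VideosJsonExtractor and the method extractObjectLiteral are hypothetical, and the sample input uses ASCII keys in place of the Chinese category names on the real page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original crawler.
final class VideosJsonExtractor {

    /**
     * Returns the {...} literal assigned to "var <varName> =" in the given text,
     * or null if the assignment is missing or the braces never balance.
     */
    static String extractObjectLiteral(String html, String varName) {
        Matcher m = Pattern
                .compile("\\bvar\\s+" + Pattern.quote(varName) + "\\s*=\\s*")
                .matcher(html);
        if (!m.find()) {
            return null;
        }
        int start = m.end();
        if (start >= html.length() || html.charAt(start) != '{') {
            return null; // the variable is not assigned an object literal
        }
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = start; i < html.length(); i++) {
            char c = html.charAt(i);
            if (inString) {
                // Inside a string literal braces do not count; track \" escapes.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return html.substring(start, i + 1);
                }
            }
        }
        return null; // ran out of input before the literal was closed
    }

    public static void main(String[] args) {
        String html = "<script>var debug = false;var videos = "
                + "{\"trailers\":[{\"VideoID\":51655,\"ShortTitle\":\"teaser\"}],\"more\":[]};"
                + "</script>";
        System.out.println(extractObjectLiteral(html, "videos"));
    }
}
```

With the whole object in hand, the text can be pasted into a validator or handed to a JSON library, instead of relying on which category key happens to come next in the page.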

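Online JSON validator

Paste this JSON into http://json.cn/ to check that it is well-formed. In this case it validates cleanly.

Two parsing approaches

Next we parse the JSON in the page. The parser below offers two approaches: regular expressions and fastjson. I recommend fastjson, which is quick to write and efficient.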

package parse;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import model.JsonModel;
import model.MtimeModel;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.jsoup.nodes.Document;

import com.alibaba.fastjson.JSON;

/* Hefei University of Technology, School of Management, qianyang 1563178220@qq.com */
public class MtimeParse {
    static final Log logger = LogFactory.getLog(MtimeParse.class);

    public static List<MtimeModel> getData(Document doc) {
        List<MtimeModel> mtimeData = new ArrayList<MtimeModel>();
        // The HTML to be parsed
        String html = doc.html();
        // System.out.println(html);

        // Approach 1: regular expressions.
        // Capture the JSON array under "预告片" (trailers) only,
        // stopping before "拍摄花絮" (behind-the-scenes clips).
        Pattern data = Pattern.compile("预告片\":(.*?)\\,\"拍摄花絮");
        Matcher dataMatcher = data.matcher(html);
        String da = "";
        while (dataMatcher.find()) {
            // The JSON string to be parsed
            da = dataMatcher.group(1);
        }
        // Use jsoup to read the movie id: the digits in the movie link
        String movieNumber = doc.select("h1[property=v:itemreviewed]").select("a")
                .attr("href").replaceAll("\\D", "").trim();
        String movieId = "mtime" + movieNumber;
        // Regex for VideoID (trailer id); the capture still contains ':' and ',',
        // which are stripped with replaceAll("\\D", "") below
        Pattern videoID = Pattern.compile("VideoID\"(.*?)\"");
        // Regex for ShortTitle (trailer title)
        Pattern titlePattern = Pattern.compile("ShortTitle\":\"(.*?)\"");
        Matcher videoIDMatcher = videoID.matcher(da);
        Matcher titleMatcher = titlePattern.matcher(da);
        ArrayList<String> urldatas = new ArrayList<String>();
        while (videoIDMatcher.find()) {
            urldatas.add(videoIDMatcher.group(1));
        }
        ArrayList<String> titles = new ArrayList<String>();
        while (titleMatcher.find()) {
            titles.add(titleMatcher.group(1));
        }
        // Guard against the two lists having different lengths
        int count = Math.min(urldatas.size(), titles.size());
        for (int i = 0; i < count; i++) {
            MtimeModel mtimeModel = new MtimeModel();
            String videoNumber = urldatas.get(i).replaceAll("\\D", "").trim();
            String prmovieId = "mtime" + videoNumber;
            String url = "http://video.mtime.com/" + videoNumber + "/?mid=" + movieNumber;
            String title = titles.get(i);
            mtimeModel.setPrmovieId(prmovieId);
            mtimeModel.setUrl(url);
            mtimeModel.setMovieId(movieId);
            mtimeModel.setTitle(title);
            mtimeData.add(mtimeModel);
        }

        // Approach 2: fastjson test, trailers only.
        // The results are logged; the method still returns the regex results.
        List<JsonModel> mtimeJsonData = new ArrayList<JsonModel>();
        Pattern data1 = Pattern.compile("预告片\":(.*?)\\,(\"拍摄花絮|\"精彩片段)");
        Matcher dataMatcher1 = data1.matcher(html);
        String da1 = "";
        while (dataMatcher1.find()) {
            // The JSON string to be parsed
            da1 = dataMatcher1.group(1);
        }
        if (da1.length() != 0) {
            List<JsonModel> jsonmodel1 = JSON.parseArray(da1, JsonModel.class);
            for (JsonModel jso : jsonmodel1) {
                JsonModel mtimeModel = new JsonModel();
                String videoId = "mtime" + jso.getVideoID();
                String movieIdJson = "mtime" + jso.getMovieID();
                String shortTitle = jso.getShortTitle();
                // Fixed: the original omitted '=' after "mid"
                String url = "http://video.mtime.com/" + jso.getVideoID() + "/?mid=" + jso.getMovieID();
                mtimeModel.setPrmovieId(videoId);
                mtimeModel.setUrl(url);
                mtimeModel.setMovieID(movieIdJson);
                mtimeModel.setShortTitle(shortTitle);
                mtimeJsonData.add(mtimeModel);
                logger.info("VideoID:  " + videoId + "  MovieID:" + movieIdJson
                        + "  ShortTitle:" + shortTitle + "  url:" + url);
            }
        }
        return mtimeData;
    }
}
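
The regular-expression approach above collects VideoID and ShortTitle with two independent matchers and then pairs them by position, which can drift out of step if one object lacks a field. As a variation (my own sketch, not the code used above), a single pattern can capture both fields from the same object. PairedFieldRegexSketch and extract are hypothetical names, the sample uses ASCII titles, and the pattern assumes VideoID appears before ShortTitle in each object with no braces inside the values in between, as in the page data shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original crawler.
final class PairedFieldRegexSketch {

    // Group 1: the numeric VideoID. Group 2: the raw ShortTitle text
    // (escape sequences such as \" are kept as written, not decoded).
    // [^{}]*? stops the match from running into a neighbouring object.
    private static final Pattern ENTRY = Pattern.compile(
            "\"VideoID\":(\\d+)[^{}]*?\"ShortTitle\":\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns one {videoId, shortTitle} pair per object that has both fields. */
    static List<String[]> extract(String jsonArray) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = ENTRY.matcher(jsonArray);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String sample = "[{\"VideoID\":51655,\"MovieID\":212471,\"Title\":\"T1\",\"ShortTitle\":\"First\"},"
                + "{\"VideoID\":52533,\"MovieID\":212471,\"Title\":\"T2\",\"ShortTitle\":\"Second\"}]";
        for (String[] pair : extract(sample)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

An object without a ShortTitle is skipped rather than being paired with the next object's title. Even so, this remains pattern matching on text; a JSON library such as fastjson is still the more dependable choice.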

Program output
