Java HttpClient scraping: using Java HttpClient to interact with a web server - elliott - 博客园
The following program attempts to log in to the system automatically:
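import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.methods.PostMethod;
// Uses Apache Commons HttpClient 3.x; the enclosing class declaration is omitted.
public static void main(String[] args) throws Exception{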
HttpClient client = new HttpClient();
client.getHostConfiguration().setHost("127.0.0.1", 80,"http");
client.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
PostMethod post = new PostMethod("/login/login.asp");
NameValuePair name = new NameValuePair("loginname",
"kingseo@163.com");
NameValuePair pass = new NameValuePair("password", "123456");
NameValuePair int_count = new NameValuePair("int_count", "999");
// The field name is "bkurl"; the "=" separator is added when the body is encoded.
NameValuePair bkurl = new NameValuePair("bkurl", "");
post.setRequestHeader("Referer",
"http://127.0.0.1/login/login.asp");
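// The header name is "Content-Type", with a hyphen.
post.setRequestHeader("Content-Type",
"application/x-www-form-urlencoded");
// Content-Length is not set by hand: HttpClient computes it from the encoded body,
// and a hard-coded 62 no longer matches once '@' is sent as %40.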
post.setRequestHeader("Cookie",
"JSsUserInfo=382C3D7551694479056D1D79446C58754A345C2C4E755D694E79786D6279486C02751034032C1E751F6910795B6D6C79446C5F754A34432C1A75036918790F6D79793B6C54754234232C377551694E79606D6179486C52753C34202C447507691C795A6D4B79066C0C7516342D2C48755A6946791A6D4F791A6C047542343E2C2D75516944790F6D6D79216C54754C34402C487555695579056D19794F6C5B754834562C3A7528694879046D1779206C3D7544345C2C427539693479096D6679276C587548345C2C48755F694479036D1D79436C52753834212C44755C694E798;");
post.setRequestBody(new
NameValuePair[]{name,pass,int_count,bkurl});
int statuscode = client.executeMethod(post);
if ((statuscode == HttpStatus.SC_MOVED_TEMPORARILY) ||
(statuscode == HttpStatus.SC_MOVED_PERMANENTLY) ||
(statuscode == HttpStatus.SC_SEE_OTHER) ||
(statuscode == HttpStatus.SC_TEMPORARY_REDIRECT)){
Header header = post.getResponseHeader("location");
if (header != null) {
String newurl = header.getValue();
if ((newurl == null) || (newurl.equals("")))
newurl = "/";
GetMethod get = new GetMethod(newurl);
client.executeMethod(get);
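// The server declares Charset=utf-8, so decode explicitly instead of using the platform default.
BufferedReader bf = new BufferedReader(new
InputStreamReader(get.getResponseBodyAsStream(), "UTF-8"));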
String s = null;
while ((s=bf.readLine()) != null){
System.out.println(s);
}
get.releaseConnection();
} else{
System.out.println("Invalid redirect");
}
}else{
System.out.println("ok");
}
// Release the POST connection on both paths, not only after a redirect.
post.releaseConnection();
}
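As a side check on the request body, the sketch below uses only the JDK (not HttpClient) to URL-encode the four form fields from the program above. The class and method names (`FormBodySketch`, `encode`) are made up for illustration. It suggests that once `@` is percent-encoded the body is 64 bytes rather than the 62 the browser reported, which is why hard-coding Content-Length is risky.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HttpClient: builds an
// application/x-www-form-urlencoded body from name/value pairs.
class FormBodySketch {
    // URL-encode one name or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Join the fields as name=value pairs separated by '&', in insertion order.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("loginname", "kingseo@163.com");
        fields.put("password", "123456");
        fields.put("int_count", "999");
        fields.put("bkurl", "");
        String body = encode(fields);
        System.out.println(body);
        System.out.println(body.length() + " bytes");
    }
}
```

The field name passed for the last pair is `bkurl`, without a trailing `=`; the separator is produced by the encoder.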
The following information was captured with ieHTTPHeaders:
POST /login/login.asp?DYWE=1210926948718.119132.1211187954.1211188187.8 HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/x-shockwave-flash, application/vnd.ms-excel,
application/vnd.ms-powerpoint, application/msword, */*
Referer: http://127.0.0.1/login/login.asp
Accept-Language: zh-cn
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: 127.0.0.1
Content-Length: 62
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: strloginusertype=4;
__zpWAM=1210926948718.119132.1211187954.1211188187.8;
firstchannelurl=http%3A//127.0.0.1/login/login.asp%3FBkUrl%3D%252Fmyzhaopin%252Fresume%255Fnav%252Easp%253FDYWE%253D1210926948718%252E119132%252E1210927448%252E1211158566%252E3%2526firstRef%253D%252D;
lastchannelurl=; JSShowname=kingseo%40163%2Ecom;
JSloginnamecookieindex=kingseo%40163%2Ecom; myzl111113171=0;
__zpWAMs1=1; __zpWAMs2=1

loginname=kingseo@163.com&password=123456&int_count=999&bkurl=

HTTP/1.0 302 Moved Temporarily
Date: Mon, 19 May 2008 09:14:22 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Location: /person/resume_index.asp?t=62062.7
Content-Length: 158
Content-Type: text/html; Charset=utf-8
Expires: Mon, 19 May 2008 09:14:22 GMT
Set-Cookie: JSpUserInfo=; domain=127.0.0.1; path=/
Set-Cookie:
JSsUserInfo=342C713654365A611D350769076B502C063658365C611D350D697A6B2F2C08360236026142355169456B042C5A3629365A611A350569186B022C5A3604365061793578690B6B5A2C7B3627365661173562697B6B5C2C0E362C36266111355D695F6B0F2C52361A360E6143357669076B572C06364736086143355B690D6B322C613654365A6117357769626B5C2C003644365A6115351669076B562C0F3658365F6117357569726B5C2C053652363E6178350B69076B5A2C60362836566166356469076B502C0436583658611D350169076B572C0E3628362761113506690D6B9;
domain=127.0.0.1; path=/
Set-Cookie: JSShowname=kingseo%40163%2Ecom; expires=Tue, 31-Dec-2019
16:00:00 GMT; domain=127.0.0.1; path=/
Set-Cookie: JSloginnamecookieindex=kingseo%40163%2Ecom; expires=Tue,
31-Dec-2019 16:00:00 GMT; domain=127.0.0.1; path=/
Set-Cookie: strloginusertype=1; domain=127.0.0.1; path=/
Cache-Control: private
X-Cache: MISS from web-s57.127.0.0.1
X-Cache-Lookup: MISS from web-s57.127.0.0.1:80
Connection: keep-alive

GET /person/resume_index.asp?t=62062.7 HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/x-shockwave-flash, application/vnd.ms-excel,
application/vnd.ms-powerpoint, application/msword, */*
Referer: http://127.0.0.1/login/login.asp
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: 127.0.0.1
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: strloginusertype=1;
__zpWAM=1210926948718.119132.1211187954.1211188187.8;
firstchannelurl=http%3A//127.0.0.1/login/login.asp%3FBkUrl%3D%252Fmyzhaopin%252Fresume%255Fnav%252Easp%253FDYWE%253D1210926948718%252E119132%252E1210927448%252E1211158566%252E3%2526firstRef%253D%252D;
lastchannelurl=; JSShowname=kingseo%40163%2Ecom;
JSloginnamecookieindex=kingseo%40163%2Ecom; myzl111113171=0;
__zpWAMs1=1; __zpWAMs2=1; JSpUserInfo=;
JSsUserInfo=342C713654365A611D350769076B502C063658365C611D350D697A6B2F2C08360236026142355169456B042C5A3629365A611A350569186B022C5A3604365061793578690B6B5A2C7B3627365661173562697B6B5C2C0E362C36266111355D695F6B0F2C52361A360E6143357669076B572C06364736086143355B690D6B322C613654365A6117357769626B5C2C003644365A6115351669076B562C0F3658365F6117357569726B5C2C053652363E6178350B69076B5A2C60362836566166356469076B502C0436583658611D350169076B572C0E3628362761113506690D6B9

HTTP/1.0 200 OK
Date: Mon, 19 May 2008 09:14:22 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Type: text/html; Charset=utf-8
Set-Cookie:
JSsUserInfo=347320664E73566A466559614264577340675B77526850735F663F73296A4A6503611A6408731467197700680E7324664273516A44654661106409731E67517730682F7359664873296A3965556148642473376757775E6823732A664E735C6A236525614E645D733667277758680A730D661D73006A04650D611C64267342675C7756684F7307661C730A6A4C653B6127645B734267517724683573596646734A6A466551615364577344675077546855735F663773236A4A655361376428734E67517729682C7359664873246A3365556143645D7332672677586851735F662673266A4A6522612164577342675B775468527355664473566A41655361306427734E6751772168207359664873326A2365556142645D732;
domain=127.0.0.1; path=/
Set-Cookie: monitorlogin=Y; path=/
Set-Cookie: strloginusertype=4; expires=Tue, 31-Dec-2019 16:00:00 GMT;
domain=127.0.0.1; path=/
Cache-Control: private
X-Cache: MISS from web-s46.127.0.0.1
X-Cache-Lookup: MISS from web-s46.127.0.0.1:80
Via: 1.0 web-s46.127.0.0.1 (squid/3.0.STABLE4)
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 3477
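
The capture shows two things a client has to handle after the POST: a relative `Location` header, and several `Set-Cookie` headers whose values must be sent back on the follow-up GET. Below is a minimal JDK-only sketch of both steps, using values copied from the capture. The class name `RedirectSketch` and its methods are hypothetical, and this is not a description of how HttpClient works internally.

```java
import java.net.HttpCookie;
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
class RedirectSketch {
    // Resolve a possibly relative Location value against the URL that was requested.
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    // Turn Set-Cookie header values into the value of a Cookie request header.
    // Simplification: domain, path and expiry attributes are parsed but ignored.
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            for (HttpCookie cookie : HttpCookie.parse(value)) {
                pairs.add(cookie.getName() + "=" + cookie.getValue());
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        System.out.println(resolveLocation(
                "http://127.0.0.1/login/login.asp",
                "/person/resume_index.asp?t=62062.7"));
        System.out.println(toCookieHeader(Arrays.asList(
                "strloginusertype=1; domain=127.0.0.1; path=/",
                "monitorlogin=Y; path=/")));
    }
}
```

A real cookie store would also match domain and path and drop expired cookies before building the header; with `CookiePolicy.BROWSER_COMPATIBILITY` the program above leaves that bookkeeping to HttpClient.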