HTTP GET request

# Fetch a web page with a GET request and save it as an HTML file
from urllib import request, parse
url = 'http://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&tn=baidu&wd='
keyword = '你好'
# Percent-encode the Chinese keyword (quote is documented in urllib.parse)
keyword_code = parse.quote(keyword)
url_all = url + keyword_code
# Build the Request object
req = request.Request(url_all)
# Send the GET request; urlopen returns a file-like object
data = request.urlopen(req).read()
# Write the raw bytes to disk; the with block closes the file automatically
with open(r'D:/pythoncode/crawler/1.html', 'wb') as fhandle:
    fhandle.write(data)
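
To see what the encoding step does without touching the network, here is a minimal sketch. The helper name build_search_url is made up for illustration and is not part of urllib. It uses urllib.parse.quote, which is where the quote function is documented.

```python
from urllib import parse

# Same search URL prefix as in the script above
BASE_URL = 'http://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&tn=baidu&wd='


def build_search_url(keyword, base=BASE_URL):
    """Hypothetical helper: append the percent-encoded keyword to base."""
    return base + parse.quote(keyword)


# quote() encodes the keyword as UTF-8 and escapes each byte as %XX
encoded = parse.quote('你好')
print(encoded)                    # %E4%BD%A0%E5%A5%BD
print(parse.unquote(encoded))     # 你好
print(build_search_url('你好'))
```

Each Chinese character takes three bytes in UTF-8, so the two-character keyword becomes six %XX escapes in the final URL.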

HTTP POST request

Used for operations such as registration and login.

from urllib import request, parse
url = 'http://www.iqianyue.com/mypost/'
# urlencode converts a mapping object or a sequence of two-element tuples, which may contain
# str or bytes objects, to a percent-encoded ASCII text string. If the resultant string is to be
# used as data for a POST operation with the urlopen() function, it should be encoded to bytes,
# otherwise it would result in a TypeError.
postdata = parse.urlencode({'name': 'ceo@iqianyue.com', 'password': 'aA123456'}).encode('utf-8')
# Build a Request object that carries postdata
req = request.Request(url, postdata)
# Add a User-Agent header so the request looks like it comes from a browser
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36')
data = request.urlopen(req).read()
# Save the response body; the with block closes the file automatically
with open(r'D:\pythoncode\crawler\2.html', 'wb') as fh:
    fh.write(data)
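
The sketch below inspects the Request object before anything is sent, which can help when a POST does not behave as expected. The helper name build_post_request and the shortened User-Agent string are assumptions for illustration. Nothing here opens a network connection.

```python
from urllib import parse, request


def build_post_request(url, fields, user_agent):
    """Hypothetical helper: build (but do not send) a POST Request."""
    # urlencode yields a str; urlopen needs bytes for the request body
    body = parse.urlencode(fields).encode('utf-8')
    req = request.Request(url, body)
    req.add_header('User-Agent', user_agent)
    return req


req = build_post_request(
    'http://www.iqianyue.com/mypost/',
    {'name': 'ceo@iqianyue.com', 'password': 'aA123456'},
    'Mozilla/5.0',  # shortened placeholder, not a full browser string
)
print(req.get_method())  # POST
print(req.data)          # b'name=ceo%40iqianyue.com&password=aA123456'
```

Passing a data argument is what switches the Request from GET to POST. Without it, get_method() returns GET.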
