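1. Target website

2. Running into the __VIEWSTATE and __EVENTVALIDATION parameters

3. What the parameters mean and whether they are required

4. How different __EVENTTARGET values lead to different approaches

4.1 First approach: clicking "next page", PageNavigator1$LnkBtnNext

4.2 Second approach: jumping to a page, PageNavigator1$LnkBtnGoto

5. Analyzing the __VIEWSTATE and __EVENTVALIDATION parameters

6. Full code

6.1 Using the "next page" approach

Output:

6.2 Using the "jump to page N" approach (ordinary POST paging with page numbers)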


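1. Target website

The page below is an example of a site whose requests carry the __VIEWSTATE and __EVENTVALIDATION parameters. It is a commercial housing information list: http://180.141.32.142/htmlaspx/TMSFW-CZ/HPMS/SPFInfoList.aspx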

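2. Running into the __VIEWSTATE and __EVENTVALIDATION parameters

I have run into this kind of request several times now. The POST parameters look like this:

[Screenshot: POST form parameters of the request]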

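3. What the parameters mean and whether they are required

The two key parameters are __VIEWSTATE and __EVENTVALIDATION. Both must be included in every POST request.

There is also an __EVENTTARGET parameter. Some blog posts say further parameters are needed, but in the cases I have run into, these three were enough. To be safe you can send the remaining form fields as well, since their values do not change.

__EVENTTARGET determines how paging is done. There are generally two ways.

For the crawler, the value PageNavigator1$LnkBtnNext means paging by clicking "next page". The other way is clicking a specific page number or jumping to a given page, in which case the value is PageNavigator1$LnkBtnGoto.

The value sent when clicking "next page":

[Screenshot: request parameters after clicking "next page"]

The corresponding value is: PageNavigator1$LnkBtnNext

The value sent when jumping to a given page:

[Screenshot: request parameters after jumping to a page]

The corresponding value is: PageNavigator1$LnkBtnGoto

In this case you also have to send the parameter that says which page to jump to, PageNavigator1$txtNewPageIndex.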
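The two request shapes described above can be put side by side in a small helper. This is only a sketch: the function name `build_postback_payload` and the dummy token values are made up for illustration, while the field names are the ones seen on this site.

```python
def build_postback_payload(viewstate, eventvalidation, page=None):
    """Build the POST form data for the pager.

    page=None  -> "next page" postback (three parameters only)
    page=N     -> "jump to page N" postback (adds the page index field)
    """
    target = "PageNavigator1$LnkBtnNext" if page is None else "PageNavigator1$LnkBtnGoto"
    payload = {
        "__EVENTTARGET": target,
        "__VIEWSTATE": viewstate,
        "__EVENTVALIDATION": eventvalidation,
    }
    if page is not None:
        # Form values are sent as strings.
        payload["PageNavigator1$txtNewPageIndex"] = str(page)
    return payload


# Dummy token values, for illustration only.
next_payload = build_postback_payload("VS", "EV")
goto_payload = build_postback_payload("VS", "EV", page=5)
print(next_payload)
print(goto_payload)
```

Keeping the two modes in one function makes it harder to mix them up, for example by sending the page index together with the "next page" target.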

4. How different __EVENTTARGET values lead to different approaches

Start with the value of __EVENTTARGET. As mentioned above, there are two ways to handle it.

4.1 First approach: clicking "next page", PageNavigator1$LnkBtnNext

That is, __EVENTTARGET has the value PageNavigator1$LnkBtnNext.

If you choose the "next page" approach, you have to request the pages one after another in order, and you only need to send these three parameters: __EVENTTARGET, __VIEWSTATE and __EVENTVALIDATION.

Ignore the jump-to-page parameter PageNavigator1$txtNewPageIndex.

The request parameters then look like the following. The __VIEWSTATE and __EVENTVALIDATION values can be copied from the browser (they are several thousand characters long and are truncated here):

payload = {
    "__EVENTTARGET": "PageNavigator1$LnkBtnNext",
    "__VIEWSTATE": "zOm7apMaa5ad6gjX5DhqwY5EwCwNTqMkpbtpcBkjYJHF...",
    "__EVENTVALIDATION": "eT4WmyJjgTkHGjSisJKNF+X+cGGK5uwbsBnK4m9+...",
}

4.2 Second approach: jumping to a page, PageNavigator1$LnkBtnGoto

That is, __EVENTTARGET has the value PageNavigator1$LnkBtnGoto.

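5. Analyzing the __VIEWSTATE and __EVENTVALIDATION parameters

Next, look at the __VIEWSTATE and __EVENTVALIDATION parameters.

Open the developer tools on page 1 and inspect the response. Both parameters are already present in the HTML source of the first page:

[Screenshot: __VIEWSTATE and __EVENTVALIDATION hidden inputs in the page source]

If you request the main URL (the same one the POST goes to) without any parameters, you get the data for page 1, regardless of whether you use GET or POST. So page 1 can be requested with no form data at all, although it is best to send headers so you are less likely to get blocked. The response is always page 1, and the values we need can then be taken out of it with XPath: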

VIEWSTATE = resp.xpath("//input[@name='__VIEWSTATE']/@value").extract_first()
EVENTVALIDATION = resp.xpath("//input[@name='__EVENTVALIDATION']/@value").extract_first()
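The same extraction can be tried outside Scrapy. The sketch below uses only the standard library's `html.parser`. The `HiddenFieldParser` class, the `extract_hidden_fields` helper and the sample HTML are made up for illustration. The real page's markup may differ, but ASP.NET Web Forms pages normally carry these tokens as hidden `<input>` elements.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<input ... />) are also routed here by default.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Made-up sample markup with dummy token values.
sample_html = """
<form method="post" action="SPFInfoList.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dummy-viewstate" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="dummy-validation" />
  <input type="text" name="PageNavigator1$txtNewPageIndex" value="1" />
</form>
"""

tokens = extract_hidden_fields(sample_html)
print(tokens["__VIEWSTATE"], tokens["__EVENTVALIDATION"])
```

Collecting every hidden field rather than just the two named ones also covers the "add the other parameters to be safe" advice from section 3.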

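6. Full code

6.1 Using the "next page" approach

Note: this Scrapy spider uses inline requests. I plan to write a separate post about inline requests later.

Scrapy itself is an asynchronous framework, which is why inline requests are used here. They come from the inline_requests module.

The documentation for reference:

Installation, Scrapy Inline Requests 0.3.1 documentation: https://scrapy-inline-requests.readthedocs.io/en/stable/installation.html

Why use inline requests? Without them, Scrapy sends requests asynchronously and in no guaranteed order, so the "next page" approach cannot work.

In Scrapy, POST requests are usually issued by overriding start_requests(). All the requests built there play the same role as the start_urls list, so they are all scheduled right at the start. But each request needs parameters parsed from the previous page's response, and there is no way to match them up, so the next page can never be fetched. That is why inline requests are needed.

Make sure the __EVENTTARGET value in the request matches the approach you are using. A "next page" request must not include the jump-to-page parameter, otherwise you will keep getting the data for page 1.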
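Why the "next page" approach has to be sequential can be seen in a toy model. The sketch below does not talk to the real site. `fake_server` is a made-up stand-in which assumes that the token returned with each page encodes that page's number, and that a "next" postback advances from the page encoded in the submitted token. The real ASP.NET tokens are opaque, so this is an illustration of the dependency, not of the actual format.

```python
NEXT = "PageNavigator1$LnkBtnNext"


def fake_server(form=None):
    """Toy stand-in for the site: returns a page number and a fresh token."""
    if form and form.get("__EVENTTARGET") == NEXT:
        # "Next" is relative to the page encoded in the submitted token.
        page = int(form["__VIEWSTATE"].split(":")[1]) + 1
    else:
        # No form data (or an unknown target): always page 1.
        page = 1
    return {"page": page, "__VIEWSTATE": f"state:{page}"}


def crawl_sequentially(pages):
    """Feed each response's token into the next request."""
    resp = fake_server()
    seen = [resp["page"]]
    for _ in range(pages - 1):
        resp = fake_server({"__EVENTTARGET": NEXT, "__VIEWSTATE": resp["__VIEWSTATE"]})
        seen.append(resp["page"])
    return seen


def crawl_with_stale_token(pages):
    """Build every request up front from page 1's token."""
    first = fake_server()
    seen = [first["page"]]
    for _ in range(pages - 1):
        resp = fake_server({"__EVENTTARGET": NEXT, "__VIEWSTATE": first["__VIEWSTATE"]})
        seen.append(resp["page"])
    return seen


print(crawl_sequentially(4))      # [1, 2, 3, 4]
print(crawl_with_stale_token(4))  # [1, 2, 2, 2]
```

In this model, building all requests up front from page 1's token never gets past page 2, which is the situation that waiting for each response before sending the next request avoids.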

import scrapy
from inline_requests import inline_requests


class GuangxiCzXmlSpider(scrapy.Spider):
    """Guangxi Province, Chongzuo City: project five-certificate info and project info"""
    name = 'Guangxi_Cz_All'
    allowed_domains = ['180.141.32.142']
    start_urls = ['http://180.141.32.142/htmlaspx/TMSFW-CZ/HPMS/SPFInfoList.aspx']
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "180.141.32.142",
    }
    payload = {
        "__EVENTTARGET": "PageNavigator1$LnkBtnNext",
        # Initial values copied from the browser (truncated here); both are
        # overwritten below with the values parsed from each response.
        "__VIEWSTATE": "zOm7apMaa5ad6gjX5DhqwY5EwCwNTqMkpbtpcBkjYJHF...",
        "__EVENTVALIDATION": "eT4WmyJjgTkHGjSisJKNF+X+cGGK5uwbsBnK4m9+...",
    }

    @inline_requests
    def parse(self, response, **kwargs):
        for i in range(1, 9):
            if i == 1:
                # Page 1 needs no form data; a plain GET returns it.
                resp = yield scrapy.Request(url=self.start_urls[0], dont_filter=True, headers=self.headers)
            else:
                # Later pages: POST the tokens taken from the previous page.
                resp = yield scrapy.FormRequest(url=self.start_urls[0], dont_filter=True,
                                                formdata=self.payload, headers=self.headers)
            # Carry this page's tokens over to the next request.
            VIEWSTATE = resp.xpath("//input[@name='__VIEWSTATE']/@value").extract_first()
            EVENTVALIDATION = resp.xpath("//input[@name='__EVENTVALIDATION']/@value").extract_first()
            self.payload['__VIEWSTATE'] = VIEWSTATE
            self.payload['__EVENTVALIDATION'] = EVENTVALIDATION
            pro_name = resp.xpath("//div[@class='resultlist']//table//tr[position()>2]//td[2]//text()").extract()
            print(pro_name)
            print("*" * 60)

Output: [screenshot]

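6.2 Using the "jump to page N" approach (ordinary POST paging with page numbers)

Because every request carries an explicit page number, inline requests are not needed here.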

import scrapy


class GuangxiCzXmlSpider(scrapy.Spider):
    """Guangxi Province, Chongzuo City: project five-certificate info and project info"""
    name = 'Guangxi_Cz_All'
    allowed_domains = ['180.141.32.142']
    start_urls = ['http://180.141.32.142/htmlaspx/TMSFW-CZ/HPMS/SPFInfoList.aspx']
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "180.141.32.142",
    }
    payload = {
        "__EVENTTARGET": "PageNavigator1$LnkBtnGoto",
        # Paste the full values copied from the browser (truncated here).
        "__VIEWSTATE": "zOm7apMaa5ad6gjX5DhqwY5EwCwNTqMkpbtpcBkjYJHF...",
        "__EVENTVALIDATION": "eT4WmyJjgTkHGjSisJKNF+X+cGGK5uwbsBnK4m9+...",
        "PageNavigator1$txtNewPageIndex": "1",
    }

    def start_requests(self):
        for i in range(1, 9):
            if i == 1:
                # Page 1 needs no form data; a plain GET returns it.
                yield scrapy.Request(url=self.start_urls[0], dont_filter=True, headers=self.headers,
                                     callback=self.parse)
            else:
                # Every other page is requested directly by its page number.
                self.payload['PageNavigator1$txtNewPageIndex'] = str(i)
                yield scrapy.FormRequest(url=self.start_urls[0], dont_filter=True, headers=self.headers,
                                         callback=self.parse, formdata=self.payload)

    def parse(self, response, **kwargs):
        pro_name = response.xpath("//div[@class='resultlist']//table//tr[position()>2]//td[2]//text()").extract()
        print(pro_name)
        print("*" * 60)

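Output:

The printed output also shows that the pages arrive in no fixed order, because the Scrapy engine schedules the requests asynchronously.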
