Official Scrapyd documentation: https://scrapyd.readthedocs.io/en/latest/api.html


API

The following sections describe the resources available in the Scrapyd JSON API.

1. daemonstatus.json

To check the load status of a service.

  • Supported Request Methods: GET

Example request:

$ curl http://localhost:6800/daemonstatus.json

Example response:

{ "status": "ok", "running": "0", "pending": "0", "finished": "0", "node_name": "node-name" }

2. addversion.json

Add a version to a project, creating the project if it doesn’t exist.

  • Supported Request Methods: POST
  • Parameters:
    • project (string, required) - the project name
    • version (string, required) - the project version
    • egg (file, required) - a Python egg containing the project’s code

Example request:

$ curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg

Example response:

{"status": "ok", "spiders": 3}

Note
Scrapyd uses the distutils LooseVersion to interpret the version numbers you provide.

The latest version for a project will be used by default whenever necessary.

schedule.json and listspiders.json allow you to explicitly set the desired project version.
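
For deployment scripts, the same upload can be done from Python with a multipart POST. A minimal sketch using the requests library, assuming a pre-built myproject.egg sits in the working directory:

import requests

# Upload an egg; project and version go as form fields, the egg as a file part.
with open("myproject.egg", "rb") as egg:
    resp = requests.post(
        "http://localhost:6800/addversion.json",
        data={"project": "myproject", "version": "r23"},
        files={"egg": egg},
    )
print(resp.json())  # e.g. {"status": "ok", "spiders": 3}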

3. schedule.json

Schedule a spider run (also known as a job), returning the job id.

  • Supported Request Methods: POST
  • Parameters:
    • project (string, required) - the project name
    • spider (string, required) - the spider name
    • setting (string, optional) - a Scrapy setting to use when running the spider
    • jobid (string, optional) - a job id used to identify the job, overrides the default generated UUID
    • _version (string, optional) - the version of the project to use
    • any other parameter is passed as spider argument

Example request:

$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider

Example response:

{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}

Example request passing a spider argument (arg1) and a setting (DOWNLOAD_DELAY):

$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1

Note
Spiders scheduled with Scrapyd should allow for an arbitrary number of keyword arguments, as Scrapyd sends internally generated spider arguments to the spider being scheduled.
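
In practice this means a spider's __init__ must tolerate unknown keyword arguments. A minimal sketch (the spider name and the arg1 attribute are illustrative):

import scrapy

class SomeSpider(scrapy.Spider):
    name = "somespider"

    def __init__(self, arg1=None, *args, **kwargs):
        # Accept arbitrary kwargs so Scrapyd's internally generated
        # spider arguments do not raise a TypeError.
        super().__init__(*args, **kwargs)
        self.arg1 = arg1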

4. cancel.json

New in version 0.15.

Cancel a spider run (a.k.a. job). If the job is pending, it will be removed. If the job is running, it will be terminated.

  • Supported Request Methods: POST
  • Parameters:
    • project (string, required) - the project name
    • job (string, required) - the job id

Example request:

$ curl http://localhost:6800/cancel.json -d project=myproject -d job=6487ec79947edab326d6db28a2d86511e8247444

Example response:

{"status": "ok", "prevstate": "running"}

5. listprojects.json

Get the list of projects uploaded to this Scrapy server.

  • Supported Request Methods: GET
  • Parameters: none

Example request:

$ curl http://localhost:6800/listprojects.json

Example response:

{"status": "ok", "projects": ["myproject", "otherproject"]}

6. listversions.json

Get the list of versions available for some project. The versions are returned in order; the last one is the currently used version.

  • Supported Request Methods: GET
  • Parameters:
    • project (string, required) - the project name

Example request:

$ curl http://localhost:6800/listversions.json?project=myproject

Example response:

{"status": "ok", "versions": ["r99", "r156"]}

7. listspiders.json

Get the list of spiders available in the last (unless overridden) version of some project.

  • Supported Request Methods: GET
  • Parameters:
    • project (string, required) - the project name
    • _version (string, optional) - the version of the project to examine

Example request:

$ curl http://localhost:6800/listspiders.json?project=myproject

Example response:

{"status": "ok", "spiders": ["spider1", "spider2", "spider3"]}

8. listjobs.json

New in version 0.15.

Get the list of pending, running and finished jobs of some project.

  • Supported Request Methods: GET
  • Parameters:
• project (string, optional) - restrict results to project name

Example request:

$ curl http://localhost:6800/listjobs.json?project=myproject | python -m json.tool

Example response:

{"status": "ok","pending": [{"project": "myproject", "spider": "spider1","id": "78391cc0fcaf11e1b0090800272a6d06"}],"running": [{"id": "422e608f9f28cef127b3d5ef93fe9399","project": "myproject", "spider": "spider2","start_time": "2012-09-12 10:14:03.594664"}],"finished": [{"id": "2f16646cfcaf11e1b0090800272a6d06","project": "myproject", "spider": "spider3","start_time": "2012-09-12 10:14:03.594664","end_time": "2012-09-12 10:24:03.594664"}]
}

Note
All job data is kept in memory and will be reset when the Scrapyd service is restarted. See issue 12.
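
Since Scrapyd offers no push notification of job completion, a client that needs to wait for a job must poll this endpoint. A minimal sketch, assuming a job id previously returned by schedule.json:

import time
import requests

BASE = "http://localhost:6800"

def wait_for_job(project, job_id, interval=5):
    # Poll listjobs.json until the job id appears in the "finished" list.
    while True:
        jobs = requests.get(
            BASE + "/listjobs.json", params={"project": project}
        ).json()
        if any(job["id"] == job_id for job in jobs["finished"]):
            return
        time.sleep(interval)

wait_for_job("myproject", "6487ec79947edab326d6db28a2d86511e8247444")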

9. delversion.json

Delete a project version. If there are no more versions available for a given project, that project will be deleted too.

  • Supported Request Methods: POST
  • Parameters:
    • project (string, required) - the project name
    • version (string, required) - the project version

Example request:

$ curl http://localhost:6800/delversion.json -d project=myproject -d version=r99

Example response:

{"status": "ok"}

10. delproject.json

Delete a project and all its uploaded versions.

  • Supported Request Methods: POST
  • Parameters:
    • project (string, required) - the project name

Example request:

$ curl http://localhost:6800/delproject.json -d project=myproject

Example response:

{"status": "ok"}
