Automatic HDU problem submission with a Python crawler

An automatic HDU "AC machine" built with a Python crawler

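Every long-suffering college student's programming journey starts at HDU. Once studying becomes mandatory it stops being fun, and that is when you have to find your own entertainment. I started learning web scraping a few days ago and had an idea: could I write code that submits problems automatically, the way Zhihuishu scripts answer quizzes automatically? Then nobody would have to worry about being scolded for having solved too few HDU problems.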

The first step is to scrape the solution code from CSDN, using regular expressions and BeautifulSoup to extract the code from a CSDN page.

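import re
import urllib.error
import urllib.request

from bs4 import BeautifulSoup


def search_code(url):         # takes the URL of a solution page, returns the code text
    headers = {xxxxxxxx}      # placeholder from the original post: fill in your own request headers
    request = urllib.request.Request(url, headers=headers)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode('utf-8')
    except urllib.error.URLError as e:
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
    item = ''
    soup = BeautifulSoup(html, 'html.parser')
    # First try the <span class="cpp"> blocks.
    for item in soup.find_all('span', class_="cpp"):
        item = str(item)
        item = re.sub("<.*?>", "", item)      # strip the tags first
        item = re.sub("&lt;", "<", item)      # then restore the escaped angle brackets
        item = re.sub("&gt;", ">", item)
        item = translate_code(item)           # helper not shown in the post
        if item != '':
            return item
    # Otherwise fall back to plain <code> blocks.
    for item in soup.find_all('code'):
        item = str(item)
        item = re.sub("<.*?>", "", item)
        item = re.sub("&lt;", "<", item)
        item = re.sub("&gt;", ">", item)
        item = translate_code(item)
    return item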
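The tag-stripping step can be tried offline on a small fragment. The sketch below is not the code from the post: it uses only the standard library, and it swaps the two `&lt;`/`&gt;` substitutions for `html.unescape`, which also decodes `&amp;` and `&quot;`. The function name `html_to_source` and the sample fragment are made up for illustration.

```python
import html
import re


def html_to_source(fragment: str) -> str:
    """Turn an HTML code fragment back into plain source text."""
    # Strip tags while '<' and '>' from the source are still escaped,
    # otherwise something like <cstdio> would be removed as if it were a tag.
    text = re.sub(r"<[^>]*>", "", fragment)
    # Decode every HTML entity, not only &lt; and &gt;.
    return html.unescape(text)


# Made-up fragment shaped like a highlighted code block.
sample = (
    '<code class="cpp"><span>#include</span> &lt;cstdio&gt;\n'
    'int main() { if (a &amp;&amp; b) puts(&quot;ok&quot;); }</code>'
)
print(html_to_source(sample))
```

The order matters: decoding entities before stripping tags would delete the `<cstdio>` include along with the real markup.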

That only gets the code from a single page. We also need to search across multiple pages so that we collect more URLs of solutions.

https://so.csdn.net/so/search/s.do?q=hdu1100&t=blog&u=
Notice that only the value after q= needs to change in order to search for solutions to a different problem.
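As a small illustration of that observation, the query string can be built with `urllib.parse.urlencode` instead of string concatenation. This is a sketch based only on the URL shown above; the function name `build_search_url` is made up, and CSDN may expect different parameters today.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://so.csdn.net/so/search/s.do"


def build_search_url(problem_id: int) -> str:
    """Build the search URL for one HDU problem number."""
    # Only q changes between problems; t and u are copied from the URL above.
    query = urlencode({"q": f"hdu{problem_id}", "t": "blog", "u": ""})
    return f"{SEARCH_BASE}?{query}"


print(build_search_url(1100))
```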


def search_answer(tihao):       # takes a problem number and collects the list of solution links
    url = 'https://so.csdn.net/so/search/all?q=hdu'+str(tihao)+'&t=all&p=1&s=0&tm=0&lv=-1&ft=0&l=&u='

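CSDN seems to have changed its search page over the last few days, so my earlier code stopped working and I had to switch to webdriver to simulate a browser.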

    # Raw string so the backslashes in the Windows path are taken literally.
    driver = webdriver.Chrome(r'C:\Program Files\Google\Chrome\Application\chromedriver.exe')
    driver.get(url)
    time.sleep(3)                          # wait for the page to finish rendering
    html = driver.page_source
    link = []
    soup = BeautifulSoup(html, 'html.parser')
    for item in soup.find_all('div', class_="list-item"):
        item = str(item)
        link1 = re.findall(findpic, item)  # returns the CSDN URL
        if len(link1) > 0:
            link.append(link1[0])

The earlier code:

    url = 'https://so.csdn.net/so/search/s.do?t=all&s=&tm=&v=&l=&lv=&u=&q=hdu' + str(tihao)
    request = urllib.request.Request(url, headers=headers)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode('utf-8')
    except urllib.error.URLError as e:
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
    link = []
    soup = BeautifulSoup(html, 'html.parser')
    for item in soup.find_all('div', class_="container-list container-other-list active"):
        item = str(item)
        link = re.findall(findpic, item)              # returns the CSDN URLs
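Both versions rely on a `findpic` pattern that the post does not show. A minimal sketch of what such a pattern might look like follows. It assumes article links have the form `https://blog.csdn.net/<user>/article/details/<id>`, which is an assumption about CSDN rather than something taken from the post, and the sample HTML is invented.

```python
import re

# Assumed shape of a CSDN article URL; adjust it if the site differs.
ARTICLE_URL = re.compile(r"https://blog\.csdn\.net/[\w-]+/article/details/\d+")


def extract_article_links(page_html: str) -> list:
    """Return the article URLs found in the page, without duplicates, in page order."""
    return list(dict.fromkeys(ARTICLE_URL.findall(page_html)))


# Invented search-result markup, only for trying the pattern offline.
sample = (
    '<div class="list-item"><a href="https://blog.csdn.net/user_a/article/details/100001">hdu1100</a></div>'
    '<div class="list-item"><a href="https://blog.csdn.net/user-b/article/details/100002">hdu 1100</a></div>'
    '<div class="list-item"><a href="https://blog.csdn.net/user_a/article/details/100001">repeat</a></div>'
)
print(extract_article_links(sample))
```

Removing duplicates up front avoids submitting the same solution twice when a search page lists an article more than once.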

Finally, a for loop fetches the code behind each link and submits it.

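    for i in range(0, len(link)):             # iterate over the URLs
        it = str(link[i])
        code = search_code(it)                # returns the code
        submit(tihao, code, i+1)
        if query_result(tihao):
            break
        else:
            print(str(tihao) + ' failed')
        time.sleep(random.randint(1, 3))      # short random pause between submissions
        if i > 5:                             # give up after a handful of attempts
            break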
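The same "try each candidate until one is accepted" loop can be sketched with the network calls passed in as functions, which makes it possible to run and test it offline. Everything here is hypothetical scaffolding rather than the post's code: `try_candidates`, its parameters, and the stand-in data are all made up for illustration.

```python
import time
from typing import Callable, Iterable, Optional


def try_candidates(
    links: Iterable[str],
    fetch: Callable[[str], Optional[str]],
    submit: Callable[[str], None],
    is_accepted: Callable[[], bool],
    max_attempts: int = 7,
    delay: float = 0.0,
) -> Optional[str]:
    """Submit the code behind each link in turn; return the link that got accepted."""
    for attempt, link in enumerate(links, start=1):
        if attempt > max_attempts:
            break
        code = fetch(link)
        if not code:           # nothing extracted, so do not waste a submission
            continue
        submit(code)
        time.sleep(delay)      # give the judge some time before checking the status
        if is_accepted():
            return link
    return None


# Stand-ins so the sketch runs without touching CSDN or HDU.
pages = {"a": "", "b": "wrong answer code", "c": "correct code"}
submitted = []
winner = try_candidates(
    links=["a", "b", "c"],
    fetch=pages.get,
    submit=submitted.append,
    is_accepted=lambda: submitted[-1] == "correct code",
)
print(winner, submitted)
```

In real use `delay` would probably need to be a few seconds, since a status check made right after submitting may run before the judge has finished.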

Then there is the submit part. As I understand it, the first step is to simulate a login:

    session = requests.Session()
    session.post(url, data=data, headers=headers)

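Then submit the code to HDU.

The submission URL can be found by capturing the request in Chrome.
The fields to send can be read from the same captured request.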

    if(daima.find('import')!=-1):
        data = {'check': '0','problemid': str(tihao),'language': str(5),'usercode': daima}
    else:
        data = {'check': '0','problemid': str(tihao),'language': str(0),'usercode': daima}
    r = session.post(url, data=data, headers=headers)
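Building the form data in a small function of its own keeps the language choice in one place. The sketch below reuses the field names and the two language values from the snippet above, where code containing `import` is presumably treated as Java. The extra `#include` check is my own tweak, not part of the post: it stops C or C++ code that merely mentions the word "import" from being sent with language 5. The name `build_submit_payload` is made up.

```python
def build_submit_payload(problem_id: int, source: str) -> dict:
    """Build the form fields for one submission."""
    if "#include" in source:
        language = 0           # C/C++ source: the post's default language value
    elif "import" in source:
        language = 5           # the post's heuristic: 'import' presumably means Java
    else:
        language = 0
    return {
        "check": "0",
        "problemid": str(problem_id),
        "language": str(language),
        "usercode": source,
    }


print(build_submit_payload(1000, "import java.util.Scanner;")["language"])
```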

After submitting, the next thing is to check whether the problem was accepted (AC).

def query_result(pid):
    url = 'http://acm.hdu.edu.cn/status.php?first=&pid=' + str(pid) + '&user=' + "yourID" + '&lang=0&status=0'
    headers = {xxxxxxxxxxxxxxxxxxx}
    html = requests.get(url, headers=headers)
    pattern_query = r'<td><font color=red>(.*?)</font>'
    query_result = re.findall(pattern_query, html.text)
    if len(query_result) > 0:
        return True
    else:
        return False
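The URL construction and the regex check can be separated from the HTTP request, which makes both easy to try on a saved page. This sketch reuses the URL parameters and the pattern from `query_result`. The helper names and the two sample rows are invented, and the rows only mimic what the pattern expects rather than real HDU output.

```python
import re
from urllib.parse import urlencode

STATUS_URL = "http://acm.hdu.edu.cn/status.php"
# Same pattern as in query_result: a red <font> cell marks an accepted run.
ACCEPTED_PATTERN = re.compile(r"<td><font color=red>(.*?)</font>")


def build_status_url(pid: int, user: str) -> str:
    """Build the status-page URL for one problem and one user."""
    query = urlencode({"first": "", "pid": pid, "user": user, "lang": 0, "status": 0})
    return f"{STATUS_URL}?{query}"


def has_accepted(status_html: str) -> bool:
    """True if the status page contains at least one matching cell."""
    return ACCEPTED_PATTERN.search(status_html) is not None


# Invented rows, shaped only to match or miss the pattern.
accepted_row = "<tr><td>1</td><td><font color=red>Accepted</font></td></tr>"
other_row = "<tr><td>2</td><td><font color=green>Wrong Answer</font></td></tr>"
print(build_status_url(1000, "yourID"))
print(has_accepted(accepted_row), has_accepted(other_row))
```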

Finally, add a main control function and the program basically runs.


def start():
    for i in range(1200, 1500):
        if query_result(i):
            print(str(i) + ' already AC')   # skip problems that are already accepted
        else:
            search_answer(i)

The code is extremely ugly and has plenty of problems, but it just about runs. I'll improve it later when I have time.
