Subscribing to China Daily Bilingual Articles by Email with Python

import random
import re
import smtplib
import time
from email.mime.text import MIMEText

import requests
import schedule
from lxml import etree

# Request a URL and return the page HTML
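def deal_url(url):
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
    # Sample proxy IPs; they carry no port and may be stale, so replace them as needed
    proxyIPs = ['27.188.64.70', '163.204.246.139', '121.13.252.60', '180.118.247.69', '111.75.223.9', '1.193.244.92']
    proxyIP = random.choice(proxyIPs)
    proxies = {'http': proxyIP, 'https': proxyIP}
    # Pass headers and proxies by keyword; the second positional argument of requests.get is params
    html = requests.get(url, headers=header, proxies=proxies, timeout=10).text
    return html

# Collect the links to the latest articles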
def geturls():
    url = 'https://language.chinadaily.com.cn/news_bilingual/'
    html = deal_url(url)
    # Start positions of the article link blocks on the index page
    alist = [m.start() for m in re.finditer('gy_box_txt2', html)]
    urllist = []
    for start in alist:
        dd = html[start+68:start+150]  # slice that contains the link
        if dd.count('"') >= 2:
            da = [m.start() for m in re.finditer('"', dd)]  # positions of the quotes around the link
            urllist.append(dd[da[0]+1:da[1]])  # keep the text between the first two quotes
    return urllist

# Fetch the title and the body text
def getContent():
    # Pick one article at random
    url = 'https:' + random.choice(geturls())
    html = deal_url(url)
    html0 = etree.HTML(html)
    # Cut out the title
    title = [i.start() for i in re.finditer('main_title1', html)]
    tit = html[title[0]+12:title[0]+100]
    a = tit.index('>')
    b = tit.index('<')
    title0 = tit[a+1:b]
    # Collect the body paragraph by paragraph
    datas = ''
    for p in html0.xpath('//*[@id="Content"]/p'):
        data = ''.join(p.xpath('./text()')).replace('\xa0', '')  # drop non-breaking spaces
        if '来源：' in data:  # stop at the source credit line
            break
        if data.strip():
            datas += data + '\n' * 2
    return title0, datas

# Send the email
def sendtoEmail():
    title, body = getContent()  # fetch once so the subject and body come from the same article
    msg = MIMEText(body)  # message body
    msg["Subject"] = '【原视界文章推送】' + title  # message subject
    msg["From"] = user
    msg["To"] = to
    try:
        s = smtplib.SMTP_SSL("smtp.qq.com", 465)
        s.login(user, pwd)
        s.sendmail(user, to, msg.as_string())
        s.quit()
        print("Sent successfully!", time.ctime(time.time()))
    except smtplib.SMTPException as e:
        print("Failed to send: %s" % e)

# Schedule the sends
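def send_mail_by_schedule():
    schedule.every().day.at("12:00").do(sendtoEmail)
    schedule.every().day.at("18:00").do(sendtoEmail)
    while True:
        schedule.run_pending()
        time.sleep(1)

if __name__ == "__main__":
    print('======= Subscription System =======')
    # Module-level names, read by sendtoEmail()
    user = input('Enter your QQ number: ') + '@qq.com'
    pwd = input('Enter the mailbox authorization code: ')
    to = input('Enter the subscriber QQ number: ') + '@qq.com'
    print('虫子管家 has started your subscription. Keep this program running and stay online; articles are sent to your mailbox every day at 12:00 and 18:00.')
    send_mail_by_schedule()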
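
The fixed offsets in geturls() (+68 and +150) depend on the exact markup of the index page, so a small layout change can break the slicing. As a rough sketch of an alternative, the link after each `gy_box_txt2` marker could be captured with a single regular expression. The helper name `extract_links`, the `LINK_PATTERN` constant, and the sample markup and URLs below are all hypothetical and only illustrate the idea; the real page structure may differ.

```python
import re

# Hypothetical sample of the index-page markup; the real page structure is an assumption.
SAMPLE_HTML = '''
<p class="gy_box_txt2"><a target="_blank" href="//language.chinadaily.com.cn/a/202001/01/WS0001.html">Title one</a></p>
<p class="gy_box_txt2"><a target="_blank" href="//language.chinadaily.com.cn/a/202001/02/WS0002.html">Title two</a></p>
'''

# After each 'gy_box_txt2' marker, lazily skip ahead to the next href="..." and capture its value.
LINK_PATTERN = re.compile(r'gy_box_txt2.*?href="([^"]+)"', re.S)


def extract_links(html):
    """Return the article URLs found after each 'gy_box_txt2' marker."""
    links = []
    for href in LINK_PATTERN.findall(html):
        # Protocol-relative links ("//host/path") get the https: scheme, as getContent() does.
        links.append('https:' + href if href.startswith('//') else href)
    return links


if __name__ == '__main__':
    for link in extract_links(SAMPLE_HTML):
        print(link)
```

Because this helper already returns absolute URLs, a caller would use them directly instead of prepending 'https:' again.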
