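I've recently been learning JS reverse engineering from the basics, so here is a walkthrough of the whole process of reverse engineering Baidu Translate's JS. Writing it up also helps me remember it.

JS reverse engineering is pretty much required knowledge for a web-scraping engineer, but it is hard to learn without a reasonable grasp of front-end development.

If you want to learn it, look for JS reverse-engineering courses on Bilibili.

Enough preamble; let's get started.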

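  1. First, open Baidu Translate at https://fanyi.baidu.com/ and press F12 to capture the network traffic:

    The results are loaded asynchronously, so filter by XHR and find the data endpoint: https://fanyi.baidu.com/v2transapi?from=en&to=zh
  2. Compare the data from two captured requests:

    As you can see, every other field in the form data stays the same and only the sign parameter changes. Let's first write code that requests this URL without sign and see whether we get a result:
  3. Evidently we cannot get any data without the sign parameter, so let's work out how it is generated:
    First, do a global search for sign to find the relevant JS files

    Looking through these JS files, the first one, index, is the most likely candidate, so we analyze index.js first:

    Press Ctrl+F and search for sign. After some digging we find this function, which looks very much like the form data, and we can see that sign is generated by the function f(n):

    Set a breakpoint first to confirm that this is the sign we are after:

    Next, locate the function f(n) (you can jump to it by clicking f(n) directly):

    Very complicated, right? Reproducing the whole JS logic in Python would be too hard, so next we need a third-party library called PyExecJS, which can execute our JS file. Install it with pip install PyExecJS (it also needs a JavaScript runtime such as Node.js on the machine).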
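The comparison in step 2 can also be done in code rather than by eye. The following is a minimal sketch: the two dictionaries stand in for form data copied from two captured requests, and their sign values are made-up placeholders, not real captures.

```python
def changed_fields(first, second):
    """Return the form fields whose values differ between two captures."""
    keys = set(first) | set(second)
    return sorted(k for k in keys if first.get(k) != second.get(k))


# Hypothetical form data from two captures (the sign values are placeholders).
capture_a = {
    "from": "en",
    "to": "zh",
    "query": "cat",
    "transtype": "realtime",
    "simple_means_flag": "3",
    "sign": "111111.222222",
    "token": "9a20246ac075f19baf61fd6ea99bd648",
    "domain": "common",
}
capture_b = dict(capture_a, query="dog", sign="333333.444444")

print(changed_fields(capture_a, capture_b))  # ['query', 'sign']
```

The query field differs only because we typed a different word, which leaves sign as the one field we cannot produce ourselves.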

Create a JS file and paste this JS code into it:

function e(r) {var o = r.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g);if (null === o) {var t = r.length;t > 30 && (r = "" + r.substr(0, 10) + r.substr(Math.floor(t / 2) - 5, 10) + r.substr(-10, 10))} else {for (var e = r.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/), C = 0, h = e.length, f = []; h > C; C++)"" !== e[C] && f.push.apply(f, a(e[C].split(""))),C !== h - 1 && f.push(o[C]);var g = f.length;g > 30 && (r = f.slice(0, 10).join("") + f.slice(Math.floor(g / 2) - 5, Math.floor(g / 2) + 5).join("") + f.slice(-10).join(""))}var u = void 0, l = "" + String.fromCharCode(103) + String.fromCharCode(116) + String.fromCharCode(107);u = null !== i ? i : (i = window[l] || "") || "";for (var d = u.split("."), m = Number(d[0]) || 0, s = Number(d[1]) || 0, S = [], c = 0, v = 0; v < r.length; v++) {var A = r.charCodeAt(v);128 > A ? S[c++] = A : (2048 > A ? S[c++] = A >> 6 | 192 : (55296 === (64512 & A) && v + 1 < r.length && 56320 === (64512 & r.charCodeAt(v + 1)) ? (A = 65536 + ((1023 & A) << 10) + (1023 & r.charCodeAt(++v)),S[c++] = A >> 18 | 240,S[c++] = A >> 12 & 63 | 128) : S[c++] = A >> 12 | 224,S[c++] = A >> 6 & 63 | 128),S[c++] = 63 & A | 128)}for (var p = m, F = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(97) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(54)), D = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(51) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(98)) + ("" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(102)), b = 0; b < S.length; b++)p += S[b],p = n(p, F);return p = n(p, D),p ^= s,0 > p && (p = (2147483647 & p) + 2147483648),p %= 1e6,p.toString() + "." + (p ^ m)}
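One detail in this function is easy to miss in the minified code: when the input is longer than 30 characters, only the first 10, the middle 10, and the last 10 characters go into the signature. Below is a rough Python sketch of that sampling step for text without surrogate pairs (the first branch of the JS). The name sample_text is hypothetical and does not come from Baidu's code.

```python
def sample_text(text):
    """Mimic the JS truncation: keep 10 chars from the start, middle and end."""
    length = len(text)
    if length <= 30:
        # Short input is signed as-is.
        return text
    mid = length // 2
    return text[:10] + text[mid - 5:mid + 5] + text[-10:]


print(sample_text("cat"))              # cat
print(sample_text("0123456789" * 4))   # 012345678956789012340123456789
```

So two long texts that share the same beginning, middle, and ending slices would get the same sign under this rule.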

Here is the code I used to send the request directly with requests, without the execjs library (and so without sign):

import string

import requests

url = "https://fanyi.baidu.com/v2transapi?from=en&to=zh"


def request(word):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36",
        "Cookie": "BIDUPSID=C1D60FBAFF280D92DEA75430FB219DFC; PSTM=1605330633; BAIDUID=C1D60FBAFF280D92A9F88C3BF0F07F3A:FG=1; REALTIME_TRANS_SWITCH=1; FANYI_WORD_SWITCH=1; HISTORY_SWITCH=1; SOUND_SPD_SWITCH=1; SOUND_PREFER_SWITCH=1; BDUSS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BDUSS_BFESS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BAIDUID_BFESS=E5C05BC9E40BD15758E69386EDD517FD:FG=1; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; __yjs_duid=1_df897e8d50f3f0e9d8eedf62c5c43d211614259579733; Hm_lvt_64ecd82404c51e03dc91cb9e8c025574=1614345635,1614345643,1614347940,1614401883; Hm_lpvt_64ecd82404c51e03dc91cb9e8c025574=1614401883; ab_sr=1.0.0_NGQ3NjQ4NTAzNGNjNDFhNTQ5ZDVmMDNlYTc1YTQyNmJlM2U1NjI3N2RmZGUyYjc2ZGNiZTUxOWQxNTBmZGYwMzQxNDBmNWU0NTQ3MDg5ZTVhODJhNjg5ZTQ0NGE5MmUx; __yjsv5_shitong=1.0_7_bb0b3e2f7db4e875eaf788202e0f3d2218c5_300_1614401882856_27.226.159.239_63467023",
    }
    # Detect the source language from the first character and translate into the other language
    src = "en" if word[0].lower() in string.ascii_lowercase else "zh"
    form_data = {
        "from": src,
        "to": "zh" if src == "en" else "en",
        "query": word,
        "transtype": "realtime",
        "simple_means_flag": 3,
        # "sign": sign,
        "token": "9a20246ac075f19baf61fd6ea99bd648",
        "domain": "common",
    }
    response = requests.post(url, headers=headers, data=form_data)
    print(response.json())


request("cat")

Now let's bring in the library and let it execute the JS code to see what it returns:

import execjs


def get_sign(word):
    with open("demo01.js", "r", encoding="utf8") as f:
        jscode = f.read()
    # compile() takes the JS source code read from the file.
    # call() takes the name of the JS function first, then that function's arguments.
    sign = execjs.compile(jscode).call("e", word)
    return sign

Here is the complete code:

import string

import execjs
import requests

url = "https://fanyi.baidu.com/v2transapi?from=en&to=zh"


def get_sign(word):
    with open("demo01.js", "r", encoding="utf8") as f:
        jscode = f.read()
    sign = execjs.compile(jscode).call("e", word)
    return sign


def request(word):
    sign = get_sign(word)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36",
        "Cookie": "BIDUPSID=C1D60FBAFF280D92DEA75430FB219DFC; PSTM=1605330633; BAIDUID=C1D60FBAFF280D92A9F88C3BF0F07F3A:FG=1; REALTIME_TRANS_SWITCH=1; FANYI_WORD_SWITCH=1; HISTORY_SWITCH=1; SOUND_SPD_SWITCH=1; SOUND_PREFER_SWITCH=1; BDUSS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BDUSS_BFESS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BAIDUID_BFESS=E5C05BC9E40BD15758E69386EDD517FD:FG=1; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; __yjs_duid=1_df897e8d50f3f0e9d8eedf62c5c43d211614259579733; Hm_lvt_64ecd82404c51e03dc91cb9e8c025574=1614345635,1614345643,1614347940,1614401883; Hm_lpvt_64ecd82404c51e03dc91cb9e8c025574=1614401883; ab_sr=1.0.0_NGQ3NjQ4NTAzNGNjNDFhNTQ5ZDVmMDNlYTc1YTQyNmJlM2U1NjI3N2RmZGUyYjc2ZGNiZTUxOWQxNTBmZGYwMzQxNDBmNWU0NTQ3MDg5ZTVhODJhNjg5ZTQ0NGE5MmUx; __yjsv5_shitong=1.0_7_bb0b3e2f7db4e875eaf788202e0f3d2218c5_300_1614401882856_27.226.159.239_63467023",
    }
    # Detect the source language from the first character and translate into the other language
    src = "en" if word[0].lower() in string.ascii_lowercase else "zh"
    form_data = {
        "from": src,
        "to": "zh" if src == "en" else "en",
        "query": word,
        "transtype": "realtime",
        "simple_means_flag": 3,
        "sign": sign,
        "token": "9a20246ac075f19baf61fd6ea99bd648",
        "domain": "common",
    }
    response = requests.post(url, headers=headers, data=form_data)
    print(response.json())


request("cat")

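Let's send the request again. It fails with an error: i is not defined.

Let's find where i comes from. Back in the JS file, we can see that i is used inside function e(r), so we keep debugging with breakpoints:

We can see that i and u are tied together in a conditional expression. Print the value of i in the console:

That gives us the value of i, so we add it inside function e(r): var i = "320305.131321201"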
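The function hides a few strings behind String.fromCharCode calls, and decoding them makes the logic easier to follow. The sketch below decodes the character codes copied from the JS above. The property name used in window[l] turns out to be gtk, which suggests that the value of i we printed comes from window.gtk on the page, while F and D are short operation strings passed to n. The Python names here are only illustrative.

```python
def from_char_codes(*codes):
    """Python equivalent of concatenating String.fromCharCode(c) for each code."""
    return "".join(chr(c) for c in codes)


# Character codes copied from the JS function e(r).
window_key = from_char_codes(103, 116, 107)                   # the JS variable l
ops_f = from_char_codes(43, 45, 97, 94, 43, 54)               # the JS variable F
ops_d = from_char_codes(43, 45, 51, 94, 43, 98, 43, 45, 102)  # the JS variable D

print(window_key)  # gtk
print(ops_f)       # +-a^+6
print(ops_d)       # +-3^+b+-f
```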

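Run it again:

Another error: an object is missing. Which one? This is harder to analyze, so we start from the function's return value, return p = n(p, D). Here n is also a function, but it does not appear anywhere in our JS file. Looking for it in the original source, we find it right above function e(r), so we copy it into our JS file.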
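To see what n actually does, here is a line-by-line Python translation of it. Treat it as a sketch rather than a replacement for the JS: the helper to_int32 is an assumption added to imitate JavaScript's 32-bit wrap-around in bitwise operators, so compare its results with the execjs output before relying on it.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value, as JS bitwise operators do."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def n(r, o):
    """Mix r according to the operation string o, three characters per step."""
    for t in range(0, len(o) - 2, 3):
        c = o[t + 2]
        # Third character: shift amount ("a" and above map to 10 and up, digits to themselves).
        a = ord(c) - 87 if c >= "a" else int(c)
        # Second character: "+" means unsigned right shift (>>>), anything else left shift (<<).
        a = (r & 0xFFFFFFFF) >> a if o[t + 1] == "+" else to_int32(r << a)
        # First character: "+" means add, anything else XOR; both wrap to 32 bits.
        r = to_int32(r + a) if o[t] == "+" else to_int32(r ^ a)
    return r


print(n(1, "+-a^+6"))  # 1041
```

For the input 1, the first step computes 1 + (1 << 10) = 1025 and the second computes 1025 ^ (1025 >>> 6) = 1025 ^ 16 = 1041.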

Run it again, and this time we get a result back:

Then we tidy up the code that handles the return value to get the final result:
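Indexing straight into response.json() raises an exception whenever the server returns something other than a translation. A slightly more defensive way to pull out the result is sketched below. The only structure taken from the code in this post is trans_result, then data, then the first element. The src and dst fields and the shape of the error payload are assumptions for illustration.

```python
def first_translation(payload):
    """Return the first entry under trans_result -> data, or raise a clear error."""
    try:
        return payload["trans_result"]["data"][0]
    except (KeyError, IndexError, TypeError):
        raise ValueError(f"no translation in response: {payload!r}") from None


# Hypothetical payloads: the fields inside "data" and the error shape are assumed.
ok = {"trans_result": {"data": [{"src": "cat", "dst": "猫"}]}}
bad = {"errno": 997, "errmsg": "unknown error"}

print(first_translation(ok))
try:
    first_translation(bad)
except ValueError as exc:
    print("request failed:", exc)
```

In the real script, the argument would be response.json().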

Here are the two final files:

demo01.py

import string

import execjs
import requests

url = "https://fanyi.baidu.com/v2transapi?from=en&to=zh"


def get_sign(word):
    with open("demo01.js", "r", encoding="utf8") as f:
        jscode = f.read()
    sign = execjs.compile(jscode).call("e", word)
    return sign


def request(word):
    sign = get_sign(word)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36",
        "Cookie": "BIDUPSID=C1D60FBAFF280D92DEA75430FB219DFC; PSTM=1605330633; BAIDUID=C1D60FBAFF280D92A9F88C3BF0F07F3A:FG=1; REALTIME_TRANS_SWITCH=1; FANYI_WORD_SWITCH=1; HISTORY_SWITCH=1; SOUND_SPD_SWITCH=1; SOUND_PREFER_SWITCH=1; BDUSS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BDUSS_BFESS=pWODkzTnZSczZUT2JUTWhpbUs0bWJkTFJ2SVZyMmZGa0VQbDBJdGo5VDE4RFZnRVFBQUFBJCQAAAAAAAAAAAEAAAAsVqassszV8dfmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVjDmD1Yw5ga1; BAIDUID_BFESS=E5C05BC9E40BD15758E69386EDD517FD:FG=1; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; __yjs_duid=1_df897e8d50f3f0e9d8eedf62c5c43d211614259579733; Hm_lvt_64ecd82404c51e03dc91cb9e8c025574=1614345635,1614345643,1614347940,1614401883; Hm_lpvt_64ecd82404c51e03dc91cb9e8c025574=1614401883; ab_sr=1.0.0_NGQ3NjQ4NTAzNGNjNDFhNTQ5ZDVmMDNlYTc1YTQyNmJlM2U1NjI3N2RmZGUyYjc2ZGNiZTUxOWQxNTBmZGYwMzQxNDBmNWU0NTQ3MDg5ZTVhODJhNjg5ZTQ0NGE5MmUx; __yjsv5_shitong=1.0_7_bb0b3e2f7db4e875eaf788202e0f3d2218c5_300_1614401882856_27.226.159.239_63467023",
    }
    # Detect the source language from the first character and translate into the other language
    src = "en" if word[0].lower() in string.ascii_lowercase else "zh"
    form_data = {
        "from": src,
        "to": "zh" if src == "en" else "en",
        "query": word,
        "transtype": "realtime",
        "simple_means_flag": 3,
        "sign": sign,
        "token": "9a20246ac075f19baf61fd6ea99bd648",
        "domain": "common",
    }
    response = requests.post(url, headers=headers, data=form_data)
    # Print only the first translation entry instead of the whole response
    print(response.json()["trans_result"]["data"][0])


request("cat")

demo01.js

function n(r, o) {for (var t = 0; t < o.length - 2; t += 3) {var a = o.charAt(t + 2);a = a >= "a" ? a.charCodeAt(0) - 87 : Number(a),a = "+" === o.charAt(t + 1) ? r >>> a : r << a,r = "+" === o.charAt(t) ? r + a & 4294967295 : r ^ a}return r
}

// Note: the branch for text containing surrogate pairs (e.g. emoji) calls a helper a(...) that is not defined in this file.
function e(r) {var i = "320305.131321201";
var o = r.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g);if (null === o) {var t = r.length;t > 30 && (r = "" + r.substr(0, 10) + r.substr(Math.floor(t / 2) - 5, 10) + r.substr(-10, 10))} else {for (var e = r.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/), C = 0, h = e.length, f = []; h > C; C++)"" !== e[C] && f.push.apply(f, a(e[C].split(""))),C !== h - 1 && f.push(o[C]);var g = f.length;g > 30 && (r = f.slice(0, 10).join("") + f.slice(Math.floor(g / 2) - 5, Math.floor(g / 2) + 5).join("") + f.slice(-10).join(""))}var u = void 0, l = "" + String.fromCharCode(103) + String.fromCharCode(116) + String.fromCharCode(107);u = null !== i ? i : (i = window[l] || "") || "";for (var d = u.split("."), m = Number(d[0]) || 0, s = Number(d[1]) || 0, S = [], c = 0, v = 0; v < r.length; v++) {var A = r.charCodeAt(v);128 > A ? S[c++] = A : (2048 > A ? S[c++] = A >> 6 | 192 : (55296 === (64512 & A) && v + 1 < r.length && 56320 === (64512 & r.charCodeAt(v + 1)) ? (A = 65536 + ((1023 & A) << 10) + (1023 & r.charCodeAt(++v)),S[c++] = A >> 18 | 240,S[c++] = A >> 12 & 63 | 128) : S[c++] = A >> 12 | 224,S[c++] = A >> 6 & 63 | 128),S[c++] = 63 & A | 128)}for (var p = m, F = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(97) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(54)), D = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(51) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(98)) + ("" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(102)), b = 0; b < S.length; b++)p += S[b],p = n(p, F);return p = n(p, D),p ^= s,0 > p && (p = (2147483647 & p) + 2147483648),p %= 1e6,p.toString() + "." + (p ^ m)
}

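Reverse engineering Baidu Translate's JS is fairly easy, because sign is the only parameter in the form data that changes and everything else stays fixed. Later I'll share cases where the encrypted data is harder to crack. You are also welcome to follow my WeChat official account. Let's learn and improve together!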
