by William Bengtson | @__muscles

Two years ago, a few colleagues (shoutout to helloarbit, travismcpeak, and coffeetocode) and I were talking about supply chain attacks, a conversation that led to this work. A supply chain attack targets a company's dependencies in the hope of leveraging a weakness in the supply chain to damage the target company. For a company that produces software, a supply chain attack typically targets the software used to develop the product.

With this in mind, the software dependencies are packages and libraries in popular languages like Python, Java, Ruby, and Golang. Each language has its own way of packaging and retrieving dependencies, and over the years some ecosystems have adopted more proactive measures against certain types of supply chain attacks than others.

Typosquatting

I decided to take a look at what typosquatting in Python looks like. I first looked at Levenshtein distance, which measures the distance between two package names, to determine whether one name is a typosquat of another. This can be useful for detecting a package that has been squatted, and it can serve as a tool to prevent packages from being squatted. With this in mind, I wanted to see how many of the top installed Python packages could be squatted simply by removing underscores (_) or dashes (-).
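
As a rough illustration of the distance idea, here is a minimal sketch of Levenshtein distance in pure Python. The helper name `looks_like_typosquat` and the threshold of 2 are assumptions made for this example, not something taken from the original tooling.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def looks_like_typosquat(candidate: str, known: str, max_distance: int = 2) -> bool:
    """Hypothetical helper: flag names that are close to, but not equal to, a known name.

    The default threshold of 2 is an arbitrary choice for illustration.
    """
    return candidate != known and levenshtein(candidate, known) <= max_distance
```

Dropping both dashes from python-json-logger is two deletions, so `levenshtein("pythonjsonlogger", "python-json-logger")` is 2 and the pair is flagged at this threshold.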

Data

One of the most fascinating pieces for me is that the Python Package Index (PyPI) makes download data available for each Python package. This data is very powerful for determining which packages are the most popular in the Python ecosystem. For more on analyzing PyPI downloads, check out their guide.
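
To give a sense of what a popularity query over this data can look like, here is a sketch that only builds a SQL string. It assumes the public BigQuery table `bigquery-public-data.pypi.file_downloads` that PyPI's download-analysis guide describes today; the function name, the 30-day window, and the query itself are assumptions for illustration, not the query used in this project.

```python
def top_pip_packages_query(limit: int = 10_000, days: int = 30) -> str:
    """Build a BigQuery SQL string ranking projects by pip downloads.

    Assumption: the table and column names follow the public PyPI dataset
    (file.project, details.installer.name, timestamp). Running the query
    requires a BigQuery client and is not done here.
    """
    if limit <= 0 or days <= 0:
        raise ValueError("limit and days must be positive")
    return f"""
SELECT file.project AS project, COUNT(*) AS num_downloads
FROM `bigquery-public-data.pypi.file_downloads`
WHERE details.installer.name = 'pip'
  AND DATE(timestamp)
      BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL {int(days)} DAY)
      AND CURRENT_DATE()
GROUP BY project
ORDER BY num_downloads DESC
LIMIT {int(limit)}
""".strip()
```

Restricting the date range matters in practice, since the table is large and an unbounded scan is expensive.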

Implementation

The goal of this exercise was to understand how many packages could be squatted and, if possible, to prevent future squats by registering the names myself. Using the data mentioned above, I ran a query for the top 10,000 Python packages installed with the package installer pip. That list let me work out which packages could potentially be squatted by removing any _ or - from the package name, as mentioned above.
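
The candidate step can be sketched as follows. This is a minimal illustration, assuming the top-package list is already in memory; the function name is hypothetical, and a real run would also need to check each candidate against the full PyPI index rather than only the top list.

```python
import re


def squat_candidates(top_packages):
    """Map each squattable name to the real package it imitates.

    A candidate is the package name with every '-' and '_' removed.
    Names that already contain no separators, or whose stripped form
    is itself in the input list, are skipped.
    """
    existing = {name.lower() for name in top_packages}
    candidates = {}
    for name in sorted(existing):
        stripped = re.sub(r"[-_]", "", name)
        if stripped != name and stripped not in existing:
            candidates[stripped] = name
    return candidates
```

For example, given `["python-json-logger", "py-test", "pytest"]`, only `pythonjsonlogger` is returned, because `pytest` is already a package in the list.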

I wanted to squat the potential packages so I needed to come up with a strategy for squatting them. There were a few options to choose from:

  1. Clone the existing packages, change the name to the squatted name, and register them.
  2. Register the packages with a package that does nothing.
  3. Register the packages with a package that could potentially educate the person installing the squatted package.

Looking at the different options, I chose number three. The goal of the project has always been to be a “Guardian”, and the other options seemed less than ideal for achieving it. Number two would have left developers wasting a lot of time trying to understand why their software doesn’t work after installing the squatted package instead of the real one. Number one seemed shady and could lead people to believe this project had malicious intent.

Initially I decided to squat roughly 1,100 packages. To do this, I first needed to create the package to push to PyPI, and then build some automation around it so that squatting this many packages was achievable in a short amount of time.

I decided to create a simple package that when installed would fail and print out an error message letting you know the actual package name you probably meant to install. Below is an example of what happens when you try to install pythonjsonlogger instead of the real package python-json-logger.

    pip install pythonjsonlogger
    Collecting pythonjsonlogger
      Downloading pythonjsonlogger-0.1.1.tar.gz (1.3 kB)
    Building wheels for collected packages: pythonjsonlogger
      Building wheel for pythonjsonlogger (setup.py) ... error
      ERROR: Command errored out with exit status 1:
    ...
      Complete output (29 lines):
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib
    ....
        File "/private/var/folders/cy/kc766fxx37b5rf87qxkt8hj00000gp/T/pip-install-kp4jab6f/pythonjsonlogger/setup.py", line 20, in run
          raise Exception("You probably meant to install and run python-json-logger")
      Exception: You probably meant to install and run python-json-logger
      ----------------------------------------
      ERROR: Failed building wheel for pythonjsonlogger
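
A placeholder package along these lines could be generated per name with a small template. The sketch below is an assumption about how such automation might look, not the actual setup.py from this project: the traceback only shows that a command's run method raises, so overriding the `install` command, the class name `AbortInstall`, and the description text are all guesses made for illustration. The version 0.1.1 and the error message are taken from the output above.

```python
import re

# Conservative check for package names (letters, digits, '.', '_', '-').
_NAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")

# Text of a setup.py whose install step always fails with a helpful message.
# Assumption: overriding the "install" command; the original may differ.
SETUP_TEMPLATE = '''\
from setuptools import setup
from setuptools.command.install import install


class AbortInstall(install):
    """Fail loudly and point the user at the package they meant."""

    def run(self):
        raise Exception("You probably meant to install and run {real_name}")


setup(
    name="{squat_name}",
    version="0.1.1",
    description="Placeholder registered to prevent typosquatting of {real_name}",
    cmdclass={{"install": AbortInstall}},
)
'''


def render_setup_py(squat_name: str, real_name: str) -> str:
    """Return setup.py source for one placeholder package (hypothetical helper)."""
    for name in (squat_name, real_name):
        if not _NAME_RE.match(name):
            raise ValueError(f"not a valid package name: {name!r}")
    return SETUP_TEMPLATE.format(squat_name=squat_name, real_name=real_name)
```

Publishing only a source distribution is what makes this work: pip has to run setup.py to build the package, so the exception surfaces during `pip install`.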

Now that I had the package setup and automation written, I kicked off the squatting and sat back. With over 1,100 packages registered, I now needed to wait to see if anyone would actually install these accidentally. What I found over the next two years is the most interesting piece of this project in my opinion.

Results

I originally meant to do a post on this after 6 months or a year, but here we are two years later and I have some data and stories to share. I’ll start with the data on 1,131 packages, and end with the stories.

Top 10 squatted package downloads from July 16, 2018 until August 4, 2020:

In a little over two years there have been 530,950 total pip install commands run on 1,131 packages! This does not include any mirrors or internal package registries that have cloned these packages privately. Malicious packages on PyPI have been known to steal credentials stored on the local file system, such as SSH credentials in ~/.ssh/, GPG keys, or AWS credentials stored in ~/.aws/credentials. If these typosquat packages had been written with malicious intent, and we assume one attempt per install, as many as 530,950 machines could have been compromised over the two-year period.
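
For a sense of scale, the totals above can be turned into rough averages. This is simple arithmetic on the figures already quoted (530,950 installs, 1,131 packages, July 16, 2018 to August 4, 2020); the per-day and per-package averages are derived here and hide the fact that downloads were concentrated in a few popular names.

```python
from datetime import date

# Figures quoted in the post.
total_installs = 530_950
packages = 1_131
start, end = date(2018, 7, 16), date(2020, 8, 4)

days = (end - start).days                   # length of the observation window
installs_per_day = total_installs / days    # average across all squatted packages
installs_per_package = total_installs / packages

print(f"{days} days, {installs_per_day:.0f} installs/day, "
      f"{installs_per_package:.0f} installs/package on average")
```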

While the data is incredibly interesting, and you can draw your own conclusions about what could have happened if these were malicious packages, I find the encounters and stories the most interesting.

Over the two years I had the following encounters:

  • Help installing my package
  • Thanks for protecting the Python community
  • Text from a friend asking if I owned a certain package
  • Researchers finding my work
  • Company asking to confirm the license on one of my squatted packages

Help installing my package

Over the course of the last two years I have received numerous emails asking for help installing my package or reporting that my package is broken. Each time I’d simply reply with the correct package they should install.

Thanks for protecting the Python community

A few times I received emails from people who had installed one of my squatted packages accidentally and either read the error printout or saw my package registration, which clearly states that it is a package to prevent exploits.

In a few cases it was a researcher both finding my work and thanking me:

I am working on a project for my security class related to attacks on the Python ecosystem. We kept stumbling upon your packages while trying to identify typo-squatting attempts.

I just thought I would say thanks for helping out the community! :)

Text from a friend asking if I owned a certain package

I told a few friends about this work and I don’t quite remember how the interaction went down, but it was something along the lines of:

Friend: Hey! Do you own pythonjsonlogger?

Me: Yeah why?

Friend: Dammit!

You know who you are :)

Researchers finding my work

I have really enjoyed each occurrence of a researcher finding my work because it typically involved a conversation around typosquatting. The example above ended up with my work being included in a symposium paper for the University of Maryland CMSC 8180 class called PYed PIPer by Josiah Wedgwood and Aadesh Bagmar.

In the most recent case I met with the creator of pypi-scan, John Speed Myers, on Zoom for about an hour to discuss our work, and we talked about potential future collaborations in this area. Most importantly, this conversation got me excited about the topic again. It prompted me to finally write this post and to do another round of squatting, covering 3,000+ more packages, to continue to protect the Python ecosystem.

Company asking to confirm the license on one of my squatted packages

The one I find scariest is an email from a large company asking me to confirm the license on one of my squatted packages so that they could use it.

Final thoughts

All in all this was a fun project that I never meant to take two years to actually write about. Over the last year or so, I have discussed this work with the folks at PyPI and they will actually be taking ownership of my packages once I can confirm a final list of which packages are squatted versus the real packages I actually contribute to or own.

All language ecosystems are vulnerable to this type of attack, although in some it is harder to pull off because of things like package namespaces. This work was very targeted and did not expand into squatting names that projects might adopt for future libraries as they evolve. Supply chain security is very difficult and has been challenging companies with deep pockets for many years.

Most importantly, THANKS to the folks at PyPI for what they do in making Python packages available to enable folks to develop each day!

Translated from: https://medium.com/@williambengtson/python-typosquatting-for-fun-not-profit-99869579c35d

