Internet P2P download technology

  • BitTorrent
  • Metalink

BitTorrent

A P2P download protocol:
A communications protocol for peer-to-peer (“P2P”) file sharing, used to distribute data and electronic files over the Internet. BitTorrent is one of the most common protocols for transferring large files.

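  1. Query a server to find the nodes that hold the target resource.
  2. Connect directly to those nodes and download the pieces they hold. Once you have fully downloaded some pieces yourself, upload them to other nodes that need them.

A node that holds the complete resource is called a seed, and a node that holds only part of it is called a peer; more definitions follow later in this post.

A client learns how a resource is distributed, and downloads it, by means of a .torrent file.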
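A .torrent file is a bencoded dictionary: integers are written as i<digits>e, byte strings as <length>:<bytes>, lists as l...e, and dictionaries as d...e. The following is a minimal decoder sketch rather than a full client. The sample metadata (tracker URL, file name, zeroed piece hashes) is made up for illustration.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at data[i]; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                       # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                       # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():                     # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {i}")


# Hypothetical single-file torrent: 1024 bytes split into 512-byte pieces,
# so "pieces" holds two 20-byte SHA-1 digests (zeroed here for brevity).
SAMPLE_TORRENT = (
    b"d8:announce31:http://tracker.example/announce"
    b"4:infod6:lengthi1024e4:name8:demo.bin12:piece lengthi512e"
    b"6:pieces40:" + b"\x00" * 40 + b"ee"
)

meta, end = bdecode(SAMPLE_TORRENT)
info = meta[b"info"]
print("tracker:", meta[b"announce"].decode())
print("file:", info[b"name"].decode(), info[b"length"], "bytes")
print("pieces:", len(info[b"pieces"]) // 20)
```

The announce key tells the client which server to query in step 1, and the piece hashes let it verify each piece it downloads from other nodes in step 2.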

Bram Cohen designed the BitTorrent protocol in 2001 and wrote the first client in Python.

Evolution

The introduction of DHT (distributed hash table) removed the strict need for trackers.

Some torrent terminology:

  • Peer:

    A Peer is someone who does not currently have the complete file.

When a peer is connected, it downloads the pieces it does not have and uploads the pieces it does have. You are a peer if you do not have a complete copy of the file you’re trying to get.

  • Seed:

    A Seed is someone who already has the complete file but is still sharing. If there are no Seeds, the only way to get a complete file is if all the pieces of the file can be found amongst the peers that are connected. In most cases, when there is no Seed, you probably won’t get the whole file.

Note that the term leech used to be common on bulletin boards and in Usenet groups. We used it to describe someone who downloads things but never uploads. With torrents, as soon as you get your first piece, you’re sharing. So we call everyone a peer. Of course, I still use leech when I talk about people who never Seed a file. As a rule of thumb, you should always try to Seed a torrent for at least 1 full copy. You can see this in the Ratio column. 1.000 or higher means you have seeded at least 1 full copy of the file.
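The Ratio rule of thumb is simple arithmetic. Here is a minimal sketch, assuming the client computes the ratio as bytes uploaded divided by bytes downloaded. The function name is hypothetical, and real clients may treat the edge cases differently.

```python
def share_ratio(uploaded_bytes: int, downloaded_bytes: int) -> float:
    """Return uploaded / downloaded, the value assumed to be shown in the Ratio column."""
    if downloaded_bytes == 0:
        # Nothing downloaded yet: treat any upload as an unbounded ratio.
        return float("inf") if uploaded_bytes else 0.0
    return uploaded_bytes / downloaded_bytes


file_size = 700 * 1024 * 1024                 # a 700 MiB download
print(f"{share_ratio(file_size // 2, file_size):.3f}")   # half a copy seeded
print(f"{share_ratio(file_size, file_size):.3f}")        # one full copy seeded
```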

  • Tracker:

    The tracker is a server that has all the info about the people that are down- and uploading the file.

The tracker itself does not have a copy of the file; it only tracks the people who have the file (seeds) and the people who have part of the file (peers). Torrents can be tied to a specific tracker, but most clients now support trackerless torrents, making it less likely that you will be hurt if you can’t find the original tracker.

  • Scrape:

    When your BitTorrent client asks the “tracker” for statistics about a torrent, we call this scraping.

The data you get from scraping tells you how many Seeds and Peers there are for each torrent. This is not limited to just the active Seeds and Peers. It could also include Seeds and Peers that are not currently connected.
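Many trackers follow a convention for deriving the scrape address from the announce address in the .torrent file. If the last path segment starts with announce, that word is replaced by scrape; otherwise the tracker is assumed not to support scraping. This is a minimal sketch of that convention, using a made-up tracker host and a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def scrape_url(announce_url: str):
    """Derive a scrape URL from an announce URL, or return None if the
    announce URL does not follow the announce -> scrape convention."""
    parts = urlsplit(announce_url)
    head, sep, last = parts.path.rpartition("/")
    if not last.startswith("announce"):
        return None
    new_path = head + sep + "scrape" + last[len("announce"):]
    return urlunsplit(parts._replace(path=new_path))


print(scrape_url("http://tracker.example/announce"))
print(scrape_url("http://tracker.example/x/announce.php?key=1"))
print(scrape_url("http://tracker.example/a"))
```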

  • Swarm:

    Together, all the Seeds and Peers who are using the same torrent on the same tracker with you.

For example, six Peers and two Seeds on the same tracker make a swarm of eight. So your Swarm is NOT the users you are connected to. It’s perfectly normal NOT to connect to ALL seeds and peers in a swarm. In a minute, we’ll even see how the opposite is true.

First, when you look at the Seeds and Peers columns, you see two numbers in each column: x (y).

  • SEEDS:

    x = the number of seeds from which your client is currently downloading pieces.

    y = the total number of seeds in the swarm

So, if you see 5 (14) under Seeds, you are connected to 5 out of 14 seeds. The tracker knows about 9 more seeds to which you are NOT connected. This could be because these seeds only allow a limited number of connections or there could be other reasons.

Once you get the complete file, you will no longer connect to Seeds because, as a Seed yourself, you don’t exchange any data with other Seeds. Your client still shows you the Seeds in your swarm, so you see something like 0 (14).

If you see something like 12 (4), it usually means the tracker only knows about 4 seeds in your swarm, but thanks to a feature called DHT you were able to connect to seeds outside your swarm.

  • DHT:
    DHT stands for distributed hash table. You don’t need to understand how it works. Just know that DHT lets the clients themselves share the burden of tracking swarms, so finding peers no longer depends on a single tracker. Without DHT, your client only sees the swarm attached to your tracker; with DHT, it might also connect to Seeds and Peers that your tracker does not know about.
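For readers who do want a peek inside: BitTorrent's mainline DHT is based on Kademlia. Every node has a 160-bit ID, and the peers for a torrent are looked up on the nodes whose IDs are closest to the torrent's info hash, where "close" is measured by XOR. This toy sketch shows only that distance metric, with 4-bit IDs and made-up node names standing in for real 160-bit values.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance between two IDs: their bitwise XOR, read as an integer."""
    return a ^ b


def closest_nodes(target: int, nodes: dict, k: int = 2) -> list:
    """Return the names of the k nodes whose IDs are XOR-closest to target."""
    return sorted(nodes, key=lambda name: xor_distance(nodes[name], target))[:k]


# Hypothetical 4-bit node IDs; real IDs and info hashes are 160 bits long.
nodes = {"a": 0b0001, "b": 0b0100, "c": 0b0111, "d": 0b1000}
info_hash = 0b0101

for name, node_id in nodes.items():
    print(name, format(node_id, "04b"), "distance", xor_distance(node_id, info_hash))
print("ask these nodes first:", closest_nodes(info_hash, nodes))
```

Because every client can compute the same distances, they all agree on which nodes are responsible for a given torrent without asking a central tracker.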

  • PEERS

    x = the number of peers with which you are currently sharing pieces (downloading or uploading)
    y = the total number of peers in the swarm

This works very much like Seeds.

5 (12) means you are connected to 5 peers but the tracker knows about 7 more peers to which you are NOT connected. 0 (12) means your client knows about 12 peers but you are not connected to any of them. If your file is not complete, this might mean that none of the Peers needs any of the pieces you have. 12 (4) means the tracker knows about 4 peers in your swarm, but DHT is helping you connect to peers outside your swarm.
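The x (y) reading rules for both columns can be written down as a small helper. This is an illustrative sketch only: the function names are hypothetical, and the exact column text varies between clients.

```python
import re


def parse_swarm_column(text: str):
    """Split a column value such as '5 (12)' into (connected, known_to_tracker)."""
    match = re.fullmatch(r"\s*(\d+)\s*\(\s*(\d+)\s*\)\s*", text)
    if match is None:
        raise ValueError(f"not in x (y) form: {text!r}")
    return int(match.group(1)), int(match.group(2))


def describe(text: str) -> str:
    connected, known = parse_swarm_column(text)
    if connected > known:
        # More connections than the tracker reports: found by other means, e.g. DHT.
        return f"{connected - known} found outside the tracker's swarm"
    return f"{known - connected} known to the tracker but not connected"


for value in ("5 (12)", "0 (12)", "12 (4)"):
    print(value, "->", describe(value))
```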

One last tidbit:

Trackers are not web sites.
A tracker helps clients connect to each other and collects data about your swarm. It is basically a “dumb PC” that only knows how to connect your BitTorrent client to other BitTorrent clients that are downloading the same torrent. An indexer is a website that hosts torrent files for download. So The Pirate Bay is an indexer, but not a tracker.

  • Reference

  • source code of bittorrent at cvs.sourceforge

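Products:

  1. Xunlei (Thunder): runs its own central resource servers, so it gives the best results when downloading resources hosted in China.
  2. BitTorrent: handles batch downloads and is pleasant to use.
  3. μTorrent
  4. aria2: downloads from many kinds of links, including BitTorrent, magnet, FTP, and HTTP. It runs from the command line and is highly customizable, which makes it well suited to building automated downloaders in production environments.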

Metalink

An Internet standard.

“Metalink is an extensible metadata file format that describes one or more computer files available for download.”
“Besides FTP and HTTP mirror locations and rsync, it also supports listing the P2P methods BitTorrent, ed2k, magnet link or any other that uses a URI.”
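To make the format concrete, here is a minimal sketch that reads a small Metalink document in the RFC 5854 XML namespace and lists the mirrors and P2P sources for each file. The sample document, its host names, and the helper name are made up for illustration; a real client would also check sizes and hashes.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by RFC 5854.
NS = {"ml": "urn:ietf:params:xml:ns:metalink"}

# Hypothetical Metalink document with two mirrors and one torrent source.
SAMPLE_METALINK = """
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.ext">
    <size>14471447</size>
    <url>ftp://ftp.example.com/example.ext</url>
    <url>http://example.com/example.ext</url>
    <metaurl mediatype="torrent">http://example.com/example.ext.torrent</metaurl>
  </file>
</metalink>
""".strip()


def list_sources(xml_text: str) -> dict:
    """Map each file name to its direct URLs and its (mediatype, URL) metadata links."""
    root = ET.fromstring(xml_text)
    sources = {}
    for file_el in root.findall("ml:file", NS):
        urls = [u.text for u in file_el.findall("ml:url", NS)]
        metaurls = [(m.get("mediatype"), m.text)
                    for m in file_el.findall("ml:metaurl", NS)]
        sources[file_el.get("name")] = {"urls": urls, "metaurls": metaurls}
    return sources


for name, entry in list_sources(SAMPLE_METALINK).items():
    print(name)
    for url in entry["urls"]:
        print("  mirror:", url)
    for mediatype, url in entry["metaurls"]:
        print(f"  {mediatype}:", url)
```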

Reading Materials:

  • metalink Home Page
  • The Metalink Download Description Format RFC 5854

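Emerging technology

  • project-maelstrom
    A Project Maelstrom website can be thought of as a torrent. It is decentralized and greatly weakens the notion of a server. It extends the traditional P2P idea of transferring only files by distributing the website itself as files, so the more users there are, the faster it loads.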
