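This post introduces the block compression algorithm that will be adopted in the Monoxide public blockchain system. It is a further improvement that starts from Bitcoin's BIP152. With the Txilm algorithm, the bandwidth needed to transmit a block drops to about 1/80 of the original, while the collision probability stays at around 1/1000 and the computation spent on resolving collisions is negligible. The algorithm does not depend on the mempools of full nodes being highly consistent, and it needs no additional protocol to keep mempools synchronized.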

Txilm Protocol: Lossy Block Compression with Salted Short Hashing

Background

Compact blocks carry only TXIDs, assuming that most TXes in a newly created block are already stored in the mempool of most full nodes. This was proposed in BIP152. More details can be found in bitcoincore.org/en/2016

Compact blocks basically replace each TX (roughly 300–400 bytes) with a 32-byte TXID (e.g. SHA256 of the TX). This saves nearly 10x in bandwidth.

We aim to further reduce the size of each TX representation in a compact block to around 40 bits, which achieves 6.4x compression over the original Compact blocks proposal (6.4 = 32 bytes / 40 bits). The new proposal is simple and uses no additional data structures such as Bloom filters or IBLTs. It also does not rely on mempools being consistent across full nodes.

Combined with the canonical transaction ordering rule (CTOR), the short hash can be further reduced to 32 bits, which yields 8x compression over the original Compact blocks proposal. In total, this is an 80x reduction in data size relative to transmitting full transactions (10 x 8).
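The ratios quoted above follow from simple arithmetic. The sketch below reproduces them, assuming an average TX size of 320 bytes, which is an illustrative value picked from the 300–400 byte range and not a figure from the proposal.

```python
TX_BYTES = 320        # assumed average TX size (illustrative, within 300-400 bytes)
TXID_BITS = 32 * 8    # 32-byte TXID carried by Compact blocks

def ratio(bits_before: float, bits_after: float) -> float:
    """Compression ratio between two per-TX representations."""
    return bits_before / bits_after

compact_vs_full = ratio(TX_BYTES * 8, TXID_BITS)        # full TX -> TXID
txilm40_vs_compact = ratio(TXID_BITS, 40)               # TXID -> 40-bit hash
txilm32_vs_compact = ratio(TXID_BITS, 32)               # TXID -> 32-bit hash (with CTOR)
total_with_ctor = compact_vs_full * txilm32_vs_compact  # full TX -> 32-bit hash

print(compact_vs_full, txilm40_vs_compact, txilm32_vs_compact, total_with_ctor)
```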

Rationale

We represent each TX in a block by a small hash value based on its TXID:

TXID-HASH = h(TXID)

Here h is a small hash function that generates 32-bit to 64-bit hash values. It can simply be CRC32, CRC40 or CRC64. The proposed compact block scheme includes only a list of TXID-HASH values, in the same order as the original list of TXes.
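As a minimal sketch of the k = 32 case, the TXID-HASH list can be produced with the CRC32 from Python's standard library. The function names are hypothetical, and the TXID is taken to be a single SHA256 of the raw TX, following the "e.g." earlier in the text; a real chain may define the TXID differently.

```python
import hashlib
import zlib

def txid(raw_tx: bytes) -> bytes:
    """32-byte TXID; assumed here to be SHA256 of the raw TX."""
    return hashlib.sha256(raw_tx).digest()

def txid_hash(txid_bytes: bytes) -> int:
    """k = 32 short hash: CRC32 of the TXID, as an unsigned integer."""
    return zlib.crc32(txid_bytes)

def encode_block(txids: list[bytes]) -> list[int]:
    """TXID-HASH list in the same order as the block's TX list."""
    return [txid_hash(t) for t in txids]
```

Each entry costs 4 bytes on the wire instead of 32. CRC40 is not available in the standard library, so a 40-bit variant would need a different h.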

Ambiguity may occur with such a small k-bit hash, and it needs to be resolved by each full node. On receiving a new block that includes the TXID-HASH list from the sender, the receiver searches for each received TXID-HASH in the hash list produced from its own mempool. For each TXID-HASH, one of three cases applies:

  1. Not found: no TX in the mempool matches the received TXID-HASH. The receiver will ask the sender or other peers for the TXID.

  2. A single match is found: the TXID is resolved.

  3. Multiple matches are found: the receiver will collect all matched TXIDs as candidates for a 2nd-stage resolving.
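The three cases can be sketched as a single pass over the received list. This is an illustration and not the reference implementation: the function names are hypothetical, and the hash function is passed in as a parameter so that any h can be used.

```python
import zlib
from collections import defaultdict

def crc32_hash(txid: bytes) -> int:
    return zlib.crc32(txid)

def classify(received, mempool_txids, h=crc32_hash):
    """Split block positions into the three cases described above."""
    # Index the mempool by short hash once, so each lookup is O(1).
    index = defaultdict(list)
    for txid in mempool_txids:
        index[h(txid)].append(txid)

    missing, resolved, ambiguous = [], {}, {}
    for pos, short in enumerate(received):
        matches = index.get(short, [])
        if not matches:
            missing.append(pos)          # case 1: request the TXID from peers
        elif len(matches) == 1:
            resolved[pos] = matches[0]   # case 2: tentatively resolved
        else:
            ambiguous[pos] = matches     # case 3: candidates for the 2nd stage
    return missing, resolved, ambiguous
```

A case 2 result is only tentative: the single match may still be the wrong transaction, which is why the Merkle root check is needed afterwards.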

In the 2nd stage, all combinations of candidates for the ambiguous TXID-HASHes are iterated, recomputing the Merkle tree for each. The correct combination yields a Merkle root that matches the one carried in the block header.

If none of the combinations in (3), or the resolved TXID list in (2), matches the Merkle root in the block header, the receiver falls back to asking the sender for the complete TXID list of the block, as described in the original Compact blocks proposal. This situation arises when at least one TXID in the receiver's mempool has the same TXID-HASH as an entry in the received TXID-HASH list but is not the transaction actually included in the block.
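A sketch of the 2nd-stage search, under the assumption of a Bitcoin-style Merkle tree (double SHA256, last node duplicated on odd levels); other chains may build the tree differently. The helper names are hypothetical. Resolved positions are passed as one-element candidate lists, so the same loop covers cases (2) and (3).

```python
import hashlib
from itertools import product

def _dsha(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txids) -> bytes:
    """Bitcoin-style Merkle root over a non-empty list of 32-byte TXIDs."""
    level = list(txids)
    if not level:
        raise ValueError("empty TXID list")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def resolve(candidates, expected_root):
    """candidates[i] is the list of possible TXIDs for block position i.

    Returns the TXID list whose Merkle root matches the header, or None,
    in which case the receiver falls back to requesting the full TXID list.
    """
    for combo in product(*candidates):
        if merkle_root(combo) == expected_root:
            return list(combo)
    return None
```

The number of combinations is the product of the candidate-list lengths, so the search stays cheap only while ambiguous positions are rare.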

An optional optimization of the 2nd-stage search is to add a lightweight pre-check before actually recomputing the Merkle root. We propose a lightweight Merkle tree that replaces SHA256 with CRC32, the CRC32-Merkle tree, with a 4-byte root. When a new block is created, the 4-byte CRC32-Merkle root is prepended to the encoded TXID-HASH data. This yields a 40x acceleration in searching for the right combination (SHA256 over 16 bytes vs. CRC32 over 8 bytes).
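The text does not specify how the CRC32-Merkle leaves are formed, so the sketch below makes two assumptions: each leaf is the CRC32 of the full TXID, and odd levels duplicate their last node. It also assumes the TXID-HASH itself is not plain CRC32 of the TXID, since colliding candidates would then share a leaf and the pre-check would filter nothing. The function names are hypothetical.

```python
import zlib
from itertools import product

def crc32_merkle_root(txids) -> int:
    """4-byte root of a Merkle tree that uses CRC32 in place of SHA256."""
    level = [zlib.crc32(t) for t in txids]   # assumed leaf: CRC32 of the TXID
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        # Each inner node hashes 8 bytes: two 4-byte children.
        level = [
            zlib.crc32(level[i].to_bytes(4, "big") + level[i + 1].to_bytes(4, "big"))
            for i in range(0, len(level), 2)
        ]
    return level[0]

def precheck(candidates, crc_root):
    """Yield only the combinations that pass the cheap CRC32-Merkle check."""
    for combo in product(*candidates):
        if crc32_merkle_root(combo) == crc_root:
            yield list(combo)
```

Combinations that survive the pre-check still have to be confirmed against the SHA256 Merkle root in the block header, because a 4-byte root can match by chance.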

Resolving ambiguity may incur additional latency. Iterating over the combinations of many ambiguous TXID-HASHes may also consume a considerable amount of CPU time. The feasibility of this proposal therefore depends heavily on the probability of hash collision, which is determined by the length of the hash value (k bits) and the size of the mempool.

Probability of a Single Collision

A single collision is defined as case 1 or case 3 occurring at least once. Such a collision can occur within the received TXID-HASH list, or between the received list and the mempool.

Given a TXID-HASH with k bits, the collision rate of any pair is 1/2^k. The probability of a single collision occurring in a new block can therefore be formulated as a generalized birthday problem. Say the mempool holds m TXes on average and the new block carries n TXes. The probability of a single collision can be approximated as:

PSC = 1 - (1 - 1/2^k)^( m*n + n*n/2 ).

For example, we have m = 60000, n = 10000:

  • k = 32, PSC = 0.14

  • k = 40, PSC = 0.00059

  • k = 48, PSC = 0.0000023
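The formula can be evaluated directly. The sketch below is one way to do it in floating point without losing precision for large k: it rewrites the power through `log1p` and `expm1`. The function name `psc` is an assumption made for illustration. It reproduces the three values above to two significant digits.

```python
import math

def psc(k: int, m: int, n: int) -> float:
    """PSC = 1 - (1 - 1/2^k)^(m*n + n*n/2), evaluated stably."""
    pairs = m * n + n * n / 2
    # (1 - x)^N = exp(N * log1p(-x)); 1 - exp(y) = -expm1(y)
    return -math.expm1(pairs * math.log1p(-(2.0 ** -k)))

for k in (32, 40, 48):
    print(f"k = {k}, PSC = {psc(k, 60000, 10000):.2g}")
```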

We recommend k = 40 as a reasonable value with a good compression ratio and a fairly low collision rate. A sufficient k is roughly proportional to log( m*n + n*n/2 ).

For example, scaling up by 100x, m = 6000000, n = 1000000:

  • k = 48, PSC = 0.023

  • k = 56, PSC = 0.000090

  • k = 64, PSC = 0.00000035

Or, scaling down to something much smaller, m = 3000, n = 200:

  • k = 24, PSC = 0.036

  • k = 32, PSC = 0.00014

  • k = 40, PSC = 0.00000056
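The logarithmic relation between a sufficient k and m*n + n*n/2 can be seen by searching for the smallest k that keeps PSC under a target. The sketch below does this by brute force; the target of 1/1000 is taken from the introduction, and the function names are hypothetical.

```python
import math

def psc(k: int, m: int, n: int) -> float:
    pairs = m * n + n * n / 2
    return -math.expm1(pairs * math.log1p(-(2.0 ** -k)))

def min_bits(m: int, n: int, target: float) -> int:
    """Smallest hash length k (in bits) with PSC <= target."""
    for k in range(1, 257):
        if psc(k, m, n) <= target:
            return k
    raise ValueError("no k up to 256 bits meets the target")

for m, n in [(3000, 200), (60000, 10000), (6000000, 1000000)]:
    print(m, n, min_bits(m, n, 1e-3))
```

Scaling both m and n by 100x multiplies the pair count by 10^4, which adds about 13 bits (log2 of 10^4) to the required k.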

Combined with CTOR

The canonical transaction ordering rule (CTOR) is an ordering scheme that sorts transactions in a block and in mempools by transaction hash; it is already deployed in the BCH network. Based on CTOR, the proposed scheme achieves a much lower collision probability and/or a much higher compression ratio.

Since transactions are ordered in both blocks and mempools, the candidate space of any ambiguous TXID-HASH is narrowed to the range bounded by the previous and next TXIDs whose TXID-HASHes are already resolved, instead of the entire mempool. Assuming the newly confirmed TXIDs are evenly distributed in the sorted mempool, the size of the potential collision space is reduced from m to m/n. A collision within the block only occurs if the ambiguous TXID-HASHes are adjacent after ordering. This dramatically reduces both the collision probability and the cost of resolving ambiguity, which allows an even shorter hash with a higher compression ratio.
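The range narrowing can be sketched with binary search over a sorted mempool. This assumes TXIDs are compared as byte strings and that the mempool is kept as a sorted list; the function name and its parameters are hypothetical.

```python
from bisect import bisect_left, bisect_right

def candidates_in_range(sorted_mempool, prev_txid, next_txid, target_hash, h):
    """Candidates for one ambiguous TXID-HASH under CTOR.

    prev_txid / next_txid are the nearest already-resolved neighbours in the
    block (None at either end). Only mempool entries strictly between them
    can occupy this position.
    """
    lo = 0 if prev_txid is None else bisect_right(sorted_mempool, prev_txid)
    hi = len(sorted_mempool) if next_txid is None else bisect_left(sorted_mempool, next_txid)
    return [t for t in sorted_mempool[lo:hi] if h(t) == target_hash]
```

With evenly spread TXIDs the slice holds about m/n entries, so far fewer mempool transactions can collide with a given position than in the unordered case.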

With transaction ordering by CTOR, the collision probability can be approximated as:

PSC = 1 - (1 - 1/2^k)^( m + n/2 )

For small blocks: m = 3000, n = 200:

  • k = 16, PSC = 0.046

  • k = 24, PSC = 0.00018

  • k = 32, PSC = 0.00000072

For medium blocks: m = 60000, n = 10000:

  • k = 16, PSC = 0.63

  • k = 24, PSC = 0.0039

  • k = 32, PSC = 0.000015

For large blocks: m = 6000000, n = 1000000:

  • k = 24, PSC = 0.32

  • k = 32, PSC = 0.0015

  • k = 40, PSC = 0.0000059
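To see the effect of CTOR at a fixed hash length, the two approximations can be evaluated side by side. The function names below are assumptions made for illustration; the formulas are the ones given in the text.

```python
import math

def _psc(exponent: float, k: int) -> float:
    """1 - (1 - 1/2^k)^exponent, evaluated stably."""
    return -math.expm1(exponent * math.log1p(-(2.0 ** -k)))

def psc_plain(k: int, m: int, n: int) -> float:
    return _psc(m * n + n * n / 2, k)

def psc_ctor(k: int, m: int, n: int) -> float:
    return _psc(m + n / 2, k)

for label, m, n in [("small", 3000, 200), ("medium", 60000, 10000), ("large", 6000000, 1000000)]:
    print(f"{label}: k = 32, plain = {psc_plain(32, m, n):.2g}, CTOR = {psc_ctor(32, m, n):.2g}")
```

For medium blocks at k = 32 the approximation drops from about 0.14 without ordering to about 0.000015 with CTOR.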

Collision Attack

It is easy to construct a new transaction whose TXID-HASH matches that of an existing transaction. Mass creation of such malicious transactions may invalidate the collision probability analysis above and make verification of new blocks costly, which eventually results in a higher fork rate. We propose two simple strategies to address this attack model.

Short Hashing with Salt

A simple defense strategy is to introduce a salt when calculating the TXID-HASH:

TXID-HASH = h( Salt + TXID )

The salt should be specific to the block carrying those TXID-HASHes and be included in the encoded data. For example, just take the CRC32-Merkle root as the Salt, or introduce another 4-byte field with random bits.
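A minimal sketch of salted short hashing with a random 4-byte salt. Here a truncated SHA256 stands in for h, which is a swapped-in choice and not the CRC named in the text: with a plain CRC, a prefixed salt does not change which equal-length inputs collide, so a CRC-based h would need the salt mixed in some other way. The function names and the 40-bit length are assumptions.

```python
import hashlib
import os

K_BYTES = 5  # 40-bit TXID-HASH (assumed length)

def new_salt() -> bytes:
    """Per-block random salt, sent along with the encoded TXID-HASH list."""
    return os.urandom(4)

def salted_txid_hash(salt: bytes, txid: bytes) -> bytes:
    """TXID-HASH = h(Salt + TXID), with h = SHA256 truncated to K_BYTES."""
    return hashlib.sha256(salt + txid).digest()[:K_BYTES]

def encode_block(txids: list[bytes]) -> tuple[bytes, list[bytes]]:
    salt = new_salt()
    return salt, [salted_txid_hash(salt, t) for t in txids]
```

The receiver reads the salt from the encoded block and rehashes its mempool with it before matching, so the mempool index has to be rebuilt for every incoming block.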

By introducing Salt: Even though existing transactions are known to all, the attacker cannot construct malicious transactions until a new block carrying them is out. Malicious transactions are then unlikely to reach full nodes before the new block is received and verified; that would require the attacker to flood malicious transactions to the entire network very fast. Theoretically this is possible (e.g. by controlling a large botnet), especially when the underlying P2P network is small. The strategy also makes malicious transactions specific to a single block, so malicious transactions spread previously cannot attack future blocks, which makes the attack much less efficient economically.

Fall back when Under Attack

Introducing Salt dramatically raises the cost of a collision attack, but in the extreme case a massive collision attack is still possible, regardless of the cost. We require miners to fall back to the TXID list when the entire network is under attack. This is also incentivized, because creating orphan blocks wastes hash rate.

Such an attack can be observed by all full nodes, including miners, when verifying incoming new blocks, simply by counting the number of ambiguous TXID-HASHes per block. If the count is significantly higher than the expected value and forks are observed, the next block should fall back to the TXID list.
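The detection rule can be sketched as a threshold test. The text does not say what counts as "significantly higher", so the `factor` and `floor` parameters below are hypothetical tuning knobs, and the expected value is approximated as the number of colliding pairs, (m*n + n*n/2) / 2^k.

```python
def expected_ambiguous(k: int, m: int, n: int) -> float:
    """Approximate expected number of collisions per block without an attack."""
    return (m * n + n * n / 2) / 2 ** k

def should_fall_back(observed: int, k: int, m: int, n: int,
                     forks_observed: bool,
                     factor: float = 10.0, floor: int = 3) -> bool:
    """True if the next block should carry the full TXID list.

    factor and floor are illustrative values, not part of the proposal.
    """
    threshold = max(floor, factor * expected_ambiguous(k, m, n))
    return forks_observed and observed > threshold
```

For example, with k = 40, m = 60000 and n = 10000 the expected count is far below one per block, so dozens of ambiguous entries in a single block together with observed forks would trigger the fallback.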


Reference

Compact Blocks FAQ

https://bitcoincore.org/en/2016/06/07/compact-blocks-faq/

BIP152: Compact Blocks

https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki

Canonical Transaction Ordering for Bitcoin

https://blog.vermorel.com/pdf/canonical-tx-ordering-2018-06-12.pdf

Wiki: Birthday problem

https://en.wikipedia.org/wiki/Birthday_problem#Cast_as_a_collision_problem

Big Number Calculation

https://www.ttmath.org/online_calculator
