前言:

LDPC码是麻省理工学院Robert Gallager于1963年在博士论文中提出的一种具有稀疏校验矩阵的分组纠错码。

为什么要稀疏校验呢,这个需要从校验码历史发展开看。

这段是LDPC发明者原话:

I invented LDPC about  60 years ago.there are Nelson those cellphones
then data transmission was painfully slow expensive and error-prone.
         my main computer at a computational power,halfway between a modern cell 
phone and a coffee machine . error correction coding was fascinating theoretical
problem that studied by many mathematicians and engineers.
        simple schemes were ineffective but complex schemes were too expensive for 
1960 technology.
       LDPC  was an interesting midpoint cute and interesting to theoreticians but thirty five years before technological feasibility.
       that's the way research is and the way it should be hard research problem take 
years to solve and should not be overly dependent on details of current technological.
        technological capabilities and research projects both proceed at their own pace
and applications  ,resolve when both are ready .

thank you

上世纪60s年代发明了LDPC 码。那个时候数据传输一方面特别慢,另一方面容易出错。

在60年代,简单的调度没有什么用,复杂的调度又太昂贵. LDPC理论码是特别有意思的,但是直到35年后才真正应用起来。

Polar 思想是:    信息传输的时候不要出错

LDPC 思想:    信息出错了不要紧,我把它纠正过来。


  一   问题起源

通常我们传输数据的时候,是肯定会出错的。发送的bits流

可能会在某一位0 变成1 ,1变成0,也可能被erase掉了,变成一个不确定的状态。

这里主要讨论erase状态,也就是BEC信道

often we need to transmit sensitive  data, such as banking information where
every symbol in the message  is vital . one mistake can corrupt, the entire message and
no matter how we communicate . we face this problem one kind of corruption  occurs when a bit flips the transmitter sends a 1 but recived as 0 ,or a 0 which is received as a 1 .

these are called  errors, another kind of corruption is an erasure which occurs when the bits are so distorted ,they're considered unknowns or  blanks in the message for this video we will  discuss this  problem in terms  of erasers ,because it simplifies the main ideas while some communication suffer from less  corruption than others .

it's important to note that corruptions are bound to occur, no channels are truly noiseless . so the problem communication engineers will always face ,is how can  we  transmit our messages perfectly when  we know some symbols will be lost in transmission


二    repetion code 

最早使用repetion code方式

如场景1:

一个bits copy 一份,同时发送两份:。

接收方收到bit流时,逐个比特对比,当其中一个被erase掉了,就用另一个的值。

, 但是也会发生场景2的情况,同一个位置,同时被erase掉了。

也可以理解每个bit都需要一个bit 做奇偶校验。

如场景2:

如果两个copy里面同一个bit 都发生错了 ,上面依然会校验失败.

解决方法:

也可以重复发3或者4次,但是又带来另一个问题,码率降低了

k:信息bit ,n 为重复次数。

这里面就有个矛盾 :

码率 和   降低解码失败概率

realize we all face this problem during ordinary voice communication,when a word
is misheard in a conversation to recover from this corruption .we simply repeat ourselfs.

technically this is called a repetion code, it's simplest error correction code, for example if we have a binary message ,it would be encoded before transmission into two copies of itself . this might received, for example,if we have a binary message it would be encoded before transmission into two copies itself this might be received. for example with some erasers as follows,these erasers can easily be decoded by comparing the two copies bit  by bit.

thus the message would be decoded correctly this compare operation is simple and thus can easily keep us with the speed of communication ,unfortunately simple repetion fails when the same bit is erased in both copies and because this quite common this strategy  leaves us very vulnerable  to failure ,also it doubles the message length no matter the number erasers we express this efficiency  [ɪˈfɪʃnsi] .

issue in terms of code rate  where code rate is the orignial  divied by the total number  of transmitted bits thus the code rate in this repetition scheme  is 50% that means half of the bis are  need to describle our original message and the other half are there 
to protect the message if we use triple repetion  [ˈtrɪpl]  三部分的; the code rate would drop to  1/3.   as of the rate drops the protection increases but there are fewer message sent so a lower code rate is a more expensive design since bit transmit has some cost so there is a balance to strike between code rate and probality of failure we  need a coding strategy with just enough protection  bits to prevent failure for any channel whether we expect a high or lowe erasure any channel whether we expect a high or low erasure rate while also maintaining fast code and decode operations to solve this problem


三  parity-check 奇偶校验 和二项分布

repetion code ,每个bit 只会保护一个bit.

效率比较低,后来到50年底出现了奇偶校验码

3.1 单奇偶校验

如上图,校验码为1个bit, (或者求出其中1的个数,如果为奇数 p=1)

最后组成一个5bit的code word 发送出去

 很多外文把 集合叫做一个set

例子:

编码过程:

要发送的信息为 1 000 11 ,加上一个奇偶校验位1(1的个数和为3,为奇数)

解码过程:

当接收方收到数据 100?11 1 时. 因为其 ,

可以直接解码出原码为 1000 11

,是不是高很多。

问题:

  当超过一个bit 被 erase ,就无法解码了

 结合二项分布,p为0.5, 这种只能纠正10%的场景。 (传输7个bit,只有一个出错,其它都正确的概率为10%)

# -*- coding: utf-8 -*-
"""
Created on Wed Apr 27 15:39:14 2022@author: chengxf2
"""
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as pltn = 6
x = np.arange(n)
p =0.5#计算每次观测的二项分布概率密度函数为: stats.binom.pmf(k,n,p) 其中n 表示测量次数,
# k 是目标结果的次数, p 为抽样概率。输出为k个元素的数组,数组的每个元素表示出现k次目标结果的概率。
y = binom.pmf(x,n,p)
plt.plot(x, y, 'o-', label='p=0.5')plt.legend()
plt.title( 'p' )
plt.show()
print('当p不同时,成功m次的能性的最大值都出现在均值处,对应概率为n*p')

3.2 多奇偶校验位

这里举例解码,当其被erase掉:

单奇偶校验简单的理解:

  只受到一个set的保护.

如果要解码成功这个,需要全部传输成功,

假设条件独立:

假设都是50%的概率erase,解码成的概率:

多奇偶校验:

同样其解码成功的概率为

=0.669

其中

: 是任意的一个set中,其解码

失败的概率

  可以看到增加其set的数量,会提高其解码成功率,

但也会带来码率的降低

need to be smarter about how we use the protection bits first realize with repetition codes each bit is only  protecting one bit of information instead we could define a single protection bit that protects the entire message。

from a single erasure no matter where it occurs to to this we use an encoder which counts the number of ones in the message and then chooses a protection bit to ensure that the total message is even .

this way we call this protection bit a parity bit the sender includes this 
parity at the end of the message now consider what happens when a  single erasure during transmission.the receiving decoder can correct the erasure with a simple operation it counts the number of ones in the received message ,and so it know the erased bit must have been a 1.

notice that a single parity-check bit yields quite an improvement to the code rate compared  repetition coding,however the chance of failure has increased compared to repetition with a single parity-check bit.
         we could only correct a single erasure and if two of more erasers occur the code fails because there isn't enough information to reconstruct the bits to decrease the chances of failure we can divied up the message into distinct sets and include multiple parity check bits where now each parity check bit protects a different set of bits from a single erasure .we call each of these sets  a parity check set and as we add more parity check bits we increase the number of erasure as we can potentially correct ,at the cost of gradually decreasing ,the code rate in general.

the number of  parity should be at least equal to the number of erases if we want to have a chance of correcting them all
       however we face the same problem with this approach  [əˈproʊtʃ]  方法 the code fails if two or more  erasers occur in any parity check set in that case the bits can't be recoverd.


四   Hamming 码

1950 年  Richard Hamming 找到了一种简单的方法纠错多个错误。当传输一个序列,

消息bit为4个,3个为校验码,可以纠正任意两个erasers.

这是一种重叠法解码思想(overlapping approach)

先解码集合A

解码出的bit 集合可能有部分跟 集合B 重叠

去更新 集合B重叠的部分

这个时候多个集合B中的子集就有可能变成集合A中的子集

每次解码时候,里面只能有一个bit是erase

如下图 被erase掉后,可以基于 校验子集分别纠正出

erase 状态。

问题:

重叠码思想后来有大量的数学家以及工程师在这个方向做了研究

但是把这个扩展到数以千计的bit,我们遇到新的问题,

这是因为对于较长的消息,大多数人认为我们可以使用更大的重叠子集,但是有许多大的重叠子集将奇偶校验位的解码过程从一个简单的操作变成了一个复杂的谜题,这是因为当子集非常大时,得到只有一个擦除的子集的机会非常小.

这防止了级联效应的开始或继续,相反,奇偶校验位必须使用更复杂的解码操作来解决,因为现在你要解决一个方程组,这个方程组随着问题数量的增加而变得越来越复杂,但这些更复杂。解码操作会将通信过程减慢到不再复杂的程度

实际使用我们需要的是一个具有非常快速解码操作的代码,所以我们剩下的最后一个问题是,我们如何扩展这种重叠子集方法,以便在非常长的消息上快速工作,从而能够处理我们遇到的任何擦除率,同时保持良好的代码速率。

是不是跟Polar 码一样,要提高解码成功率,用SCL,但是又会影响解码速度,带来一系列的问题。

1950 Richard Hamming looking at simple ways of correcting multiple errors when transmitting  a sequence of bits。 his trick was to use overlapping parity check sets which made it possible to correct  any two erasers as a 4-bit message and three parity check bits each parity check bit covers three message bits now no matter where to  erasers occur the code will always be able to correct in this case.
       we can't recover the first erasure right away since both both black and blue sets contain
two erasers .the second erasure can be recoverd first beacuse the read set has one erasure and  this provides the information we need  to recover the first bit this  casecade 
( effect where one correction allows us to solve another is very important insight ,
because instead of correcting all erasers  in one step we can remove erasers in multiple passsesor iterations it's important to note that this kind of cascade effect only works if we have at least one set with only one erasure at each step ,otherwise it would fail .

this  led to a decade of excellent research by many people built on this idea of overlapping parity sets much of this work was aimed at finding the right structure for the parity check set arrangement 。

but when we extended this  overlapping approach to very long messages with many thousands of bits we ran into a problem。this  because with longer messages most assumed that we could use larger overlapping subsets,but having many large overlapping subsets turns the process of decoding the parity check bits from a simple  operation into a complex puzzle。

this is because when the subsets are very large the chance of getting a subset in which there is only one erasure is very small ,and this prevents that cascade effect from starting or continuing, instead the parity check bits must be solved using a more complex decoding operation because now you are faced with solving a system of equations which becomes more and more complex as the number of questions grows ,but these more complex。 decoding operations slow down the communication process to the point where it's no longer 
         practical to use what we need is a code with very fast decode operations ,so we are left with a final  problem how can we extend this overlapping subset approach to work quickly over very long messages in a way that can handle any erasure rate we encounter while maintaining a good code Rate .


五 LDPC code

1960 Robert Gallagher 发明了一种新的方法解决通讯编码问题(码率和重叠码冲突问题)

在数据存储以及无线网络中广泛使用。

算法的核心是使用小的奇偶校验集,小的奇偶校验集相对大的奇偶校验集里面的

只有一个erase概率更高(可以通过二项分布计算),可以充分利用硬件的并行计算能力,

同时解决多个简单的问题,而不是一个复杂的问题。

一个消息bit 关联的奇偶校验位越多,密度越高.后面也用Tanner图来表示 ,他的研究受到 Peter Elias的启发。

LDPC 的模型,编码,解码后面会系统的介绍一下。

in 1960 a doctoral thesis by Robert Gallagher discovered a new approach to the communication coding problem,which is in wide use today particularyly   in cellular networks and data storage applications. 
     Gallagher combined to key in the first was to use smaller parity-check sets instead of larger ones this improves the chances that many parity-check sets will contain only a single erasure and enables the casecade effct to happen this was key to keeping the decoding speed fast over long message because it's faster for a computer to solve many simple problems instead of a single complex problem. which solves all of them at once.
      Gallaher realized that we need an amount of overlap that is large
enough to protect each bit well, but not so large that the correction problem 
becomes very complex he measures the amount of overlap by something he calls 
density that is how densely the message bits are connected to parity
check bits
with larger subsets we have a high density that is each message bit is connected to many parity check bits 。

with smaller subsets we have a lower density that is 
each message bit is connected to fewer parity check bits and this is where the name 
low-density parity-check codes comes from,。

but aside from the size of the sets there is also the problem of how to structure the connection between parity and message bits and that leads to his secode insight which was inspired by the work
of Peter Elias, who had recently shown that looking for the right structure
for large subsets was relatively unimportant instead a random subset arrangement 
this simplified the problem compared to trying to design an idea structure for 
the subsets ,and remarkably it did very well and that's key idea behind low-density 
parity-check codes use small overlapping randomly assigned parity check sets this 
allows us to solve eraser in a rapid casecade of simple operation no matter 
how long the meessage or what the erasure rate is while maintaining a fast decoding
speed ,but there is one last insight to understand when it came to the actual
construction of these codes recall that Hammonds construction foucused on
protecting each message bit with multiple parity check in order to better protect 
those bits in case of multiple erasers the problem is this dosent's proivde
the same protection to parity bits which can also be lost during transmission 
notice that each parity check bit is itself only connected to one parity check 
set since any bit that is connected to only one parity check set is vulnerable,
ˈvʌlnərəbl]  脆弱的,this leaves all parity check bits vulnerable.
this can make correction impossible,for example this would happen when one message
bit is erased and two parity bits it's connected to is also erased,
in this case we are stuck。

because the casecade can't happen and over very long messages
with many small parity check sets this kind of failure would become more and 
more likely to happen, the solution to this dilemma ,is to instead
view things in a more general way and  protect the parity bits and the 
message bits equally,that is we make sure all bits message and parity 
are contained in multiple check sets and now we need to understand the basics of
low-density parity-check codes or LDPC codes to see this in action , lets's do a 
very simple example now say our message is  101,recall that we chose our 
subsets randomly,to generate the code ,encoding is now done by finding a
collection of parity bits so all the check sets have an even number of ones this
involves solving a simple  set of linear equations ,
        now imagine our message is received with 3 erasers as follows to decode

this message we begin with the check sets that have a single erasure 
first such as this one and we repeat this process for every check bit
until there are no more erasers left that's one way iterative can work using 
low-density parity-check codes it's simple fast reliable,and can be extended
to messages of any length ,today many communication devices use LDPC codes
and versions of LDPC codes are being proposed for fifth generation wireless
system known as 5g but important to note that when these coats came out they 
did not have a big impact at first and  were almost forgotten for many years

LDCP 02 - 奇偶校验码发展史相关推荐

  1. 2020软件测试自学全套教程-基于python自动化软件测试-2020新版软件测试中级程序员学习路线

    不知不觉间,在软件测试行业野蛮的折腾了七年之久.七年之痒也即将过去,但我还是热爱着软件测试这一份工作,一路坚持,走到现在.经历过各种难题,有过迷茫,有过焦虑失眠.也踩过无数的坑,深知行业的不易.自从9 ...

  2. 2022软件测试自学全套教程-基于python自动化软件测试-2022新版软件测试中级程序员学习路线

    不知不觉间,在软件测试行业野蛮的折腾了七年之久.七年之痒也即将过去,但我还是热爱着软件测试这一份工作,一路坚持,走到现在.经历过各种难题,有过迷茫,有过焦虑失眠.也踩过无数的坑,深知行业的不易.自从9 ...

  3. 一篇有点长的HTTP协议详解

    一.HTTP简介 01.什么是HTTP HTTP(HyperText Transfer Protocol ,超文本传输协议),是一个基于请求与响应的,无状态的,应用层的协议,常基于TCP/IP协议传输 ...

  4. 艾永亮:一瓶酱油市值超过中石化!海天味业到底有多牛?

    他成功地把"一瓶酱油",变成了一款超级产品. 01 酱油界的茅台 2020年9月3号,海天味业股价到达203元高位,市值成功突破6500亿,远远超过4800亿的中国石化,逼近780 ...

  5. 【Linux】一步一步学Linux——Unix发展史(02)

    目录 00. 目录 01. 请参考Unix传奇 02. 03. 00. 目录 @ 本博客后面会更新 01. 请参考Unix传奇 链接: https://coolshell.cn/articles/23 ...

  6. 数仓建模—数仓架构发展史(02)

    发展史 时代的变迁,生死的轮回,历史长河滔滔,没有什么是永恒的,只有变化才是不变的,技术亦是如此,当你选择互联网的那一刻,你就相当于乘坐了一个滚滚向前的时代列车,开往未知的方向,不论什么样的技术架构只 ...

  7. 02,计算机组成与操作系统的理解、对程序的理解、OS发展史、Linux常见版本 以及 一些杂项...

    这里马哥讲了好多,能听懂,但需要记得东西好少,不过我还是 按照 我 对 计算机 的 理解 画了 一张 图, 不一定 准确 , 但是 差不多 把, 计算机 底层 是 由 运算器.控制器.存储器.输入设备 ...

  8. 电脑的发展史_新时代新征程,新华电脑教育母公司中国东方教育成功港股上市!...

    中国最大的职业技能教育提供商中国东方教育控股有限公司(股票代码:http://667.HK),6月12日上午9点30分在港交所成功上市,成为港股教育板块中最大的职业技能培训企业. 作为中国东方教育旗下 ...

  9. 必知!4张图看尽AI发展史重大里程碑

    导读:本文主要是人工智能的基本概述,包括人工智能的起源和发展,以及基于机器学习和深度学习的下一代人工智能. 4张配图(图1至图4)是本文重点,是了解人工智能必备的基础知识. 作者:王健宗 瞿晓阳 来源 ...

最新文章

  1. VTK:Snippets之ViewportBorders
  2. 什么样的程序员会让人讨厌?大家注意,千万不要成为这种程序员!
  3. 刷题-字符串-Java
  4. Python安装常见问题:ModuleNotFoundError: No module named ‘_ctypes‘ 解决办法
  5. 修改数据表DataTable某一列的类型和记录值
  6. 小米路由器mini 刷潘多拉固件教程
  7. 在Oracle Data Guard环境中的RMAN备份和还原
  8. python自动化实现QQ自动发说说
  9. 笔记:常见TCP端口号
  10. 安全运营和应急响应详解
  11. 快速迭代内部学习心得
  12. H3C基础配置文档抄录10-ACL和QoS配置
  13. 2022 年牛客多校第五场补题记录
  14. 8代9代cpu平台改换win7的实践经验!
  15. 大气辐射学期中知识点总结
  16. 今日推荐:2068个开源的网站模板【免费下载】
  17. 执行xpath时提示,需要命名空间管理器或XsltContext。此查询具有前缀、变量或用户定义的函数
  18. NoSQL数据库资料
  19. 25A - IQ test
  20. STM32跑MicroPython的链接脚本分析

热门文章

  1. BSON及BSON和JSON的区别
  2. 从91、搜狗、UC看BAT的投资眼光,谁更技高一筹?
  3. habit list android,Habit List,助你养成好习惯
  4. CSDN博文大赛赛况简报
  5. 知名mac软件资源站macwk关站,倒闭原因令人叹息
  6. 计算机审计应用的心得体会,计算机审计中级培训心得体会范文
  7. CSS基础-01-基础认知
  8. android设备绑定微软,Android 7.1.1 手机如何登录微软账号
  9. 【SAP-CO】成本中心实际过账
  10. Java获取播放文件的时长