A Python Implementation of the B+ Tree

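This code comes from the 2018 blog post "关于 B+tree (附 python 模拟代码)" ("On the B+ tree, with Python simulation code"), published by the user 死里逃生 on the 极客学院 site. The original implements insertion, deletion, and range search for a B+ tree and is fairly complete, but it also has a number of problems. This post corrects several of those errors on top of the original code so that it is easier to reuse.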

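Main contributions

  1. Replaced syntax that Python 3 does not support with Python 3 equivalents, mainly by replacing `__cmp__` with `__lt__` and `__gt__`.
  2. Fixed a bug in the search algorithm: when the upper bound of a range query is not present in the tree, the original code returns a result with extra or missing elements. Point queries are not affected.
  3. Added a `Size` member to the B+ tree class that reports how many entries are currently stored.
  4. Added more tests, which readers can use to check whether the features they need work as expected.
  5. Unfixed issue: the insertion and deletion algorithms have a serious flaw. Neither of them updates the keys in the upper (index) levels unless a merge or a split is triggered.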
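Item 1 can be illustrated with a small sketch. Python 3 ignores `__cmp__`, so ordering has to come from the rich comparison methods. The code in this post spells those methods out by hand; an alternative, shown here only as an assumption and not as what the original author did, is to define `__eq__` and `__lt__` and let `functools.total_ordering` derive the rest. The class name `Pair` and the helper `_key` are hypothetical.

```python
from functools import total_ordering


@total_ordering
class Pair:
    """Hypothetical key-value pair that is ordered by its key only."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    @staticmethod
    def _key(other):
        # Allow comparison against another Pair or against a bare key.
        return other.key if isinstance(other, Pair) else other

    def __eq__(self, other):
        return self.key == self._key(other)

    def __lt__(self, other):
        return self.key < self._key(other)


pairs = sorted([Pair(3, 'c'), Pair(1, 'a'), Pair(2, 'b')])
keys = [p.key for p in pairs]
print(keys)               # [1, 2, 3]
print(Pair(2, 'b') <= 2)  # True: __le__ is derived by total_ordering
```

Deriving `__le__` and `__ge__` matters for a tree like this one, because a range scan typically compares stored pairs against the upper bound with `<=`.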
## Usage notes:
## 1. Every lookup is a range query; a point query must pass equal lower and
##    upper bounds. The return value is always a list of KeyValue objects,
##    even for a point query.
## 2. delete() takes a key-value pair, not just a key; it works well together
##    with search(), which returns such pairs.
## 3. Inserting the same key more than once is not rejected, even when the
##    value is identical; duplicates are still inserted in order, left to right.
## 4. A B+ tree leaf is not a single key-value pair but a node holding several
##    of them; the sibling pointers link leaf nodes, not individual pairs.

from collections import deque


def bisect_right(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, a.insert(x) will
    insert just after the rightmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo


def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


class InitError(Exception):
    pass


class ParaError(Exception):
    pass


class KeyValue(object):
    """A key-value pair that compares by key, against a KeyValue or a bare key."""
    __slots__ = ('key', 'value')

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __str__(self):
        return str((self.key, self.value))

    @staticmethod
    def _key_of(other):
        # unwrap the key when the other operand is a KeyValue
        return other.key if isinstance(other, KeyValue) else other

    def __eq__(self, other):
        return self.key == self._key_of(other)

    def __ne__(self, other):
        return self.key != self._key_of(other)

    def __lt__(self, other):
        return self.key < self._key_of(other)

    def __gt__(self, other):
        return self.key > self._key_of(other)

    # search() compares with <=, which Python 3 does not derive from __lt__
    def __le__(self, other):
        return self.key <= self._key_of(other)

    def __ge__(self, other):
        return self.key >= self._key_of(other)


class Bptree_InterNode(object):

    def __init__(self, M):
        if not isinstance(M, int):
            raise InitError('M must be int')
        if M <= 3:
            raise InitError('M must be greater than 3')
        else:
            self.__M = M
            # children: Bptree_InterNode objects for an index node,
            # Bptree_Leaf objects for the level just above the leaves
            self.clist = []
            self.ilist = []  # index (separator) keys
            self.par = None

    def isleaf(self):
        return False

    def isfull(self):
        return len(self.ilist) >= self.M - 1

    def isempty(self):
        return len(self.ilist) <= (self.M + 1) // 2 - 1

    @property
    def M(self):
        return self.__M


class Bptree_Leaf(object):

    def __init__(self, L):
        if not isinstance(L, int):
            raise InitError('L must be int')
        else:
            self.__L = L
            self.vlist = []
            self.bro = None
            self.par = None

    def isleaf(self):
        return True

    def isfull(self):
        return len(self.vlist) > self.L

    def isempty(self):
        return len(self.vlist) <= (self.L + 1) // 2  # fill factor used by delete

    @property
    def L(self):
        return self.__L


class Bptree(object):

    def __init__(self, M, L):  # M is the order, L is the fill factor
        if L > M:
            raise InitError('L must be less than or equal to M')
        else:
            self.__M = M
            self.__L = L
            self.__Size = 0
            self.__root = Bptree_Leaf(L)
            self.__leaf = self.__root

    @property
    def M(self):
        return self.__M

    @property
    def L(self):
        return self.__L

    @property
    def Size(self):
        return self.__Size

    def insert(self, key_value):
        node = self.__root

        def split_node(n1):
            mid = self.M // 2
            newnode = Bptree_InterNode(self.M)
            newnode.ilist = n1.ilist[mid:]
            newnode.clist = n1.clist[mid:]
            newnode.par = n1.par
            for c in newnode.clist:
                c.par = newnode
            if n1.par is None:
                newroot = Bptree_InterNode(self.M)
                newroot.ilist = [n1.ilist[mid - 1]]
                newroot.clist = [n1, newnode]
                n1.par = newnode.par = newroot
                self.__root = newroot
            else:
                i = n1.par.clist.index(n1)
                n1.par.ilist.insert(i, n1.ilist[mid - 1])
                n1.par.clist.insert(i + 1, newnode)
            n1.ilist = n1.ilist[:mid - 1]
            n1.clist = n1.clist[:mid]
            return n1.par

        def split_leaf(n2):
            mid = (self.L + 1) // 2
            newleaf = Bptree_Leaf(self.L)
            newleaf.vlist = n2.vlist[mid:]
            if n2.par is None:
                newroot = Bptree_InterNode(self.M)
                newroot.ilist = [n2.vlist[mid].key]
                newroot.clist = [n2, newleaf]
                n2.par = newleaf.par = newroot
                self.__root = newroot
            else:
                i = n2.par.clist.index(n2)
                n2.par.ilist.insert(i, n2.vlist[mid].key)
                n2.par.clist.insert(i + 1, newleaf)
                newleaf.par = n2.par
            n2.vlist = n2.vlist[:mid]
            # the new leaf inherits the old right sibling, so the leaf chain
            # stays connected when a leaf in the middle of the chain splits
            newleaf.bro = n2.bro
            n2.bro = newleaf

        def insert_node(n):
            if not n.isleaf():
                if n.isfull():
                    insert_node(split_node(n))
                else:
                    p = bisect_right(n.ilist, key_value)
                    insert_node(n.clist[p])
            else:
                p = bisect_right(n.vlist, key_value)
                n.vlist.insert(p, key_value)
                self.__Size += 1
                if n.isfull():
                    split_leaf(n)
                else:
                    return

        insert_node(node)

    def search(self, mi=None, ma=None):
        result = []
        node = self.__root
        leaf = self.__leaf
        if mi is None and ma is None:
            raise ParaError('you need to setup searching range')
        elif mi is not None and ma is not None and mi > ma:
            raise ParaError('upper bound must be greater than or equal to lower bound')

        def search_key(n, k):
            if n.isleaf():
                p = bisect_left(n.vlist, k)
                return (p, n)
            else:
                p = bisect_right(n.ilist, k)
                return search_key(n.clist[p], k)

        if mi is None:
            # no lower bound: scan from the leftmost leaf up to ma
            while True:
                for kv in leaf.vlist:
                    if kv <= ma:
                        result.append(kv)
                    else:
                        return result
                if leaf.bro is None:
                    return result
                else:
                    leaf = leaf.bro
        elif ma is None:
            # no upper bound: scan from mi to the rightmost leaf
            index, leaf = search_key(node, mi)
            result.extend(leaf.vlist[index:])
            while True:
                if leaf.bro is None:
                    return result
                else:
                    leaf = leaf.bro
                    result.extend(leaf.vlist)
        else:
            if mi == ma:
                # point query
                i, l = search_key(node, mi)
                try:
                    if l.vlist[i] == mi:
                        result.append(l.vlist[i])
                        return result
                    else:
                        return result
                except IndexError:
                    return result
            else:
                i1, l1 = search_key(node, mi)
                i2, l2 = search_key(node, ma)
                # Handles an upper bound ma that is not stored in the tree:
                # the entry at i2 is included only if it exists and equals ma.
                if i2 < len(l2.vlist) and l2.vlist[i2] == ma:
                    i2 += 1
                if l1 is l2:
                    result.extend(l1.vlist[i1:i2])
                    return result
                else:
                    result.extend(l1.vlist[i1:])
                    l = l1
                    while True:
                        if l.bro is l2:
                            result.extend(l2.vlist[:i2])
                            return result
                        else:
                            result.extend(l.bro.vlist)
                            l = l.bro

    def traversal(self):
        result = []
        l = self.__leaf
        while True:
            result.extend(l.vlist)
            if l.bro is None:
                return result
            else:
                l = l.bro

    def show(self):
        print('this b+tree is:\n')
        q = deque()
        h = 0
        q.append([self.__root, h])
        while True:
            try:
                w, hei = q.popleft()
            except IndexError:
                return
            else:
                if not w.isleaf():
                    print(w.ilist, 'the height is', hei)
                    if hei == h:
                        h += 1
                    q.extend([[i, h] for i in w.clist])
                else:
                    print([v.key for v in w.vlist], 'the leaf is,', hei)

    def delete(self, key_value):

        def merge(n, i):
            if n.clist[i].isleaf():
                n.clist[i].vlist = n.clist[i].vlist + n.clist[i + 1].vlist
                n.clist[i].bro = n.clist[i + 1].bro
            else:
                n.clist[i].ilist = n.clist[i].ilist + [n.ilist[i]] + n.clist[i + 1].ilist
                # re-parent the children that move into the left node
                for c in n.clist[i + 1].clist:
                    c.par = n.clist[i]
                n.clist[i].clist = n.clist[i].clist + n.clist[i + 1].clist
            n.clist.remove(n.clist[i + 1])
            n.ilist.remove(n.ilist[i])
            if n.ilist == []:
                n.clist[0].par = None
                self.__root = n.clist[0]
                del n
                return self.__root
            else:
                return n

        def tran_l2r(n, i):
            if not n.clist[i].isleaf():
                # move the last child of node i to the front of node i + 1
                n.clist[i + 1].clist.insert(0, n.clist[i].clist[-1])
                n.clist[i].clist[-1].par = n.clist[i + 1]
                # prepend the index key to node i + 1 and update the key in n
                # (previous version, kept for reference)
                ## n.clist[i + 1].ilist.insert(0, n.clist[i].ilist[-1])
                ## n.ilist[i + 1] = n.clist[i].ilist[-1]
                ## n.clist[i].clist.pop()
                ## n.clist[i].ilist.pop()
                # edit:
                n.clist[i + 1].ilist.insert(0, n.ilist[i])
                n.ilist[i] = n.clist[i].ilist[-1]
                n.clist[i].clist.pop()
                n.clist[i].ilist.pop()
            else:
                n.clist[i + 1].vlist.insert(0, n.clist[i].vlist[-1])
                n.clist[i].vlist.pop()
                n.ilist[i] = n.clist[i + 1].vlist[0].key

        def tran_r2l(n, i):
            if not n.clist[i].isleaf():
                n.clist[i].clist.append(n.clist[i + 1].clist[0])
                n.clist[i + 1].clist[0].par = n.clist[i]
                n.clist[i].ilist.append(n.ilist[i])
                n.ilist[i] = n.clist[i + 1].ilist[0]
                n.clist[i + 1].clist.remove(n.clist[i + 1].clist[0])
                n.clist[i + 1].ilist.remove(n.clist[i + 1].ilist[0])
            else:
                n.clist[i].vlist.append(n.clist[i + 1].vlist[0])
                n.clist[i + 1].vlist.remove(n.clist[i + 1].vlist[0])
                n.ilist[i] = n.clist[i + 1].vlist[0].key

        def del_node(n, kv):
            if not n.isleaf():
                p = bisect_right(n.ilist, kv)
                if p == len(n.ilist):
                    if not n.clist[p].isempty():
                        return del_node(n.clist[p], kv)
                    elif not n.clist[p - 1].isempty():
                        tran_l2r(n, p - 1)
                        return del_node(n.clist[p], kv)
                    else:
                        return del_node(merge(n, p - 1), kv)
                else:
                    if not n.clist[p].isempty():
                        return del_node(n.clist[p], kv)
                    elif not n.clist[p + 1].isempty():
                        tran_r2l(n, p)
                        return del_node(n.clist[p], kv)
                    else:
                        return del_node(merge(n, p), kv)
            else:
                p = bisect_left(n.vlist, kv)
                try:
                    pp = n.vlist[p]
                except IndexError:
                    return -1
                else:
                    if pp != kv:
                        return -1
                    else:
                        n.vlist.remove(kv)
                        self.__Size -= 1
                        return 0

        del_node(self.__root, key_value)

    @property
    def leaf(self):
        return self.__leaf


def test():
    mini = 2
    maxi = 60
    testlist = []
    for i in range(1, 10):
        key = i
        value = i
        testlist.append(KeyValue(key, value))
    mybptree = Bptree(4, 4)
    for kv in testlist:
        mybptree.insert(kv)
    mybptree.delete(testlist[0])
    mybptree.show()
    print('\nkey of this b+tree is \n')
    print([kv.key for kv in mybptree.traversal()])
    print([kv.key for kv in mybptree.search(mini, maxi)])


# if __name__ == '__main__':
#     test()

## B = Bptree(4,4)
## for i in range(10):
##     B.insert(KeyValue(1 + i, i**2))

# kv = B.search(10, 12)
# print(len(kv))

## kvs = B.traversal()
## for k_v in kvs:
##     print(k_v)
## print("---------")
## B.insert(KeyValue(0, 0))
## B.insert(KeyValue(1, 10))
## kvs = B.traversal()
## for k_v in kvs:
##     print(k_v)

## leaf = B.leaf
## smallest = B.leaf.vlist[0]
## print(smallest.key)

## B.insert(KeyValue(0,-1))

## leaf = B.leaf
## smallest = B.leaf.vlist[0]
## print(smallest.key)

B = Bptree(4, 4)
print("---check the insertion algorithm---")
for i in range(10):
    B.insert(KeyValue(1 + i, i**2))
B.insert(KeyValue(0, -1))
print(B.Size)

print("---check that the Size member is accurate---")
B.delete(KeyValue(6, 25))
print(B.Size)

print("---check that a search spanning several leaves works---")
result = B.search(3, 6)
for kv in result:
    print(kv)

print("---check that a range query works when the keys are not contiguous---")
B.delete(KeyValue(7, 36))
# B.insert(KeyValue(6, 25))
# B.insert(KeyValue(7, 36))
result = B.search(3, 8)
for kv in result:
    print(kv)
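The last two checks exercise range queries whose upper bound may be missing from the tree. The boundary rule itself can be sketched on a plain sorted list, independently of the tree. This is a minimal illustration using the standard library `bisect` module, not the tree's own search routine, and the function name `range_slice` is hypothetical: taking `bisect_left` for the lower bound and `bisect_right` for the upper bound gives an inclusive range whether or not either bound is actually stored.

```python
from bisect import bisect_left, bisect_right


def range_slice(sorted_keys, lo, hi):
    """Return every key k with lo <= k <= hi from a sorted list."""
    start = bisect_left(sorted_keys, lo)   # first index whose key is >= lo
    stop = bisect_right(sorted_keys, hi)   # first index whose key is > hi
    return sorted_keys[start:stop]


keys = [0, 1, 2, 3, 4, 5, 8, 9, 10]        # 6 and 7 are missing
print(range_slice(keys, 3, 8))   # [3, 4, 5, 8]  upper bound present
print(range_slice(keys, 3, 7))   # [3, 4, 5]     upper bound absent
print(range_slice(keys, 6, 7))   # []            nothing in range
```

Because `bisect_right` stops after the last key equal to `hi`, this rule also keeps duplicate keys at the upper bound, which a `bisect_left` lookup followed by a single equality check does not.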
