Preface


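In HDFS, large numbers of blocks are being created and deleted at every moment, and this huge population of blocks is what makes up this complex distributed system. Most people know at least a little about ordinary block reads, writes, and deletes, but how many know about the mechanism for cleaning up excess replicas, that is, the handling of over-replicated blocks? How does HDFS deal with excess replicas, when does it do so, and what policy does it follow? This article shares what I have learned about this part of HDFS.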

Over-replicated blocks and when they occur


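Put simply, over-replication looks like this: block A has 3 replicas in the cluster, which satisfies the standard 3-replica policy, but after some event A suddenly has 5 replicas. To get back to the standard replication factor of 3, the system removes the 2 surplus replicas, and that removal is what this article focuses on. The phenomenon itself is easy to explain, so the real question is: which underlying causes or conditions produce surplus replicas (in HDFS specifically)? From reading the HDFS source code, I have identified the following 3: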

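  • Recommissioning: a node is brought back online. This is caused by an operations action. Decommissioning a node causes a large number of its blocks to be copied elsewhere in the cluster; once the decommission is cancelled, the copies already made inevitably become surplus replicas.

  • Manually resetting a block's replication factor. Take block A again: it currently has the standard 3 replicas, and a user, Zhang San, calls the HDFS API method setReplication to set the replication factor to 1. Block A then has 2 surplus replicas, even though the standard replication factor in HDFS is still 3.

  • Records of newly added blocks are lost inside the system. Unlike the previous 2 cases, this one has an internal cause. BlockManager rescans these lost records of newly added blocks to prevent over-replication.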
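All 3 cases above come down to the same condition: a block has more live replicas than its target replication factor. A minimal sketch of that arithmetic follows. `ExcessCount` and `excess` are hypothetical names used only for illustration; this is not HDFS code.

```java
/** Toy model (not HDFS code): how many replicas of a block are surplus. */
final class ExcessCount {
  /**
   * Surplus replicas are the live replicas beyond the target replication
   * factor. An under-replicated block has no surplus, so the result is
   * never negative.
   */
  static int excess(int liveReplicas, int targetReplication) {
    return Math.max(0, liveReplicas - targetReplication);
  }
}
```

With the numbers used in this article, `ExcessCount.excess(5, 3)` (block A after recommissioning) and `ExcessCount.excess(3, 1)` (after setReplication lowers the target to 1) both give 2.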

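OK, those are the 3 situations that can produce over-replicated blocks. How each of them eventually reaches the code that handles excess replicas is described later; first, let's look at how excess replicas are selected and disposed of.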

OverReplication: handling excess replicas


Handling excess replicas consists of 2 sub-steps:

  • Selecting the excess replicas
  • Processing the selected excess replicas

Let's trace each of them through the source code, starting with how the replicas are selected.

Selecting excess replicas


Start with BlockManager's processOverReplicatedBlock method; the method name already makes its purpose obvious.

```java
/**
 * Find how many of the containing nodes are "extra", if any.
 * If there are any extras, call chooseExcessReplicates() to
 * mark them in the excessReplicateMap.
 */
private void processOverReplicatedBlock(final Block block,
    final short replication, final DatanodeDescriptor addedNode,
    DatanodeDescriptor delNodeHint) {
```

The Javadoc says: find the nodes that are "extra" and, if there are any, call chooseExcessReplicates to mark them and add them to excessReplicateMap. The details follow.

```java
// Declare the list of candidate storages
Collection<DatanodeStorageInfo> nonExcess = new ArrayList<DatanodeStorageInfo>();
// Ask corruptReplicas which nodes hold a corrupt replica of this block
Collection<DatanodeDescriptor> corruptNodes = corruptReplicas.getNodes(block);
```

The processing continues:

```java
// Iterate over the storages that hold this over-replicated block
for (DatanodeStorageInfo storage : blocksMap.getStorages(block, State.NORMAL)) {
  final DatanodeDescriptor cur = storage.getDatanodeDescriptor();
  ...
  LightWeightLinkedSet<Block> excessBlocks = excessReplicateMap.get(cur.getDatanodeUuid());
  // If the block is not already recorded in excessReplicateMap for this node
  if (excessBlocks == null || !excessBlocks.contains(block)) {
    // and the node is neither decommissioning nor decommissioned
    if (!cur.isDecommissionInProgress() && !cur.isDecommissioned()) {
      // and this replica is not corrupt
      // exclude corrupt replicas
      if (corruptNodes == null || !corruptNodes.contains(cur)) {
        // add this storage of the over-replicated block to the candidate list
        nonExcess.add(storage);
      }
    }
  }
}
```

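So nonExcess is really a list of candidates: the storages holding the block's replicas are re-checked against several conditions, and the ones that fail are dropped. Finally, the method that picks the actual excess replicas is called: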
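The three nested checks can be read as a single filter. The sketch below is a toy model of that filter, not the HDFS implementation: `CandidateFilter`, `Replica`, and its boolean flags are hypothetical stand-ins for DatanodeStorageInfo and the lookups in excessReplicateMap and corruptReplicas.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not HDFS code) of how the nonExcess candidate list is built. */
final class CandidateFilter {
  /** Hypothetical stand-in for one replica location and its state. */
  static final class Replica {
    final String node;
    final boolean alreadyExcess;   // already recorded in the excess map
    final boolean decommissioning; // node is decommissioning or decommissioned
    final boolean corrupt;         // replica is known to be corrupt

    Replica(String node, boolean alreadyExcess, boolean decommissioning,
            boolean corrupt) {
      this.node = node;
      this.alreadyExcess = alreadyExcess;
      this.decommissioning = decommissioning;
      this.corrupt = corrupt;
    }
  }

  /** Keeps only the replicas that pass all three checks, in input order. */
  static List<String> nonExcess(List<Replica> replicas) {
    List<String> candidates = new ArrayList<>();
    for (Replica r : replicas) {
      if (!r.alreadyExcess && !r.decommissioning && !r.corrupt) {
        candidates.add(r.node);
      }
    }
    return candidates;
  }
}
```

A replica that fails any one check is simply left out of the candidate list, so it is neither counted toward the replication factor nor considered for deletion in this pass.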

```java
chooseExcessReplicates(nonExcess, block, replication, addedNode, delNodeHint, blockplacement);
```

Inside chooseExcessReplicates:

```java
// first form a rack to datanodes map and
BlockCollection bc = getBlockCollection(b);
final BlockStoragePolicy storagePolicy = storagePolicySuite.getPolicy(bc.getStoragePolicyID());
final List<StorageType> excessTypes = storagePolicy.chooseExcess(
    replication, DatanodeStorageInfo.toStorageTypes(nonExcess));

// Initialize the rack -> storage list map
final Map<String, List<DatanodeStorageInfo>> rackMap
    = new HashMap<String, List<DatanodeStorageInfo>>();
// Storages on racks that hold more than one replica
final List<DatanodeStorageInfo> moreThanOne = new ArrayList<DatanodeStorageInfo>();
// Storages on racks that hold exactly one replica
final List<DatanodeStorageInfo> exactlyOne = new ArrayList<DatanodeStorageInfo>();
```

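Why split the candidates into different lists? Because the designers set a priority here: among the candidates, replicas on racks that hold more than one replica of the block are chosen first, and only then replicas on racks that hold exactly one. Note that, going by the source comments, the count is per rack rather than per node. The design is easy to understand: a rack with several replicas has more to spare, so deletion naturally starts there, and that rack still keeps a copy afterwards.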

```java
// split nodes into two sets
// moreThanOne contains nodes on rack with more than one replica
// exactlyOne contains the remaining nodes
replicator.splitNodesWithRack(nonExcess, rackMap, moreThanOne, exactlyOne);
```

Inside the split method:

```java
public void splitNodesWithRack(
    final Iterable<DatanodeStorageInfo> storages,
    final Map<String, List<DatanodeStorageInfo>> rackMap,
    final List<DatanodeStorageInfo> moreThanOne,
    final List<DatanodeStorageInfo> exactlyOne) {
  // Walk the candidate storages and build the rack -> storage list map
  for (DatanodeStorageInfo s : storages) {
    final String rackName = getRack(s.getDatanodeDescriptor());
    List<DatanodeStorageInfo> storageList = rackMap.get(rackName);
    if (storageList == null) {
      storageList = new ArrayList<DatanodeStorageInfo>();
      rackMap.put(rackName, storageList);
    }
    storageList.add(s);
  }
```

The split algorithm itself:

```java
  // split nodes into two sets
  for (List<DatanodeStorageInfo> storageList : rackMap.values()) {
    if (storageList.size() == 1) {
      // exactlyOne contains nodes on rack with only one replica
      // A rack with a single candidate storage holds one replica; otherwise it holds several
      exactlyOne.add(storageList.get(0));
    } else {
      // moreThanOne contains nodes on rack with more than one replica
      moreThanOne.addAll(storageList);
    }
  }
```

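The reasoning behind this split is presumably tied to the block placement policy. I have not looked into it closely, and readers can study the BlockPlacementPolicy code for themselves. After this code the candidates fall into 2 groups, exactlyOne and moreThanOne. That completes the first half of chooseExcessReplicates; now for the second half: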
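To make the grouping concrete, here is a small self-contained sketch of the same two passes over plain strings. It is a toy model, not the HDFS implementation: `RackSplit`, the node names, and the rack names are hypothetical, and a node-to-rack map stands in for getRack.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not HDFS code) of splitting candidates by rack. */
final class RackSplit {
  /**
   * nodeToRack maps each candidate node to its rack. Nodes on racks with
   * several candidates go to moreThanOne; a node that is alone on its rack
   * goes to exactlyOne.
   */
  static void split(Map<String, String> nodeToRack,
                    List<String> moreThanOne, List<String> exactlyOne) {
    // Pass 1: group the candidate nodes by rack.
    Map<String, List<String>> rackMap = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : nodeToRack.entrySet()) {
      rackMap.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
             .add(e.getKey());
    }
    // Pass 2: split by the number of candidates on each rack.
    for (List<String> nodes : rackMap.values()) {
      if (nodes.size() == 1) {
        exactlyOne.add(nodes.get(0));
      } else {
        moreThanOne.addAll(nodes);
      }
    }
  }
}
```

For candidates dn1 and dn2 on /rackA and dn3 on /rackB, moreThanOne receives dn1 and dn2 while exactlyOne receives dn3.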

```java
// pick one node to delete that favors the delete hint
// otherwise pick one with least space from priSet if it is not empty
// otherwise one node with least space from remains
boolean firstOne = true;
final DatanodeStorageInfo delNodeHintStorage
    = DatanodeStorageInfo.getDatanodeStorageInfo(nonExcess, delNodeHint);
final DatanodeStorageInfo addedNodeStorage
    = DatanodeStorageInfo.getDatanodeStorageInfo(nonExcess, addedNode);
```

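The 3 comment lines above convey 2 points:

  • A node to delete from can be passed in directly; when it is usable, the passed-in delHint node is preferred.
  • Within each candidate list, the node with the least free space is preferred. This is also easy to understand: among otherwise comparable candidates, it makes sense to pick the node with as little free space as possible, in order to free up space there.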

```java
// While the candidates outnumber the target replication factor, keep removing
while (nonExcess.size() - replication > 0) {
  final DatanodeStorageInfo cur;
  // Check whether the delNodeHintStorage node can be used instead
  if (useDelHint(firstOne, delNodeHintStorage, addedNodeStorage,
      moreThanOne, excessTypes)) {
    cur = delNodeHintStorage;
  } else { // regular excessive replica removal
    // otherwise fall back to the regular selection
    cur = replicator.chooseReplicaToDelete(bc, b, replication,
        moreThanOne, exactlyOne, excessTypes);
  }
```
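The shape of this loop can be sketched in isolation. The version below is a deliberately simplified toy model, not the HDFS implementation: `ExcessLoop` is a hypothetical name, the hint check is reduced to "first iteration and the hint is still a candidate" (the real useDelHint checks are not reproduced), and the regular branch only compares free space, ignoring racks, storage types, and heartbeats.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not HDFS code) of the excess-replica removal loop. */
final class ExcessLoop {
  /**
   * freeSpaceByNode maps each candidate node to its free space.
   * Returns the nodes chosen for deletion, in the order they were chosen.
   * Assumes replication is not negative.
   */
  static List<String> chooseExcess(Map<String, Long> freeSpaceByNode,
                                   int replication, String delHint) {
    // Work on a copy so the caller's map is left untouched.
    Map<String, Long> nonExcess = new LinkedHashMap<>(freeSpaceByNode);
    List<String> removed = new ArrayList<>();
    boolean firstOne = true;
    while (nonExcess.size() - replication > 0) {
      String cur = null;
      if (firstOne && delHint != null && nonExcess.containsKey(delHint)) {
        // Simplified hint check: only honoured on the first iteration.
        cur = delHint;
      } else {
        // Regular removal: pick the candidate with the least free space.
        for (Map.Entry<String, Long> e : nonExcess.entrySet()) {
          if (cur == null || e.getValue() < nonExcess.get(cur)) {
            cur = e.getKey();
          }
        }
      }
      firstOne = false;
      nonExcess.remove(cur);
      removed.add(cur);
    }
    return removed;
  }
}
```

With 5 candidates and a target of 3, the loop runs twice: once for the hint if one is given and still a candidate, and then for the candidate with the least free space.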

The logic that decides whether delNodeHintStorage can be used is skipped here. The key method is chooseReplicaToDelete, since this branch is the one taken most often.

```java
// Pick the node with the oldest heartbeat or with the least free space,
// if all heartbeats are within the tolerable heartbeat interval
for (DatanodeStorageInfo storage : pickupReplicaSet(first, second)) {
  if (!excessTypes.contains(storage.getStorageType())) {
    continue;
  }
```

How the set is chosen from first and second is shown below; it is very simple:

<code class="hljs applescript has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">  /**
   * Pick up replica node set for deleting replica as over-replicated.
   * First set contains replica nodes on rack with more than one
   * replica while second set contains remaining replica nodes.
   * So pick up first set if not empty. If first is empty, then pick second.
   */
  protected Collection<DatanodeStorageInfo> pickupReplicaSet(
      Collection<DatanodeStorageInfo> first,
      Collection<DatanodeStorageInfo> second) {
    return first.isEmpty() ? second : first;
  }</code>
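To make the two sets concrete, here is a minimal sketch of how replicas could be split by rack before `pickupReplicaSet` is applied. The `Replica` class and the `split` helper are hypothetical stand-ins written for illustration; they are not HDFS classes, and the real splitting code is not shown in this article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; Replica is a hypothetical stand-in, not an HDFS class. */
final class RackSplitSketch {

  /** A replica location: the node that holds it and that node's rack. */
  static final class Replica {
    final String node;
    final String rack;

    Replica(String node, String rack) {
      this.node = node;
      this.rack = rack;
    }
  }

  /**
   * Replicas on racks holding more than one copy go into moreThanOne;
   * replicas that are alone on their rack go into exactlyOne.
   */
  static void split(List<Replica> replicas,
      Collection<Replica> moreThanOne, Collection<Replica> exactlyOne) {
    Map<String, List<Replica>> rackMap = new HashMap<>();
    for (Replica r : replicas) {
      rackMap.computeIfAbsent(r.rack, k -> new ArrayList<>()).add(r);
    }
    for (List<Replica> onRack : rackMap.values()) {
      if (onRack.size() > 1) {
        moreThanOne.addAll(onRack);
      } else {
        exactlyOne.addAll(onRack);
      }
    }
  }

  /** Same rule as the excerpt: prefer the first set unless it is empty. */
  static Collection<Replica> pickupReplicaSet(
      Collection<Replica> first, Collection<Replica> second) {
    return first.isEmpty() ? second : first;
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1", "/rackA"),
        new Replica("dn2", "/rackA"),
        new Replica("dn3", "/rackB"));
    List<Replica> moreThanOne = new ArrayList<>();
    List<Replica> exactlyOne = new ArrayList<>();
    split(replicas, moreThanOne, exactlyOne);
    for (Replica r : pickupReplicaSet(moreThanOne, exactlyOne)) {
      System.out.println("deletion candidate: " + r.node);
    }
  }
}
```

Preferring the first set means a replica is removed from a rack that still keeps another copy, so rack-level redundancy is not reduced.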

On each iteration over the candidate node list, the following two metrics are compared:

<code class="hljs cpp has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">      // compare heartbeat times
      if (lastHeartbeat < oldestHeartbeat) {
        oldestHeartbeat = lastHeartbeat;
        oldestHeartbeatStorage = storage;
      }
      // compare free space
      if (minSpace > free) {
        minSpace = free;
        minSpaceStorage = storage;
      }</code>

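Then the choice is made: the storage with the oldest heartbeat is preferred, and the storage with the least free space is the fallback.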

<code class="hljs java has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">    final DatanodeStorageInfo storage;
    if (oldestHeartbeatStorage != null) {
      storage = oldestHeartbeatStorage;
    } else if (minSpaceStorage != null) {
      storage = minSpaceStorage;
    } else {
      return null;
    }</code>
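The comparison and selection steps above can be put together into one small routine. This is a sketch under stated assumptions: `Storage` is a hypothetical class, and the starting value of `oldestHeartbeat` is not shown in the excerpts, so it is passed in here as a made-up `staleBefore` parameter (only heartbeats older than that threshold can win on the heartbeat criterion).

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; Storage and staleBefore are assumptions, not HDFS names. */
final class ChooseReplicaSketch {

  static final class Storage {
    final String name;
    final long lastHeartbeat; // time of the last heartbeat, larger is more recent
    final long free;          // free space in bytes

    Storage(String name, long lastHeartbeat, long free) {
      this.name = name;
      this.lastHeartbeat = lastHeartbeat;
      this.free = free;
    }
  }

  /** Returns the storage whose replica should be deleted, or null if there are no candidates. */
  static Storage choose(List<Storage> candidates, long staleBefore) {
    long oldestHeartbeat = staleBefore; // assumed starting threshold
    Storage oldestHeartbeatStorage = null;
    long minSpace = Long.MAX_VALUE;
    Storage minSpaceStorage = null;

    for (Storage storage : candidates) {
      // compare heartbeat times
      if (storage.lastHeartbeat < oldestHeartbeat) {
        oldestHeartbeat = storage.lastHeartbeat;
        oldestHeartbeatStorage = storage;
      }
      // compare free space
      if (minSpace > storage.free) {
        minSpace = storage.free;
        minSpaceStorage = storage;
      }
    }

    // the oldest heartbeat wins; least free space is the fallback
    if (oldestHeartbeatStorage != null) {
      return oldestHeartbeatStorage;
    }
    return minSpaceStorage;
  }

  public static void main(String[] args) {
    List<Storage> candidates = Arrays.asList(
        new Storage("dn1", 100L, 50L),
        new Storage("dn2", 90L, 10L),
        new Storage("dn3", 200L, 5L));
    Storage chosen = choose(candidates, 95L);
    System.out.println("chosen: " + (chosen == null ? "none" : chosen.name));
  }
}
```

With a threshold that no node falls below, the heartbeat branch never fires and the choice comes down to free space alone.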

After that, the following two operations are performed:

<code class="hljs oxygene has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">      // readjust the relationships held in rackMap
      // adjust rackmap, moreThanOne, and exactlyOne
      replicator.adjustSetsWithChosenReplica(rackMap, moreThanOne,
          exactlyOne, cur);
      // remove the chosen node from the candidate node list
      nonExcess.remove(cur);</code>
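The body of `adjustSetsWithChosenReplica` is not shown in this article. The sketch below is my reading of what such an adjustment has to do, going only by the method name and its comment: once a replica is removed from a rack, a rack left with a single replica moves that survivor from `moreThanOne` to `exactlyOne`. Nodes are plain strings here and the rack is passed in explicitly, both of which are simplifications.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; a guess at the bookkeeping, not the HDFS implementation. */
final class AdjustSetsSketch {

  static void adjust(Map<String, List<String>> rackMap, List<String> moreThanOne,
      List<String> exactlyOne, String rack, String chosen) {
    List<String> onRack = rackMap.get(rack);
    onRack.remove(chosen);
    if (onRack.isEmpty()) {
      rackMap.remove(rack);
    }
    if (moreThanOne.remove(chosen)) {
      // the rack used to hold several replicas; check whether only one is left
      if (onRack.size() == 1) {
        String remaining = onRack.get(0);
        moreThanOne.remove(remaining);
        exactlyOne.add(remaining);
      }
    } else {
      // the chosen replica was the only one on its rack
      exactlyOne.remove(chosen);
    }
  }

  public static void main(String[] args) {
    Map<String, List<String>> rackMap = new HashMap<>();
    rackMap.put("/rackA", new ArrayList<>(Arrays.asList("dn1", "dn2")));
    rackMap.put("/rackB", new ArrayList<>(Arrays.asList("dn3")));
    List<String> moreThanOne = new ArrayList<>(Arrays.asList("dn1", "dn2"));
    List<String> exactlyOne = new ArrayList<>(Arrays.asList("dn3"));

    adjust(rackMap, moreThanOne, exactlyOne, "/rackA", "dn1");
    System.out.println("moreThanOne=" + moreThanOne + " exactlyOne=" + exactlyOne);
  }
}
```

Keeping the two sets current matters because the selection loop may run several times for one block, and each round has to see the rack layout left behind by the previous removal.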

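At this point, the node holding the excess replica has been selected.

Handling the excess replicas


Handling the excess replica is straightforward: the target block and the node that holds it are already known, so they only need to be added to the corresponding data structures.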

<code class="hljs scss has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">      // add to the excessReplicateMap
      addToExcessReplicate(cur.getDatanodeDescriptor(), b);

      //
      // The 'excessblocks' tracks blocks until we get confirmation
      // that the datanode has deleted them; the only way we remove them
      // is when we get a "removeBlock" message.
      //
      // The 'invalidate' list is used to inform the datanode the block
      // should be deleted.  Items are removed from the invalidate list
      // upon giving instructions to the namenode.
      //
      // add block b on this node to the invalidates list
      addToInvalidates(b, cur.getDatanodeDescriptor());</code>

Shortly after being added to the invalidates list, the block replica is deleted.
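The comment in the excerpt describes two structures with different lifetimes: the excess map keeps an entry until the datanode confirms the deletion, while the invalidate list is drained as soon as the delete instruction has been handed out. A minimal sketch of that bookkeeping follows; the class and method names are invented for illustration, and blocks and nodes are reduced to `long` IDs and strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical and not taken from HDFS. */
final class ExcessTrackingSketch {
  // node -> blocks considered excess until the node confirms deletion
  private final Map<String, Set<Long>> excess = new HashMap<>();
  // node -> blocks waiting for a delete instruction to be sent
  private final Map<String, Set<Long>> invalidates = new HashMap<>();

  /** Record a replica as excess and queue it for deletion on that node. */
  void markExcess(String node, long blockId) {
    excess.computeIfAbsent(node, k -> new LinkedHashSet<>()).add(blockId);
    invalidates.computeIfAbsent(node, k -> new LinkedHashSet<>()).add(blockId);
  }

  /** Hand out the pending delete instructions for a node; the queue is emptied. */
  List<Long> pollInvalidates(String node) {
    Set<Long> pending = invalidates.remove(node);
    List<Long> commands = new ArrayList<>();
    if (pending != null) {
      commands.addAll(pending);
    }
    return commands;
  }

  /** Called once the node reports that the replica is gone. */
  void confirmDeleted(String node, long blockId) {
    Set<Long> blocks = excess.get(node);
    if (blocks != null) {
      blocks.remove(blockId);
      if (blocks.isEmpty()) {
        excess.remove(node);
      }
    }
  }

  boolean isExcess(String node, long blockId) {
    Set<Long> blocks = excess.get(node);
    return blocks != null && blocks.contains(blockId);
  }

  public static void main(String[] args) {
    ExcessTrackingSketch tracker = new ExcessTrackingSketch();
    tracker.markExcess("dn1", 42L);
    System.out.println("to delete on dn1: " + tracker.pollInvalidates("dn1"));
    System.out.println("still excess: " + tracker.isExcess("dn1", 42L));
    tracker.confirmDeleted("dn1", 42L);
    System.out.println("still excess: " + tracker.isExcess("dn1", 42L));
  }
}
```

Separating the two lets the replica keep counting as excess during the window between sending the instruction and receiving the confirmation.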

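Call sites that trigger excess replica removal


Back to the three scenarios mentioned earlier that produce excess replicas. You may wonder how I found them: it is enough to look up where chooseExcessReplicates is called, as shown in the figure below.

From those 5 call sites I distilled 3 usage scenarios, which are examined one by one below.

Scenario 1: ReCommission (bringing a node back online)


The excess replica cleanup is invoked inside the method processOverReplicatedBlocksOnReCommission, which is in turn called from stopDecommission: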

<code class="hljs java has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">  /**
   * Stop decommissioning the specified datanode.
   * @param node
   */
  @VisibleForTesting
  public void stopDecommission(DatanodeDescriptor node) {
    if (node.isDecommissionInProgress() || node.isDecommissioned()) {
      // Update DN stats maintained by HeartbeatManager
      hbManager.stopDecommission(node);
      // Over-replicated blocks will be detected and processed when
      // the dead node comes back and send in its full block report.
      if (node.isAlive()) {
        blockManager.processOverReplicatedBlocksOnReCommission(node);
      }
      // Remove from tracking in DecommissionManager
      pendingNodes.remove(node);
      decomNodeBlocks.remove(node);
    } else {
      LOG.trace("stopDecommission: Node {} in {}, nothing to do." +
          node, node.getAdminState());
    }
  }</code>

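Bringing a node back from decommissioning necessarily means stopping the decommission that is in progress, which is why the call happens in this method.

Scenario 2: SetReplication (manually setting the replica count)


Manually setting the replication factor is a deliberate, user-driven trigger. The method directly involved is: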

<code class="hljs java has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">  /** Set replication for the blocks. */
  public void setReplication(final short oldRepl, final short newRepl,
      final String src, final Block... blocks) {
    if (newRepl == oldRepl) {
      return;
    }

    // update needReplication priority queues
    for(Block b : blocks) {
      updateNeededReplications(b, 0, newRepl-oldRepl);
    }
    // when the new replication factor is lower than the old one,
    // the excess replicas have to be removed
    if (oldRepl > newRepl) {
      // old replication > the new one; need to remove copies
      LOG.info("Decreasing replication from " + oldRepl + " to " + newRepl
          + " for " + src);
      for(Block b : blocks) {
        processOverReplicatedBlock(b, newRepl, null, null);
      }
    } else { // replication factor is increased
      LOG.info("Increasing replication from " + oldRepl + " to " + newRepl
          + " for " + src);
    }
  }</code>

This path can be triggered from outside by client programs calling the API.
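On the client side, a decrease like this is usually requested through `FileSystem#setReplication(Path, short)`. The arithmetic behind the resulting cleanup is simple, and the sketch below works it out for the earlier example of going from 3 replicas down to 1. The class and method names are hypothetical, and the per-block live replica counts are passed in as a plain array for illustration.

```java
/** Illustrative sketch only; not HDFS code. */
final class ReplicationChangeSketch {

  /** Number of replicas of one block that exceed the target replication factor. */
  static int excessReplicas(int liveReplicas, short newRepl) {
    return Math.max(0, liveReplicas - newRepl);
  }

  /**
   * Total replicas to remove across all blocks of a file.
   * Nothing is removed when the replication factor stays the same or grows.
   */
  static int replicasToRemove(short oldRepl, short newRepl, int[] liveReplicasPerBlock) {
    if (newRepl >= oldRepl) {
      return 0;
    }
    int total = 0;
    for (int live : liveReplicasPerBlock) {
      total += excessReplicas(live, newRepl);
    }
    return total;
  }

  public static void main(String[] args) {
    // a file with two blocks, each currently holding 3 live replicas
    int[] live = {3, 3};
    int toRemove = replicasToRemove((short) 3, (short) 1, live);
    System.out.println("replicas to remove: " + toRemove);
  }
}
```

Each block of the file is processed on its own, which matches the per-block loop over `processOverReplicatedBlock` in the method above.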

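Scenario 3: Newly added block records that were missed


Missing the information of newly added blocks can leave excess replicas in the cluster. The explanation in the source code reads: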

<code class="hljs applescript has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">  /*
   * Since the BlocksMapGset does not throw the ConcurrentModificationException
   * and supports further iteration after modification to list, there is a
   * chance of missing the newly added block while iterating. Since every
   * addition to blocksMap will check for mis-replication, missing mis-replication
   * check for new blocks will not be a problem.
   */</code>

Because block information may be missed, a separate thread rescans the blocks to check whether any of them are over-replicated.

<code class="hljs r has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;">  private void processMisReplicatesAsync() throws InterruptedException {
    ...
    while (namesystem.isRunning() && !Thread.currentThread().isInterrupted()) {
      int processed = 0;
      namesystem.writeLockInterruptibly();
      try {
        while (processed < numBlocksPerIteration && blocksItr.hasNext()) {
          BlockInfoContiguous block = blocksItr.next();
          MisReplicationResult res = processMisReplicatedBlock(block);
          ...</code>
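The visible part of the loop shows a batching pattern: take the write lock, check at most `numBlocksPerIteration` blocks, then start a new round. Since the rest of the method is elided, the sketch below fills the gaps with assumptions: a plain `ReentrantReadWriteLock` stands in for the namesystem lock, the lock is released in a `finally` block after each batch, and the per-block check is a caller-supplied callback. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

/** Illustrative sketch only; the lock handling and names are assumptions. */
final class BatchedScanSketch {

  /**
   * Checks every item, holding the write lock for at most perIteration items
   * at a time. Returns how many times the lock was acquired.
   */
  static <T> int scan(Iterator<T> items, int perIteration,
      ReentrantReadWriteLock lock, Consumer<? super T> check) {
    int lockAcquisitions = 0;
    while (items.hasNext() && !Thread.currentThread().isInterrupted()) {
      int processed = 0;
      try {
        lock.writeLock().lockInterruptibly();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // keep the interrupt visible to callers
        break;
      }
      lockAcquisitions++;
      try {
        while (processed < perIteration && items.hasNext()) {
          check.accept(items.next());
          processed++;
        }
      } finally {
        lock.writeLock().unlock(); // let other writers in between batches
      }
    }
    return lockAcquisitions;
  }

  public static void main(String[] args) {
    List<Integer> blocks = Arrays.asList(1, 2, 3, 4, 5);
    List<Integer> checked = new ArrayList<>();
    int rounds = scan(blocks.iterator(), 2, new ReentrantReadWriteLock(), checked::add);
    System.out.println("checked " + checked.size() + " blocks in " + rounds + " rounds");
  }
}
```

Working in bounded batches keeps any single lock hold short, so a full rescan does not block other operations for its whole duration.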

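Scenario 4: Checks in other situations


Other code paths occasionally call processOverReplicatedBlock as well, not because of an external trigger but as a precaution. For example, in addStoredBlock, when a newly added block is put into the blocksMap, the block is checked once more. The other case is when a file has finally been written completely: checkReplication is called once to confirm that the cluster holds no redundant copies of the same block. Both call sites appear in the figure above, so the code is not reproduced here. It shows how much care the HDFS designers put into the details.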
