死磕Java集合之BitSet源码分析(JDK18)

文章目录

  • 死磕Java集合之BitSet源码分析(JDK18)
    • 简介
    • 继承体系
    • 存储结构
    • 源码解析
      • 属性
      • 构造方法
      • set(int bitIndex)
      • set(int bitIndex, boolean value)
      • clear(int bitIndex)
      • set(int fromIndex, int toIndex)
      • clear(int fromIndex, int toIndex)
      • set(int fromIndex, int toIndex, boolean value)
      • flip(int bitIndex)
      • flip(int fromIndex, int toIndex)
      • cardinality()
    • 总结

简介

因为Java中没有new bit[]这种直接创建一个bit数组的方式,所以Java提供了BitSet来实现位图,BitSet是采用一个long型的数组来实现位图的。

BitSet的首个long型数组表示的是[0, 63]这64个元素。

继承体系

BitSet实现了Cloneable和Serializable

存储结构

BitSet通过long型的数组来实现位图,一个long型元素可以表示64个元素是否存在(第i位为1,表示64 * n + i已存在)。

源码解析

属性

/** BitSets are packed into arrays of "words."  Currently a word is* a long, which consists of 64 bits, requiring 6 address bits.* The choice of word size is determined purely by performance concerns.*/
private static final int ADDRESS_BITS_PER_WORD = 6;
// 1 << 6的大小是64
private static final int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
private static final int BIT_INDEX_MASK = BITS_PER_WORD - 1;/* Used to shift left or right for a partial word mask */
private static final long WORD_MASK = 0xffffffffffffffffL;/*** @serialField bits long[]** The bits in this BitSet.  The ith bit is stored in bits[i/64] at* bit position i % 64 (where bit position 0 refers to the least* significant bit and 63 refers to the most significant bit).*/
@java.io.Serial
private static final ObjectStreamField[] serialPersistentFields = {new ObjectStreamField("bits", long[].class),
};/*** The internal field corresponding to the serialField "bits".* BitSet的底层实现是使用long数组作为内部存储结构的,所以BitSet的大小为long类型大小(64位)的整数倍。*/
private long[] words;/*** The number of words in the logical size of this BitSet.*/
private transient int wordsInUse = 0;/*** Whether the size of "words" is user-specified.  If so, we assume* the user knows what he's doing and try harder to preserve it.*/
private transient boolean sizeIsSticky = false;/* use serialVersionUID from JDK 1.0.2 for interoperability */
@java.io.Serial
private static final long serialVersionUID = 7997698588986878753L;

构造方法

/*** Creates a new bit set. All bits are initially {@code false}.*/
public BitSet() {// 默认初始化long[]的长度为1initWords(BITS_PER_WORD);sizeIsSticky = false;
}/*** Creates a bit set whose initial size is large enough to explicitly* represent bits with indices in the range {@code 0} through* {@code nbits-1}. All bits are initially {@code false}.** @param  nbits the initial size of the bit set* @throws NegativeArraySizeException if the specified initial size*         is negative*/
public BitSet(int nbits) {// nbits can't be negative; size 0 is OKif (nbits < 0)throw new NegativeArraySizeException("nbits < 0: " + nbits);initWords(nbits);sizeIsSticky = true;
}private void initWords(int nbits) {words = new long[wordIndex(nbits-1) + 1];
}/*** Creates a bit set using words as the internal representation.* The last word (if there is one) must be non-zero.*/
private BitSet(long[] words) {this.words = words;this.wordsInUse = words.length;checkInvariants();
}// 相当于bitIndex / 64
private static int wordIndex(int bitIndex) {return bitIndex >> ADDRESS_BITS_PER_WORD;
}

set(int bitIndex)

将值bitIndex加入BitSet,即索引为bitIndex的元素比特位置为1

public void set(int bitIndex) {if (bitIndex < 0)throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);// 获取bitIndex所在long数组元素的索引int wordIndex = wordIndex(bitIndex);expandTo(wordIndex);// 将对应bitIndex的比特位置为1  words[wordIndex] |= (1L << bitIndex); // Restores invariantscheckInvariants();
}// 将long数组扩展到可以容纳索引wordIndex
private void expandTo(int wordIndex) {// 索引从0开始,所以需要的数组长度为wordIndex+1int wordsRequired = wordIndex+1;if (wordsInUse < wordsRequired) {ensureCapacity(wordsRequired);wordsInUse = wordsRequired;}
}// 确保BitSet的容量足够
private void ensureCapacity(int wordsRequired) {if (words.length < wordsRequired) {// Allocate larger of doubled size or required sizeint request = Math.max(2 * words.length, wordsRequired);words = Arrays.copyOf(words, request);sizeIsSticky = false;}
}// 新建一个newLength长度的数组,并将原数组的元素复制到新数组中,并返回新数组
public static long[] copyOf(long[] original, int newLength) {long[] copy = new long[newLength];System.arraycopy(original, 0, copy, 0,Math.min(original.length, newLength));return copy;
}

Java中的左移是循环位移,例如1 << 33相当于1 << 1值为2

set(int bitIndex, boolean value)

如果value为真,设置bitIndex位的值为1,否则则置为0

public void set(int bitIndex, boolean value) {if (value)set(bitIndex);elseclear(bitIndex);
}

clear(int bitIndex)

将bitIndex位的值置为0

public void clear(int bitIndex) {if (bitIndex < 0)throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);// 获取bitIndex所在的long数组的索引int wordIndex = wordIndex(bitIndex);if (wordIndex >= wordsInUse)return;// 1L << bitIndex找到对应的比特位,取反后再&,就将对应比特位置为0了words[wordIndex] &= ~(1L << bitIndex);recalculateWordsInUse();checkInvariants();
}// 重新计算wordsInUse
private void recalculateWordsInUse() {// Traverse the bitset until a used word is foundint i;for (i = wordsInUse-1; i >= 0; i--)if (words[i] != 0)break;wordsInUse = i+1; // The new logical size
}private void checkInvariants() {assert(wordsInUse == 0 || words[wordsInUse - 1] != 0);assert(wordsInUse >= 0 && wordsInUse <= words.length);assert(wordsInUse == words.length || words[wordsInUse] == 0);
}

set(int fromIndex, int toIndex)

将[fromIndex, toIndex)的比特位置为1

public void set(int fromIndex, int toIndex) {checkRange(fromIndex, toIndex);if (fromIndex == toIndex)return;// Increase capacity if necessaryint startWordIndex = wordIndex(fromIndex);int endWordIndex   = wordIndex(toIndex - 1);expandTo(endWordIndex);long firstWordMask = WORD_MASK << fromIndex;// 这里的WORD_MASK >>> -toIndex等价于WORD_MASK >>> 64-toIndexlong lastWordMask  = WORD_MASK >>> -toIndex;if (startWordIndex == endWordIndex) {// Case 1: One word// 第1种情况:fromIndex和toIndex都在同一个word中words[startWordIndex] |= (firstWordMask & lastWordMask);} else {// Case 2: Multiple words// Handle first wordwords[startWordIndex] |= firstWordMask;// Handle intermediate words, if anyfor (int i = startWordIndex+1; i < endWordIndex; i++)words[i] = WORD_MASK;// Handle last word (restores invariants)words[endWordIndex] |= lastWordMask;}checkInvariants();
}
  1. 检查参数是否合法;
  2. 分别获取fromIndex和toIndex所在的word的索引;
  3. 如果fromIndex和toIndex在同一个word中,则将[fromIndex, toIndex)的比特位置为1;
  4. 如果不在,则分三步处理:
    1. 处理第1个word;
    2. 处理第2个到最后1个之间的word,全置为1;
    3. 处理最后1个word。

clear(int fromIndex, int toIndex)

将[fromIndex, toIndex)的比特位置为0

public void clear(int fromIndex, int toIndex) {checkRange(fromIndex, toIndex);if (fromIndex == toIndex)return;int startWordIndex = wordIndex(fromIndex);if (startWordIndex >= wordsInUse)return;int endWordIndex = wordIndex(toIndex - 1);if (endWordIndex >= wordsInUse) {toIndex = length();endWordIndex = wordsInUse - 1;}long firstWordMask = WORD_MASK << fromIndex;// 这里的WORD_MASK >>> -toIndex等价于WORD_MASK >>> 64-toIndexlong lastWordMask  = WORD_MASK >>> -toIndex;if (startWordIndex == endWordIndex) {// Case 1: One word// 这里与set(fromIndex,toIndex)类似,只是后面进行了取反操作words[startWordIndex] &= ~(firstWordMask & lastWordMask);} else {// Case 2: Multiple words// Handle first wordwords[startWordIndex] &= ~firstWordMask;// Handle intermediate words, if anyfor (int i = startWordIndex+1; i < endWordIndex; i++)words[i] = 0;// Handle last wordwords[endWordIndex] &= ~lastWordMask;}recalculateWordsInUse();checkInvariants();
}

set(int fromIndex, int toIndex, boolean value)

如果value为真,将[fromIndex, toIndex)的比特位置为1,否则置为0

public void set(int fromIndex, int toIndex, boolean value) {if (value)set(fromIndex, toIndex);elseclear(fromIndex, toIndex);
}

flip(int bitIndex)

翻转bitIndex索引上的比特值

public void flip(int bitIndex) {if (bitIndex < 0)throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);int wordIndex = wordIndex(bitIndex);expandTo(wordIndex);// 对bitIndex上的比特值做异或操作words[wordIndex] ^= (1L << bitIndex);recalculateWordsInUse();checkInvariants();
}

flip(int fromIndex, int toIndex)

翻转[fromIndex, toIndex)的比特位

public void flip(int fromIndex, int toIndex) {checkRange(fromIndex, toIndex);if (fromIndex == toIndex)return;int startWordIndex = wordIndex(fromIndex);int endWordIndex   = wordIndex(toIndex - 1);expandTo(endWordIndex);long firstWordMask = WORD_MASK << fromIndex;long lastWordMask  = WORD_MASK >>> -toIndex;if (startWordIndex == endWordIndex) {// Case 1: One wordwords[startWordIndex] ^= (firstWordMask & lastWordMask);} else {// Case 2: Multiple words// Handle first wordwords[startWordIndex] ^= firstWordMask;// Handle intermediate words, if anyfor (int i = startWordIndex+1; i < endWordIndex; i++)words[i] ^= WORD_MASK;// Handle last wordwords[endWordIndex] ^= lastWordMask;}recalculateWordsInUse();checkInvariants();
}

cardinality()

计算整个BitSet中比特位为1的个数

public int cardinality() {int sum = 0;for (int i = 0; i < wordsInUse; i++)sum += Long.bitCount(words[i]);return sum;
}@IntrinsicCandidate
public static int bitCount(long i) {// HD, Figure 5-2i = i - ((i >>> 1) & 0x5555555555555555L);i = (i & 0x3333333333333333L) + ((i >>> 2) & 0x3333333333333333L);i = (i + (i >>> 4)) & 0x0f0f0f0f0f0f0f0fL;i = i + (i >>> 8);i = i + (i >>> 16);i = i + (i >>> 32);return (int)i & 0x7f;
}

总结

BitSet中的实现充分利用了位运算,速度很快,因为是位图,所以占用空间也比较小。

下面是boolean数组和BitSet的空间占用对比

public class ClassLayoutTest {public static void main(String[] args) {boolean[] bits = new boolean[1024];System.out.println(ClassLayout.parseInstance(bits).toPrintable());BitSet bitSet = new BitSet(1024);System.out.println(GraphLayout.parseInstance(bitSet).toPrintable());}
}[Z object internals:OFFSET  SIZE      TYPE DESCRIPTION                               VALUE0     4           (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)4     4           (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)8     4           (object header)                           05 00 00 f8 (00000101 00000000 00000000 11111000) (-134217723)12     4           (object header)                           00 04 00 00 (00000000 00000100 00000000 00000000) (1024)16  1024   boolean [Z.<elements>                             N/A
Instance size: 1040 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes totaljava.util.BitSet@15615099d object externals:ADDRESS       SIZE TYPE             PATH                           VALUE716ccf4f8         24 java.util.BitSet                                (object)716ccf510        144 [J               .words                         [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

可以看出boolean数组,占用的空间是1040 bytes,BitSet占用的空间是24 + 144 = 168 bytes。

死磕Java集合之BitSet源码分析(JDK18)相关推荐

  1. 死磕 java集合之ArrayDeque源码分析

    问题 (1)什么是双端队列? (2)ArrayDeque是怎么实现双端队列的? (3)ArrayDeque是线程安全的吗? (4)ArrayDeque是有界的吗? 简介 双端队列是一种特殊的队列,它的 ...

  2. 【死磕 Java 集合】— LinkedTransferQueue源码分析

    [死磕 Java 集合]- LinkedTransferQueue源码分析 问题 (1)LinkedTransferQueue是什么东东? (2)LinkedTransferQueue是怎么实现阻塞队 ...

  3. java arraydeque_死磕 java集合之ArrayDeque源码分析

    问题 (1)什么是双端队列? (2)ArrayDeque是怎么实现双端队列的? (3)ArrayDeque是线程安全的吗? (4)ArrayDeque是有界的吗? 简介 双端队列是一种特殊的队列,它的 ...

  4. java ee是什么_死磕 java集合之HashSet源码分析

    问题 (1)集合(Collection)和集合(Set)有什么区别? (2)HashSet怎么保证添加元素不重复? (3)HashSet是否允许null元素? (4)HashSet是有序的吗? (5) ...

  5. 死磕 java集合之ConcurrentSkipListMap源码分析——发现个bug

    前情提要 点击链接查看"跳表"详细介绍. 拜托,面试别再问我跳表了! 简介 跳表是一个随机化的数据结构,实质就是一种可以进行二分查找的有序链表. 跳表在原有的有序链表上面增加了多级 ...

  6. java arraylist_死磕 java集合之ArrayList源码分析

    简介 ArrayList是一种以数组实现的List,与数组相比,它具有动态扩展的能力,因此也可称之为动态数组. 继承体系 ArrayList实现了List, RandomAccess, Cloneab ...

  7. 死磕 java集合之TreeMap源码分析(一)——红黑树全解析

    欢迎关注我的公众号"彤哥读源码",查看更多源码系列文章, 与彤哥一起畅游源码的海洋. 简介 TreeMap使用红黑树存储元素,可以保证元素按key值的大小进行遍历. 继承体系 Tr ...

  8. copyof java_死磕 java集合之CopyOnWriteArrayList源码分析

    简介 CopyOnWriteArrayList是ArrayList的线程安全版本,内部也是通过数组实现,每次对数组的修改都完全拷贝一份新的数组来修改,修改完了再替换掉老数组,这样保证了只阻塞写操作,不 ...

  9. 死磕 java集合之TreeMap源码分析(一)- 内含红黑树分析全过程

    欢迎关注我的公众号"彤哥读源码",查看更多源码系列文章, 与彤哥一起畅游源码的海洋. 简介 TreeMap使用红黑树存储元素,可以保证元素按key值的大小进行遍历. 继承体系 Tr ...

最新文章

  1. 奖学金申请 | 2019年清华-青岛数据科学研究院​“RONG”奖学金申请通知
  2. confluence7安全补丁_centos7安装confluence遇到的问题
  3. Java解码vhd的磁盘文件,VHD Java library
  4. Linux进程地址空间与进程内存布局详解,内核空间与用户空间
  5. asp使用mysql5.0_ASP使用MYSQL数据库全攻略
  6. RTSP、RTMP和HTTP协议的区别
  7. 【翻译】在Ext JS和Sencha Touch中创建自己定义布局
  8. Grid game CodeForces - 1104C 放格子|思维|找规律
  9. python比特币挖矿_比特币如何挖矿(挖矿原理)-工作量证明
  10. Khan_Computer Science_Algorithms
  11. unity打包IOS填坑1
  12. 【TWVRP】基于matalb粒子群算法求解带时间窗的车辆路径规划问题【含Matlab源码 1272期】
  13. 经济学中的同比环比,负增长,正增长
  14. 微信表情符号写入判决书,你发的OK、炸弹都可能成为“呈堂证供”
  15. 【前端笔记】Ant Design Form组件 resetFields() 与 setFieldsValue() 之比较
  16. 互联网日报 | 1月19日 星期二 | 腾讯音乐全资收购懒人听书;字节跳动整合硬件业务专注教育硬件;PSA与FCA正式完成合并...
  17. 流言粉碎机:JAVA使用 try catch会影响性能
  18. c语言欺凌,《中国校园欺凌调查报告》发布 语言欺凌占主导
  19. 雷达编程实战之信号处理流程
  20. 蠕虫病毒疯狂传播如何预防

热门文章

  1. word编辑过程中突然发现后面好几页消失不见
  2. 功能强大的图像标注抠图工具
  3. 工程师姓什么很重要!别再叫我 “X 工”!!!
  4. java高并发案例详细讲解
  5. 2021年烟花爆竹产品涉药考试题库及烟花爆竹产品涉药考试试卷
  6. app store账号改变地区
  7. Ubuntu下查看进程PID 终止进程方法汇总
  8. 高校计算机科学技术排名出炉,高校计算机科学技术排名出炉,上交大“无缘”A+,吉大表现亮眼...
  9. linux使用find命令快速查找文件
  10. 用vmware gsx做WINDOWS2000 / windows 2003 的群集