Deep Dive into Java Collections: BitSet Source Code Analysis (JDK 18)
Contents
- Deep Dive into Java Collections: BitSet Source Code Analysis (JDK 18)
- Introduction
- Inheritance Hierarchy
- Storage Structure
- Source Code Analysis
- Fields
- Constructors
- set(int bitIndex)
- set(int bitIndex, boolean value)
- clear(int bitIndex)
- set(int fromIndex, int toIndex)
- clear(int fromIndex, int toIndex)
- set(int fromIndex, int toIndex, boolean value)
- flip(int bitIndex)
- flip(int fromIndex, int toIndex)
- cardinality()
- Summary
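Introduction
Java has no syntax such as new bit[] for creating an array of bits directly, so it provides BitSet as a bitmap implementation. Internally, BitSet stores the bitmap in an array of long values.
The first long in that array represents the 64 elements in [0, 63].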
Inheritance Hierarchy
BitSet implements Cloneable and Serializable.
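Storage Structure
BitSet implements the bitmap with a long array. Each long records whether 64 elements are present: if bit i (counted from the least significant bit) of the n-th long, words[n], is 1, then the value 64 * n + i is present in the set.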
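The mapping from a bit index to a word and a position inside that word can be checked from the outside with toLongArray(), which returns the packed words. The sketch below is only illustrative: the class WordLayoutDemo and its helpers wordIndex, bitOffset and packedWords are hypothetical names, not part of BitSet.

```java
import java.util.Arrays;
import java.util.BitSet;

final class WordLayoutDemo {
    // Hypothetical helper: index of the long that holds bitIndex (bitIndex / 64).
    static int wordIndex(int bitIndex) {
        return bitIndex >> 6;
    }

    // Hypothetical helper: position of the bit inside its long (bitIndex % 64).
    static int bitOffset(int bitIndex) {
        return bitIndex & 63;
    }

    // Hypothetical helper: sets the given bits and returns the packed words.
    static long[] packedWords(int... bitIndices) {
        BitSet bits = new BitSet();
        for (int index : bitIndices) {
            bits.set(index);
        }
        return bits.toLongArray();
    }

    public static void main(String[] args) {
        // 0 -> word 0, bit 0; 65 -> word 1, bit 1; 130 -> word 2, bit 2
        System.out.println(Arrays.toString(packedWords(0, 65, 130))); // [1, 2, 4]
    }
}
```

Each value lands in the word given by index / 64, at the bit given by index % 64, which is why the three words hold 1, 2 and 4.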
Source Code Analysis
Fields
/*
 * BitSets are packed into arrays of "words."  Currently a word is
 * a long, which consists of 64 bits, requiring 6 address bits.
 * The choice of word size is determined purely by performance concerns.
 */
private static final int ADDRESS_BITS_PER_WORD = 6;
// 1 << 6 is 64
private static final int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
private static final int BIT_INDEX_MASK = BITS_PER_WORD - 1;

/* Used to shift left or right for a partial word mask */
private static final long WORD_MASK = 0xffffffffffffffffL;

/**
 * @serialField bits long[]
 *
 * The bits in this BitSet.  The ith bit is stored in bits[i/64] at
 * bit position i % 64 (where bit position 0 refers to the least
 * significant bit and 63 refers to the most significant bit).
 */
@java.io.Serial
private static final ObjectStreamField[] serialPersistentFields = {
    new ObjectStreamField("bits", long[].class),
};

/**
 * The internal field corresponding to the serialField "bits".
 * BitSet uses a long array as its internal storage, so its capacity
 * is always a multiple of the size of a long (64 bits).
 */
private long[] words;

/**
 * The number of words in the logical size of this BitSet.
 */
private transient int wordsInUse = 0;

/**
 * Whether the size of "words" is user-specified.  If so, we assume
 * the user knows what he's doing and try harder to preserve it.
 */
private transient boolean sizeIsSticky = false;

/* use serialVersionUID from JDK 1.0.2 for interoperability */
@java.io.Serial
private static final long serialVersionUID = 7997698588986878753L;
Constructors
/**
 * Creates a new bit set. All bits are initially {@code false}.
 */
public BitSet() {
    // By default the long[] is allocated with length 1
    initWords(BITS_PER_WORD);
    sizeIsSticky = false;
}

/**
 * Creates a bit set whose initial size is large enough to explicitly
 * represent bits with indices in the range {@code 0} through
 * {@code nbits-1}. All bits are initially {@code false}.
 *
 * @param  nbits the initial size of the bit set
 * @throws NegativeArraySizeException if the specified initial size
 *         is negative
 */
public BitSet(int nbits) {
    // nbits can't be negative; size 0 is OK
    if (nbits < 0)
        throw new NegativeArraySizeException("nbits < 0: " + nbits);

    initWords(nbits);
    sizeIsSticky = true;
}

private void initWords(int nbits) {
    words = new long[wordIndex(nbits-1) + 1];
}

/**
 * Creates a bit set using words as the internal representation.
 * The last word (if there is one) must be non-zero.
 */
private BitSet(long[] words) {
    this.words = words;
    this.wordsInUse = words.length;
    checkInvariants();
}

// Equivalent to bitIndex / 64 for non-negative bitIndex
// (for -1 the arithmetic shift yields -1, so initWords(0) allocates 0 words)
private static int wordIndex(int bitIndex) {
    return bitIndex >> ADDRESS_BITS_PER_WORD;
}
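The effect of initWords can be observed through size(), which reports the allocated capacity in bits. The following is a minimal sketch that assumes a JDK whose BitSet matches the source shown above; InitialSizeDemo and allocatedBits are hypothetical names that only restate the formula (wordIndex(nbits - 1) + 1) * 64.

```java
import java.util.BitSet;

final class InitialSizeDemo {
    // Hypothetical helper: bits allocated for a requested nbits,
    // i.e. (wordIndex(nbits - 1) + 1) words of 64 bits each.
    static int allocatedBits(int nbits) {
        return (((nbits - 1) >> 6) + 1) * 64;
    }

    public static void main(String[] args) {
        System.out.println(new BitSet().size());      // 64: one word by default
        System.out.println(new BitSet(0).size());     // 0: empty array is allowed
        System.out.println(new BitSet(65).size());    // 128: rounded up to two words
        System.out.println(new BitSet(1024).size());  // 1024: exactly sixteen words
    }
}
```

The requested size is rounded up to the next multiple of 64, and a request of 0 produces an empty words array.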
set(int bitIndex)
Adds the value bitIndex to the BitSet, that is, sets the bit at index bitIndex to 1.
public void set(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    // Index of the long array element that holds bitIndex
    int wordIndex = wordIndex(bitIndex);
    expandTo(wordIndex);

    // Set the bit that corresponds to bitIndex to 1
    words[wordIndex] |= (1L << bitIndex); // Restores invariants

    checkInvariants();
}

// Grow the long array so that it can hold index wordIndex
private void expandTo(int wordIndex) {
    // Indices start at 0, so the required array length is wordIndex + 1
    int wordsRequired = wordIndex+1;
    if (wordsInUse < wordsRequired) {
        ensureCapacity(wordsRequired);
        wordsInUse = wordsRequired;
    }
}

// Ensure that the BitSet has enough capacity
private void ensureCapacity(int wordsRequired) {
    if (words.length < wordsRequired) {
        // Allocate larger of doubled size or required size
        int request = Math.max(2 * words.length, wordsRequired);
        words = Arrays.copyOf(words, request);
        sizeIsSticky = false;
    }
}

// java.util.Arrays: create a new array of length newLength, copy the
// elements of the original array into it, and return the new array
public static long[] copyOf(long[] original, int newLength) {
    long[] copy = new long[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}
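The growth rule in ensureCapacity (the larger of double the current length and the required length) can be traced with size(). This is a sketch under the assumption that the running JDK uses the same ensureCapacity logic as the source above; GrowthDemo and nextWordCount are hypothetical names.

```java
import java.util.BitSet;

final class GrowthDemo {
    // Hypothetical helper mirroring ensureCapacity: new array length in words.
    static int nextWordCount(int currentWords, int wordsRequired) {
        return currentWords < wordsRequired
                ? Math.max(2 * currentWords, wordsRequired)
                : currentWords;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();       // starts with 1 word
        System.out.println(bits.size());  // 64
        bits.set(70);                     // needs 2 words
        System.out.println(bits.size());  // 128
        bits.set(130);                    // needs 3 words, doubling gives 4
        System.out.println(bits.size());  // 256
        bits.set(1000);                   // needs 16 words, more than double
        System.out.println(bits.size());  // 1024
    }
}
```

Doubling keeps repeated small extensions cheap, while the max with wordsRequired lets one large index jump straight to the needed length.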
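Note that 1L << bitIndex needs no explicit bitIndex % 64. Java's left shift is not a circular shift (bits shifted out are lost); instead, the shift distance itself is masked. For an int left operand only the low 5 bits of the distance are used, and for a long only the low 6 bits. For example, 1 << 33 is equivalent to 1 << 1, whose value is 2, and 1L << bitIndex is equivalent to 1L << (bitIndex % 64) for a non-negative bitIndex.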
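The shift behavior that set relies on can be verified directly. In this small sketch, ShiftDistanceDemo and bitMask are hypothetical names; the last two lines contrast a shift with a true rotation via Integer.rotateLeft.

```java
final class ShiftDistanceDemo {
    // Hypothetical helper: the single-bit mask that set(bitIndex) ORs into a word.
    static long bitMask(int bitIndex) {
        return 1L << bitIndex; // distance is implicitly bitIndex & 63
    }

    public static void main(String[] args) {
        // int shifts use only the low 5 bits of the distance: 33 & 31 == 1
        System.out.println(1 << 33);                                   // 2
        // long shifts use only the low 6 bits of the distance: 65 & 63 == 1
        System.out.println(1L << 65);                                  // 2
        // 33 is within range for a long, so nothing is masked away
        System.out.println(1L << 33);                                  // 8589934592
        // A shift discards the top bit, a rotation wraps it around
        System.out.println(Integer.MIN_VALUE << 1);                    // 0
        System.out.println(Integer.rotateLeft(Integer.MIN_VALUE, 1));  // 1
    }
}
```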
set(int bitIndex, boolean value)
If value is true, sets the bit at bitIndex to 1; otherwise sets it to 0.
public void set(int bitIndex, boolean value) {
    if (value)
        set(bitIndex);
    else
        clear(bitIndex);
}
clear(int bitIndex)
Sets the bit at bitIndex to 0.
public void clear(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    // Index of the long array element that holds bitIndex
    int wordIndex = wordIndex(bitIndex);
    if (wordIndex >= wordsInUse)
        return;

    // 1L << bitIndex selects the bit; ANDing with its complement clears it
    words[wordIndex] &= ~(1L << bitIndex);

    recalculateWordsInUse();
    checkInvariants();
}

// Recompute wordsInUse
private void recalculateWordsInUse() {
    // Traverse the bitset until a used word is found
    int i;
    for (i = wordsInUse-1; i >= 0; i--)
        if (words[i] != 0)
            break;

    wordsInUse = i+1; // The new logical size
}

private void checkInvariants() {
    assert(wordsInUse == 0 || words[wordsInUse - 1] != 0);
    assert(wordsInUse >= 0 && wordsInUse <= words.length);
    assert(wordsInUse == words.length || words[wordsInUse] == 0);
}
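The difference between the logical size (tracked by wordsInUse and visible through length()) and the allocated size (visible through size()) shows up clearly when bits are cleared. A minimal sketch, assuming the implementation shown above; ClearDemo and lengthAfterClear are hypothetical names.

```java
import java.util.BitSet;

final class ClearDemo {
    // Hypothetical helper: length() after setting two bits and clearing one.
    static int lengthAfterClear(int low, int high, int toClear) {
        BitSet bits = new BitSet();
        bits.set(low);
        bits.set(high);
        bits.clear(toClear);
        return bits.length();
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(3);
        bits.set(100);
        System.out.println(bits.length());  // 101: highest set bit + 1
        bits.clear(100);
        System.out.println(bits.length());  // 4: the logical size shrinks
        System.out.println(bits.size());    // 128: allocated words are kept
        bits.clear(5000);                   // beyond wordsInUse: returns early
        System.out.println(bits.size());    // 128: clearing never grows the array
    }
}
```

Clearing an index beyond the words in use is a no-op, so clear never allocates memory.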
set(int fromIndex, int toIndex)
Sets the bits in [fromIndex, toIndex) to 1.
public void set(int fromIndex, int toIndex) {
    checkRange(fromIndex, toIndex);

    if (fromIndex == toIndex)
        return;

    // Increase capacity if necessary
    int startWordIndex = wordIndex(fromIndex);
    int endWordIndex   = wordIndex(toIndex - 1);
    expandTo(endWordIndex);

    long firstWordMask = WORD_MASK << fromIndex;
    // WORD_MASK >>> -toIndex is equivalent to WORD_MASK >>> (64 - toIndex % 64),
    // because only the low 6 bits of the shift distance are used
    long lastWordMask  = WORD_MASK >>> -toIndex;
    if (startWordIndex == endWordIndex) {
        // Case 1: One word
        // fromIndex and toIndex - 1 fall in the same word
        words[startWordIndex] |= (firstWordMask & lastWordMask);
    } else {
        // Case 2: Multiple words
        // Handle first word
        words[startWordIndex] |= firstWordMask;

        // Handle intermediate words, if any
        for (int i = startWordIndex+1; i < endWordIndex; i++)
            words[i] = WORD_MASK;

        // Handle last word (restores invariants)
        words[endWordIndex] |= lastWordMask;
    }

    checkInvariants();
}
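- Validate the arguments;
- Compute the indices of the words that contain fromIndex and toIndex - 1;
- If both fall in the same word, set the bits in [fromIndex, toIndex) to 1 using the intersection of the two masks;
- Otherwise, proceed in three steps:
  - handle the first word;
  - set every word strictly between the first and the last to all ones;
  - handle the last word.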
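The steps above can be sketched on a plain long[] to make the two masks concrete. This is a simplified illustration, not the JDK code: RangeMaskDemo and setRange are hypothetical names, and the sketch assumes a pre-sized array and omits the range checks and array growth that BitSet performs.

```java
import java.util.Arrays;
import java.util.BitSet;

final class RangeMaskDemo {
    private static final long WORD_MASK = 0xffffffffffffffffL;

    // Hypothetical helper: sets bits [from, to) in a pre-sized long[].
    static void setRange(long[] words, int from, int to) {
        if (from == to) {
            return;
        }
        int start = from >> 6;
        int end = (to - 1) >> 6;
        long firstWordMask = WORD_MASK << from;  // ones from bit (from % 64) upward
        long lastWordMask = WORD_MASK >>> -to;   // ones below bit (to % 64), or all ones
        if (start == end) {
            // Single word: only the overlap of both masks is in range
            words[start] |= (firstWordMask & lastWordMask);
        } else {
            words[start] |= firstWordMask;
            for (int i = start + 1; i < end; i++) {
                words[i] = WORD_MASK;
            }
            words[end] |= lastWordMask;
        }
    }

    public static void main(String[] args) {
        long[] words = new long[3];
        setRange(words, 60, 130);

        BitSet expected = new BitSet();
        expected.set(60, 130);
        System.out.println(Arrays.equals(words, expected.toLongArray())); // true
    }
}
```

For the range [60, 130), the first mask covers bits 60 to 63 of word 0, word 1 becomes all ones, and the last mask covers bits 0 and 1 of word 2.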
clear(int fromIndex, int toIndex)
Sets the bits in [fromIndex, toIndex) to 0.
public void clear(int fromIndex, int toIndex) {
    checkRange(fromIndex, toIndex);

    if (fromIndex == toIndex)
        return;

    int startWordIndex = wordIndex(fromIndex);
    if (startWordIndex >= wordsInUse)
        return;

    int endWordIndex = wordIndex(toIndex - 1);
    if (endWordIndex >= wordsInUse) {
        toIndex = length();
        endWordIndex = wordsInUse - 1;
    }

    long firstWordMask = WORD_MASK << fromIndex;
    // WORD_MASK >>> -toIndex is equivalent to WORD_MASK >>> (64 - toIndex % 64),
    // because only the low 6 bits of the shift distance are used
    long lastWordMask  = WORD_MASK >>> -toIndex;
    if (startWordIndex == endWordIndex) {
        // Case 1: One word
        // Same as set(fromIndex, toIndex), except that the mask is inverted
        words[startWordIndex] &= ~(firstWordMask & lastWordMask);
    } else {
        // Case 2: Multiple words
        // Handle first word
        words[startWordIndex] &= ~firstWordMask;

        // Handle intermediate words, if any
        for (int i = startWordIndex+1; i < endWordIndex; i++)
            words[i] = 0;

        // Handle last word
        words[endWordIndex] &= ~lastWordMask;
    }

    recalculateWordsInUse();
    checkInvariants();
}
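One detail worth seeing in action is that a range reaching past the logical end is truncated to length() before any masks are computed, so a huge toIndex does not grow the array. A minimal sketch, assuming the implementation shown above; RangeClearDemo and clearTail are hypothetical names.

```java
import java.util.BitSet;

final class RangeClearDemo {
    // Hypothetical helper: sets [0, setTo), then clears [clearFrom, clearTo).
    static BitSet clearTail(int setTo, int clearFrom, int clearTo) {
        BitSet bits = new BitSet();
        bits.set(0, setTo);
        bits.clear(clearFrom, clearTo);
        return bits;
    }

    public static void main(String[] args) {
        // The cleared range extends far beyond the highest set bit
        BitSet bits = clearTail(10, 5, 1_000_000);
        System.out.println(bits);         // {0, 1, 2, 3, 4}
        System.out.println(bits.size());  // 64: still a single word
    }
}
```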
set(int fromIndex, int toIndex, boolean value)
If value is true, sets the bits in [fromIndex, toIndex) to 1; otherwise sets them to 0.
public void set(int fromIndex, int toIndex, boolean value) {
    if (value)
        set(fromIndex, toIndex);
    else
        clear(fromIndex, toIndex);
}
flip(int bitIndex)
Flips the bit at index bitIndex.
public void flip(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    int wordIndex = wordIndex(bitIndex);
    expandTo(wordIndex);

    // XOR toggles the bit at bitIndex
    words[wordIndex] ^= (1L << bitIndex);

    recalculateWordsInUse();
    checkInvariants();
}
flip(int fromIndex, int toIndex)
Flips the bits in [fromIndex, toIndex).
public void flip(int fromIndex, int toIndex) {
    checkRange(fromIndex, toIndex);

    if (fromIndex == toIndex)
        return;

    int startWordIndex = wordIndex(fromIndex);
    int endWordIndex   = wordIndex(toIndex - 1);
    expandTo(endWordIndex);

    long firstWordMask = WORD_MASK << fromIndex;
    long lastWordMask  = WORD_MASK >>> -toIndex;
    if (startWordIndex == endWordIndex) {
        // Case 1: One word
        words[startWordIndex] ^= (firstWordMask & lastWordMask);
    } else {
        // Case 2: Multiple words
        // Handle first word
        words[startWordIndex] ^= firstWordMask;

        // Handle intermediate words, if any
        for (int i = startWordIndex+1; i < endWordIndex; i++)
            words[i] ^= WORD_MASK;

        // Handle last word
        words[endWordIndex] ^= lastWordMask;
    }

    recalculateWordsInUse();
    checkInvariants();
}
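Because flipping is an XOR with a mask, applying the same flip twice restores the original bits. The sketch below illustrates this through the public API; FlipDemo and flippedCopy are hypothetical names.

```java
import java.util.BitSet;

final class FlipDemo {
    // Hypothetical helper: returns a copy of source with [from, to) flipped.
    static BitSet flippedCopy(BitSet source, int from, int to) {
        BitSet copy = (BitSet) source.clone();
        copy.flip(from, to);
        return copy;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(2);

        BitSet once = flippedCopy(bits, 0, 5);
        System.out.println(once);           // {0, 1, 3, 4}

        BitSet twice = flippedCopy(once, 0, 5);
        System.out.println(twice);          // {2}

        twice.flip(2);                      // toggles the only set bit off
        System.out.println(twice.length()); // 0
    }
}
```

The final flip leaves no set bit, which is why flip has to call recalculateWordsInUse just as clear does.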
cardinality()
Counts the number of bits set to 1 in the whole BitSet.
public int cardinality() {
    int sum = 0;
    for (int i = 0; i < wordsInUse; i++)
        sum += Long.bitCount(words[i]);
    return sum;
}

// java.lang.Long
@IntrinsicCandidate
public static int bitCount(long i) {
    // HD, Figure 5-2
    i = i - ((i >>> 1) & 0x5555555555555555L);
    i = (i & 0x3333333333333333L) + ((i >>> 2) & 0x3333333333333333L);
    i = (i + (i >>> 4)) & 0x0f0f0f0f0f0f0f0fL;
    i = i + (i >>> 8);
    i = i + (i >>> 16);
    i = i + (i >>> 32);
    return (int)i & 0x7f;
}
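To build some confidence in the branch-free bitCount above, it can be compared against a straightforward loop. In this sketch, BitCountDemo, naiveBitCount and cardinalityOf are hypothetical names; the naive version is a reference implementation for illustration, not what the JDK uses.

```java
final class BitCountDemo {
    // Hypothetical reference: clear the lowest set bit until none remain.
    static int naiveBitCount(long value) {
        int count = 0;
        while (value != 0) {
            value &= value - 1; // drops the lowest set bit
            count++;
        }
        return count;
    }

    // Hypothetical helper: sums the per-word counts, as cardinality() does.
    static int cardinalityOf(long[] words) {
        int sum = 0;
        for (long word : words) {
            sum += Long.bitCount(word);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, 0x5555555555555555L, Long.MIN_VALUE};
        for (long sample : samples) {
            // Both columns agree: 0, 1, 64, 32, 1
            System.out.println(naiveBitCount(sample) + " " + Long.bitCount(sample));
        }
        System.out.println(cardinalityOf(samples)); // 98
    }
}
```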
Summary
BitSet's implementation relies heavily on bitwise operations, which makes it fast, and because it is a bitmap that stores one bit per element, it is also compact.
The following compares the memory footprint of a boolean array and a BitSet, each holding 1024 flags, measured with JOL (Java Object Layout):
import java.util.BitSet;

import org.openjdk.jol.info.ClassLayout;
import org.openjdk.jol.info.GraphLayout;

public class ClassLayoutTest {
    public static void main(String[] args) {
        boolean[] bits = new boolean[1024];
        System.out.println(ClassLayout.parseInstance(bits).toPrintable());

        BitSet bitSet = new BitSet(1024);
        System.out.println(GraphLayout.parseInstance(bitSet).toPrintable());
    }
}

[Z object internals:
 OFFSET  SIZE      TYPE DESCRIPTION        VALUE
      0     4           (object header)    01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4           (object header)    00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4           (object header)    05 00 00 f8 (00000101 00000000 00000000 11111000) (-134217723)
     12     4           (object header)    00 04 00 00 (00000000 00000100 00000000 00000000) (1024)
     16  1024   boolean [Z.<elements>      N/A
Instance size: 1040 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

java.util.BitSet@15615099d object externals:
    ADDRESS  SIZE  TYPE              PATH    VALUE
  716ccf4f8    24  java.util.BitSet          (object)
  716ccf510   144  [J                .words  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

As the output shows, the boolean array occupies 1040 bytes, whereas the BitSet occupies 24 + 144 = 168 bytes, roughly one sixth of the space for the same number of flags.