First, let's look at the class description of HashMap.

    /**
     * Hash table based implementation of the <tt>Map</tt> interface.  This
     * implementation provides all of the optional map operations, and permits
     * <tt>null</tt> values and the <tt>null</tt> key.  (The <tt>HashMap</tt>
     * class is roughly equivalent to <tt>Hashtable</tt>, except that it is
     * unsynchronized and permits nulls.)  This class makes no guarantees as to
     * the order of the map; in particular, it does not guarantee that the order
     * will remain constant over time.
     *
     * <p>This implementation provides constant-time performance for the basic
     * operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
     * disperses the elements properly among the buckets.  Iteration over
     * collection views requires time proportional to the "capacity" of the
     * <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
     * of key-value mappings).  Thus, it's very important not to set the initial
     * capacity too high (or the load factor too low) if iteration performance is
     * important.
     *
     * <p>An instance of <tt>HashMap</tt> has two parameters that affect its
     * performance: <i>initial capacity</i> and <i>load factor</i>.  The
     * <i>capacity</i> is the number of buckets in the hash table, and the initial
     * capacity is simply the capacity at the time the hash table is created.  The
     * <i>load factor</i> is a measure of how full the hash table is allowed to
     * get before its capacity is automatically increased.  When the number of
     * entries in the hash table exceeds the product of the load factor and the
     * current capacity, the hash table is <i>rehashed</i> (that is, internal data
     * structures are rebuilt) so that the hash table has approximately twice the
     * number of buckets.
     *
     * <p>As a general rule, the default load factor (.75) offers a good
     * tradeoff between time and space costs.  Higher values decrease the
     * space overhead but increase the lookup cost (reflected in most of
     * the operations of the <tt>HashMap</tt> class, including
     * <tt>get</tt> and <tt>put</tt>).  The expected number of entries in
     * the map and its load factor should be taken into account when
     * setting its initial capacity, so as to minimize the number of
     * rehash operations.  If the initial capacity is greater than the
     * maximum number of entries divided by the load factor, no rehash
     * operations will ever occur.
     *
     * <p>If many mappings are to be stored in a <tt>HashMap</tt>
     * instance, creating it with a sufficiently large capacity will allow
     * the mappings to be stored more efficiently than letting it perform
     * automatic rehashing as needed to grow the table.  Note that using
     * many keys with the same {@code hashCode()} is a sure way to slow
     * down performance of any hash table. To ameliorate impact, when keys
     * are {@link Comparable}, this class may use comparison order among
     * keys to help break ties.
     *
     * <p><strong>Note that this implementation is not synchronized.</strong>
     * If multiple threads access a hash map concurrently, and at least one of
     * the threads modifies the map structurally, it <i>must</i> be
     * synchronized externally.  (A structural modification is any operation
     * that adds or deletes one or more mappings; merely changing the value
     * associated with a key that an instance already contains is not a
     * structural modification.)  This is typically accomplished by
     * synchronizing on some object that naturally encapsulates the map.
     *
     * If no such object exists, the map should be "wrapped" using the
     * {@link Collections#synchronizedMap Collections.synchronizedMap}
     * method.  This is best done at creation time, to prevent accidental
     * unsynchronized access to the map:<pre>
     *   Map m = Collections.synchronizedMap(new HashMap(...));</pre>
     *
     * <p>The iterators returned by all of this class's "collection view methods"
     * are <i>fail-fast</i>: if the map is structurally modified at any time after
     * the iterator is created, in any way except through the iterator's own
     * <tt>remove</tt> method, the iterator will throw a
     * {@link ConcurrentModificationException}.  Thus, in the face of concurrent
     * modification, the iterator fails quickly and cleanly, rather than risking
     * arbitrary, non-deterministic behavior at an undetermined time in the
     * future.
     *
     * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
     * as it is, generally speaking, impossible to make any hard guarantees in the
     * presence of unsynchronized concurrent modification.  Fail-fast iterators
     * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
     * Therefore, it would be wrong to write a program that depended on this
     * exception for its correctness: <i>the fail-fast behavior of iterators
     * should be used only to detect bugs.</i>
     */

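Overall, HashMap has the following characteristics:

Keys and values may be null (one null key at most).
It is not thread-safe.
The capacity is always a power of two; a requested capacity is rounded up to the nearest one.
The default load factor of 0.75 is a good trade-off between time and space (the source comments justify the bin sizes with a Poisson distribution).
The default initial capacity is 16.
Load factor and initial capacity together determine how efficiently entries are looked up and stored, how evenly they are spread, and when the array is resized.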
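The first and third points can be tried out with a small sketch. The class name `HashMapBasicsSketch` and the helper `roundUpToPowerOfTwo` are made up for illustration; the helper only mimics the idea of rounding a requested capacity up to a power of two and is not the JDK's own code.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapBasicsSketch {
    // Same upper bound HashMap uses for its table size.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper: rounds a requested capacity up to the next power of two.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) return 1;
        if (requested >= MAXIMUM_CAPACITY) return MAXIMUM_CAPACITY;
        int highest = Integer.highestOneBit(requested);
        return (highest == requested) ? requested : highest << 1;
    }

    // Builds a map that stores a null key and a null value.
    static Map<String, String> mapWithNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key"); // a single null key is allowed
        map.put("k", null);                      // null values are allowed
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = mapWithNulls();
        System.out.println(map.get(null));
        System.out.println(map.containsKey("k"));
        System.out.println(roundUpToPowerOfTwo(17));
    }
}
```

Because `get` returns null both for a missing key and for a key mapped to null, `containsKey` is the way to tell the two cases apart.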

    /*
     * Implementation notes.
     *
     * This map usually acts as a binned (bucketed) hash table, but
     * when bins get too large, they are transformed into bins of
     * TreeNodes, each structured similarly to those in
     * java.util.TreeMap. Most methods try to use normal bins, but
     * relay to TreeNode methods when applicable (simply by checking
     * instanceof a node).  Bins of TreeNodes may be traversed and
     * used like any others, but additionally support faster lookup
     * when overpopulated. However, since the vast majority of bins in
     * normal use are not overpopulated, checking for existence of
     * tree bins may be delayed in the course of table methods.
     *
     * Tree bins (i.e., bins whose elements are all TreeNodes) are
     * ordered primarily by hashCode, but in the case of ties, if two
     * elements are of the same "class C implements Comparable<C>",
     * type then their compareTo method is used for ordering. (We
     * conservatively check generic types via reflection to validate
     * this -- see method comparableClassFor).  The added complexity
     * of tree bins is worthwhile in providing worst-case O(log n)
     * operations when keys either have distinct hashes or are
     * orderable, Thus, performance degrades gracefully under
     * accidental or malicious usages in which hashCode() methods
     * return values that are poorly distributed, as well as those in
     * which many keys share a hashCode, so long as they are also
     * Comparable. (If neither of these apply, we may waste about a
     * factor of two in time and space compared to taking no
     * precautions. But the only known cases stem from poor user
     * programming practices that are already so slow that this makes
     * little difference.)
     *
     * Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     *
     * The root of a tree bin is normally its first node.  However,
     * sometimes (currently only upon Iterator.remove), the root might
     * be elsewhere, but can be recovered following parent links
     * (method TreeNode.root()).
     *
     * All applicable internal methods accept a hash code as an
     * argument (as normally supplied from a public method), allowing
     * them to call each other without recomputing user hashCodes.
     * Most internal methods also accept a "tab" argument, that is
     * normally the current table, but may be a new or old one when
     * resizing or converting.
     *
     * When bin lists are treeified, split, or untreeified, we keep
     * them in the same relative access/traversal order (i.e., field
     * Node.next) to better preserve locality, and to slightly
     * simplify handling of splits and traversals that invoke
     * iterator.remove. When using comparators on insertion, to keep a
     * total ordering (or as close as is required here) across
     * rebalancings, we compare classes and identityHashCodes as
     * tie-breakers.
     *
     * The use and transitions among plain vs tree modes is
     * complicated by the existence of subclass LinkedHashMap. See
     * below for hook methods defined to be invoked upon insertion,
     * removal and access that allow LinkedHashMap internals to
     * otherwise remain independent of these mechanics. (This also
     * requires that a map instance be passed to some utility methods
     * that may create new nodes.)
     *
     * The concurrent-programming-like SSA-based coding style helps
     * avoid aliasing errors amid all of the twisty pointer operations.
     */

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
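The Poisson figures quoted in the implementation notes can be reproduced directly from the formula given there, exp(-0.5) * pow(0.5, k) / factorial(k). The sketch below does this; the class and method names are invented for this example.

```java
class BinSizePoissonSketch {
    // Probability that a bin holds exactly k nodes, assuming bin sizes follow
    // a Poisson distribution with mean lambda (0.5 in the HashMap notes).
    static double poisson(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Prints one line per bin size, in the same layout as the JDK comment.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(0.5, k));
        }
    }
}
```

With well-distributed hash codes, a bin reaching 8 nodes is expected in fewer than one case in ten million, which is why 8 is a reasonable point at which to pay for the larger TreeNodes.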

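Common HashMap interview questions:

What is HashMap's underlying data structure?
How do HashMap's get and put work?
What are the differences between Java 7 and Java 8? (In 1.7, resize inserts at the head of the singly linked list, so an element moved into a bucket always lands at the head of that bucket's list. Elements that shared one Entry chain in the old array may end up at different positions in the new array once their indices are recomputed. Under concurrent resizing this can produce a circular linked list. 1.8 inserts at the tail, which avoids this problem.)
Why is it not thread-safe? (The most common symptom under multithreading: there is no guarantee that a value just written with put is still the value returned by a later get.)
Which thread-safe classes can replace it?
What is the default initial size? Why that value? Why is the size always a power of two? (Even distribution and efficient access.)
How does HashMap resize? What is the load factor, and why that value?
What are HashMap's main parameters?
How does HashMap handle hash collisions?
How is the hash computed?
Fail-fast vs. fail-safe?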

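Underlying data structure:

In 1.8 the structure is an array plus linked lists. When a linked list grows longer than 8 nodes, it is converted to a red-black tree to speed up lookups.

Treeification is triggered once a list's length exceeds the default threshold of 8 (TREEIFY_THRESHOLD).

Inside treeifyBin(Node<K,V>[] tab, int hash) there is one more check: the table length must be at least 64 (MIN_TREEIFY_CAPACITY). If it is smaller than 64, the method simply resizes the table to twice its length instead of building a tree.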
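The two conditions can be summarized in a small decision function. This is a simplified sketch and not the JDK source: the constants carry the same values as HashMap's, but the class, the enum, and the `afterAppend` helper are hypothetical names used only for illustration.

```java
class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;     // same value as HashMap's constant
    static final int MIN_TREEIFY_CAPACITY = 64; // same value as HashMap's constant

    enum Action { KEEP_LIST, RESIZE, TREEIFY }

    // Hypothetical helper: what happens to a bin after a node was appended.
    // binLength is the list length including the new node.
    static Action afterAppend(int binLength, int tableLength) {
        if (binLength <= TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;            // list is still short enough
        }
        // The list is too long: small tables are resized, large ones treeified.
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(8, 64));
        System.out.println(afterAppend(9, 16));
        System.out.println(afterAppend(9, 64));
    }
}
```

Resizing first is the cheaper remedy: in a small table, a long list is more likely caused by the table being too small than by genuinely colliding hash codes.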

Let's look at Node:

    /**
     * Basic hash bin node, used for most entries.  (See below for
     * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

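Resizing

When the number of entries exceeds the load factor multiplied by the current capacity (not just the initial capacity), the table is doubled in size and the existing entries are redistributed into the new array.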
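A minimal sketch of the arithmetic involved, assuming the power-of-two indexing `hash & (capacity - 1)` that HashMap uses. The class and method names are illustrative only. It shows the resize threshold for the defaults and that, after doubling, an entry either keeps its index or moves up by exactly the old capacity.

```java
class ResizeSketch {
    // Number of entries above which the table is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Bucket index for a (already spread) hash in a power-of-two table.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int newCap = oldCap << 1; // doubling keeps the capacity a power of two

        // With the defaults the threshold is 12, so adding the 13th entry resizes.
        System.out.println("threshold = " + threshold(oldCap, 0.75f));

        // Hashes 5 and 21 share bucket 5 in the old table.
        // In the new table, 5 stays at index 5 and 21 moves to 5 + oldCap.
        for (int hash : new int[] {5, 21}) {
            System.out.println(hash + ": " + indexFor(hash, oldCap)
                    + " -> " + indexFor(hash, newCap));
        }
    }
}
```

Which of the two positions an entry takes depends only on the bit `hash & oldCap`, so the 1.8 resize can split a bucket into a "low" and a "high" list without recomputing any hash codes.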

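Thread-safe alternatives

Collections.synchronizedMap, Hashtable, and ConcurrentHashMap. ConcurrentHashMap offers the highest concurrency: 1.7 uses segment locks, while 1.8 relies on CAS and synchronized. The 1.8 source still defines a Segment class, but only to stay compatible with the serialized form.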

    /**
     * Stripped-down version of helper class used in previous version,
     * declared for the sake of serialization compatibility.
     */
    static class Segment<K,V> extends ReentrantLock implements Serializable {
        final float loadFactor;
        Segment(float lf) { this.loadFactor = lf; }
    }
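As a rough illustration of how these alternatives are used, the sketch below lets several threads increment one counter through `Map.merge`, which is performed atomically by both ConcurrentHashMap and the wrapper returned by Collections.synchronizedMap. The class name, the method name, and the "hits" key are invented for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadSafeMapsSketch {
    // Starts `threads` workers that each add 1 to the "hits" counter `perThread` times.
    static int countConcurrently(Map<String, Integer> counts, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(new ConcurrentHashMap<>(), 4, 1000));
        System.out.println(countConcurrently(
                Collections.synchronizedMap(new HashMap<>()), 4, 1000));
    }
}
```

Passing a plain HashMap to the same method is not safe: updates can be lost or the map can be corrupted, which is exactly the problem these classes are meant to solve.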

Handling hash collisions

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
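To see what the spreading step buys, the sketch below compares bucket indices with and without it for keys that differ only in their upper 16 bits. `spread` repeats the expression from `hash(Object)` above; `index` applies the power-of-two mask `(n - 1) & hash`. The class name and the sample keys are chosen for illustration.

```java
class HashSpreadSketch {
    // Same expression as HashMap.hash(Object): XOR the high 16 bits into the low 16.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index in a table whose length is a power of two.
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        // These Integer keys differ only in bits above the mask of a 16-slot table.
        int[] keys = {0x10000, 0x20000, 0x30000};
        for (int key : keys) {
            int raw = Integer.valueOf(key).hashCode();
            System.out.println(Integer.toHexString(key)
                    + " raw index=" + index(raw, 16)
                    + " spread index=" + index(spread(key), 16));
        }
    }
}
```

Without spreading, all three keys fall into bucket 0; with it, they land in buckets 1, 2 and 3.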

Fail-fast vs. fail-safe

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while (i != size && modCount == expectedModCount) {
                consumer.accept((E) elementData[i++]);
            }
            // update once at end of iteration to reduce heap write traffic
            cursor = i;
            lastRet = i - 1;
            checkForComodification();
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

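When remove deletes an element from a HashMap, modCount is updated, and so is it by every other method that structurally modifies the map. When an iterator is created, this value is copied into expectedModCount; while the iterator is in use, it checks whether modCount has changed. If it has, the iterator throws ConcurrentModificationException. (The listing above is ArrayList's iterator; HashMap's iterators perform the same modCount check.)

// Adding or removing entries directly on a Hashtable during iteration raises an error.
// Non-concurrent collections such as Hashtable and HashMap fail fast if entries are added or removed during iteration (the exception is thrown as soon as the modification is detected); merely replacing the value of an existing key does not throw.
// java.util.ConcurrentModificationException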
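The three cases described above can be reproduced in a single thread. This is a small sketch with made-up class and method names; it relies only on the documented behavior of HashMap's fail-fast iterators.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastSketch {
    static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removing through the map itself while a for-each loop is running.
    static boolean removeThroughMapThrows() {
        Map<String, Integer> map = sampleMap();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator keeps expectedModCount in sync.
    static int removeThroughIterator() {
        Map<String, Integer> map = sampleMap();
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
        return map.size();
    }

    // Replacing the value of an existing key is not a structural modification.
    static boolean updateValuesThrows() {
        Map<String, Integer> map = sampleMap();
        try {
            for (String key : map.keySet()) {
                map.put(key, 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("remove via map throws: " + removeThroughMapThrows());
        System.out.println("size after iterator.remove: " + removeThroughIterator());
        System.out.println("value update throws: " + updateValuesThrows());
    }
}
```

As the class documentation stresses, the exception is thrown on a best-effort basis, so it should be treated as a bug detector and not as a control-flow mechanism.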

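A fail-safe iterator works on a copy of the underlying collection, so it is not affected by modifications to the source collection.

A container that uses the fail-safe mechanism does not traverse its contents directly. It first copies the current contents and then iterates over the copy.

How it works: because the iteration runs over a copy of the original collection, modifications made to the original during the traversal cannot be detected by the iterator, so ConcurrentModificationException is never triggered.

Drawback: copying avoids ConcurrentModificationException, but the iterator cannot see the modified contents either. It traverses the copy taken at the moment iteration started and is unaware of any changes made to the original collection in the meantime.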

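Scenario: the containers in java.util.concurrent do not throw ConcurrentModificationException during iteration and can be used and modified concurrently from multiple threads. Strictly speaking, only the copy-on-write containers (such as CopyOnWriteArrayList) iterate over a snapshot; ConcurrentHashMap's iterators are weakly consistent and may or may not reflect modifications made after the iterator was created.

The general-purpose collection classes in java.util (HashMap, ArrayList, and so on) are fail-fast, while the collections in java.util.concurrent are not.
A fail-fast iterator throws ConcurrentModificationException, whereas a fail-safe iterator never does.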
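The snapshot behavior is easy to observe with CopyOnWriteArrayList, whose iterator works on the array as it was when the iterator was created. The following is a minimal single-threaded sketch with invented class and method names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIterationSketch {
    // Iterates over the list while appending to it; returns what the loop saw.
    static List<String> iterateWhileAdding(List<String> source) {
        List<String> seen = new ArrayList<>();
        for (String s : source) {
            seen.add(s);
            source.add(s + "!"); // modifies the list, not the iterator's snapshot
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> source = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        System.out.println("seen by the loop: " + iterateWhileAdding(source));
        System.out.println("list afterwards:  " + source);
    }
}
```

The loop sees only the two original elements and throws nothing, while the list itself ends up with four elements. The same loop over an ArrayList would fail with ConcurrentModificationException.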

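Collections are traversed with an Iterator. An iterator does not run in a separate thread and holds no lock; it works directly on the underlying collection and records the collection's modCount in expectedModCount when it is created. If the collection is then structurally modified by anything other than the iterator itself, the two counts no longer match, and under the fail-fast principle the iterator throws java.util.ConcurrentModificationException on its next step. In short, the collection being iterated must not be structurally changed while a fail-fast iterator is in use.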

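`HashMap`'s lack of thread safety shows up mainly in the following two ways:

1. In JDK 1.7, concurrent resize operations can produce a circular linked list and lose data.
2. In JDK 1.8, concurrent put operations can overwrite each other's data, and infinite loops remain possible.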
