/**
 * Hash table based implementation of the <tt>Map</tt> interface.  This
 * implementation provides all of the optional map operations, and permits
 * <tt>null</tt> values and the <tt>null</tt> key.  (The <tt>HashMap</tt>
 * class is roughly equivalent to <tt>Hashtable</tt>, except that it is
 * unsynchronized and permits nulls.)  This class makes no guarantees as to
 * the order of the map; in particular, it does not guarantee that the order
 * will remain constant over time.
 * 👆HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional Map operations and permits null values and the null key. HashMap is roughly equivalent to Hashtable, except that HashMap is unsynchronized and accepts nulls. The class makes no guarantee about the internal order of its entries, nor that the order stays constant over time.
 *
 * <p>This implementation provides constant-time performance for the basic
 * operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
 * disperses the elements properly among the buckets.  Iteration over
 * collection views requires time proportional to the "capacity" of the
 * <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
 * of key-value mappings).  Thus, it's very important not to set the initial
 * capacity too high (or the load factor too low) if iteration performance
 * is important.
 * 👆HashMap offers constant-time performance for the two basic operations, get and put, assuming the hash function spreads the elements evenly among the buckets. Iterating over the collection views takes time proportional to the capacity of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, do not set the initial capacity too high, or the load factor too low.
 *
 * <p>As a general rule, the default load factor (.75) offers a good tradeoff
 * between time and space costs.  Higher values decrease the space overhead
 * but increase the lookup cost (reflected in most of the operations of the
 * <tt>HashMap</tt> class, including <tt>get</tt> and <tt>put</tt>).  The
 * expected number of entries in the map and its load factor should be taken
 * into account when setting its initial capacity, so as to minimize the
 * number of rehash operations.  If the initial capacity is greater
 * than the maximum number of entries divided by the load factor, no
 * rehash operations will ever occur.
 * 👆As a general rule, the default load factor (0.75) offers a good tradeoff between time and space costs. Higher values reduce the space overhead but increase the lookup cost (reflected in most operations of HashMap, including get and put). When setting the initial capacity, take the expected number of entries in the map and its load factor into account, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.
 *
 * <p>If many mappings are to be stored in a <tt>HashMap</tt> instance,
 * creating it with a sufficiently large capacity will allow the mappings to
 * be stored more efficiently than letting it perform automatic rehashing as
 * needed to grow the table.
 * 👆If many mappings will be stored in a HashMap instance, creating it with a sufficiently large capacity stores the mappings more efficiently than letting the table rehash and grow as needed.
 *
 * <strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access a hash map concurrently, and at least one of
 * the threads modifies the map structurally, it <i>must</i> be
 * synchronized externally.  (A structural modification is any operation
 * that adds or deletes one or more mappings; merely changing the value
 * associated with a key that an instance already contains is not a
 * structural modification.)  This is typically accomplished by
 * synchronizing on some object that naturally encapsulates the map.
 * 👆Note that HashMap is not thread-safe. If multiple threads access a HashMap concurrently, and at least one of them modifies the map structurally, the map must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key the instance already contains is not a structural modification. In other words, put() with a new key or any remove() is structural; put() that only overwrites the value of an existing key is not.) This is typically done by synchronizing on some object that naturally encapsulates the map.
 * If no such object exists, the map should be "wrapped" using the
 * {@link Collections#synchronizedMap Collections.synchronizedMap}
 * method.  This is best done at creation time, to prevent accidental
 * unsynchronized access to the map:<pre>
 *   Map m = Collections.synchronizedMap(new HashMap(...));</pre>
 * 👆If no such object exists, the map should be wrapped with the Collections.synchronizedMap() method. This is best done at creation time, to prevent accidental unsynchronized access to the map.
 *
 * <p>The iterators returned by all of this class's "collection view methods"
 * are <i>fail-fast</i>: if the map is structurally modified at any time after
 * the iterator is created, in any way except through the iterator's own
 * <tt>remove</tt> method, the iterator will throw a
 * {@link ConcurrentModificationException}.
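The wrapping recommended above can be sketched as a small runnable example (the class name and keys are illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    public static void main(String[] args) {
        // Wrap at creation time so no unsynchronized reference to the raw HashMap escapes.
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<String, Integer>());
        m.put("a", 1);  // single calls like put/get are synchronized by the wrapper

        // Iteration over a synchronized map still requires manual synchronization on the wrapper:
        synchronized (m) {
            for (Map.Entry<String, Integer> e : m.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```

Note that Collections.synchronizedMap only serializes individual calls; compound actions such as check-then-put or iteration still need the explicit synchronized block shown.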
 Thus, in the face of concurrent
 * modification, the iterator fails quickly and cleanly, rather than risking
 * arbitrary, non-deterministic behavior at an undetermined time in the
 * future.
 * 👆The iterators returned by all of this class's collection view methods are "fail-fast": if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove() method, the iterator throws a ConcurrentModificationException. Faced with concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at some undetermined point in the future.
 *
 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
 * as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification.  Fail-fast iterators
 * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness: <i>the fail-fast behavior of iterators
 * should be used only to detect bugs.</i>
 * 👆Note that the fail-fast behavior of an iterator cannot be guaranteed; generally speaking, it is impossible to make any hard guarantee in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore it would be wrong to rely on this exception for a program's correctness: the fail-fast behavior of iterators should be used only to detect bugs.
 */
public class HashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable
/**
 * The default initial capacity - MUST be a power of two.
 * The default initial capacity of a HashMap -- must be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16 (Also Known As; i.e. the initial value is 16)
/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 * The maximum capacity, used when a constructor with arguments implicitly specifies a larger value.
 * Must be a power of two, and no greater than 2^30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
 * The load factor used when none specified in constructor.
 * The default load factor, used when the constructor does not specify one.
 * (When the HashMap's size exceeds capacity * loadFactor, the HashMap grows -- capacity *= 2 -- and rehashes its entries.)
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * An empty table instance to share when the table is not inflated.
 * A shared empty table instance, used before the table has been inflated (allocated).
 */
static final Entry<?,?>[] EMPTY_TABLE = {};
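The interaction between the two defaults above is simple arithmetic: with the default capacity of 16 and load factor of 0.75, the first resize can be triggered once the map holds 12 entries. A quick check (the constants mirror the fields above):

```java
public class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;  // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    public static void main(String[] args) {
        // The table is resized once size reaches capacity * loadFactor.
        int threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        System.out.println(threshold);  // 12

        // After doubling the capacity, the threshold doubles as well.
        int nextThreshold = (int) (DEFAULT_INITIAL_CAPACITY * 2 * DEFAULT_LOAD_FACTOR);
        System.out.println(nextThreshold);  // 24
    }
}
```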
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 * The table itself, resized when necessary (i.e. when the number of entries / loadFactor >= table.length). Its length must always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
/**
 * The number of key-value mappings contained in this map.
 * The number of key-value pairs in the map.
 */
transient int size;
/**
 * The next size value at which to resize (capacity * load factor).
 * @serial
 * The size at which the next resize is triggered (capacity * load factor).
 */
int threshold;
/**
 * The load factor for the hash table.
 * The load factor of the table.
 * @serial
 */
final float loadFactor;
/**
 * This method is invoked whenever the value in an entry is
 * overwritten by an invocation of put(k,v) for a key k that's already
 * in the HashMap.
 */
void recordAccess(HashMap<K,V> m) {
}
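recordAccess is the hook invoked when put() overwrites the value of a key that is already in the map. Through the public API this path is visible as put() returning the old value while the size stays unchanged (a sketch against java.util.HashMap; names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class OverwriteDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<String, Integer>();
        m.put("k", 1);

        // Overwriting an existing key is not a structural modification:
        // put() returns the previous value and size is unchanged.
        Integer old = m.put("k", 2);
        System.out.println(old);         // 1
        System.out.println(m.size());    // 1
        System.out.println(m.get("k"));  // 2
    }
}
```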
/**
 * This method is invoked whenever the entry is
 * removed from the table.
 */
void recordRemoval(HashMap<K,V> m) {
}
}
/**
 * Returns index for hash code h.
 * Finds the index in the table for hash value h.
 */
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    // Integer.bitCount(length) counts the number of '1' bits in the binary representation of length,
    // so Integer.bitCount(length) == 1 asserts that length (the length of the HashMap's Entry<?,?>[]
    // table, i.e. the HashMap's capacity) is a power of two.
    // The return statement is a bitwise AND. Because length is a power of two, length - 1 has every
    // bit below the high bit set to 1 (e.g. if length is 8, binary 1000, then length - 1 is 7,
    // binary 0111), so h & (length-1) keeps only the low-order bits of h, which for a power-of-two
    // length is exactly h modulo length.
    return h & (length-1);
}
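Because length is a power of two, h & (length-1) computes the same bucket as h % length for non-negative h, but with a single AND instruction. A small check (the values are arbitrary):

```java
public class IndexForDemo {
    // The same bit trick as HashMap.indexFor
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;  // a power of two, so length - 1 is binary 1111
        for (int h = 0; h < 1000; h++) {
            // For non-negative h the mask is identical to the modulo
            if (indexFor(h, length) != h % length)
                throw new AssertionError("mismatch at h=" + h);
        }
        System.out.println(indexFor(37, 16));  // 5, same as 37 % 16
    }
}
```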
/**
 * Adds a new entry with the specified key, value and hash code to
 * the specified bucket.  It is the responsibility of this
 * method to resize the table if appropriate.
 * Adds a new entry with the given key, value and hash code to the specified bucket. Resizing the table when appropriate is also this method's responsibility.
 *
 * Subclass overrides this to alter the behavior of put method.
 * A subclass can override this method to change the behavior of put().
 */
void addEntry(int hash, K key, V value, int bucketIndex) {
    // Resize only when the current size has reached the threshold AND the target bucket is occupied.
    // (Once the table reaches MAXIMUM_CAPACITY, resize() sets threshold to Integer.MAX_VALUE and
    // returns without growing, so the table never grows again; put() then simply keeps chaining
    // new entries into the existing buckets.)
    if ((size >= threshold) && (null != table[bucketIndex])) {
        // Double the capacity
        resize(2 * table.length);
        // A null key hashes to 0; otherwise recompute the hash via hash()
        hash = (null != key) ? hash(key) : 0;
        // Compute the bucket index the entry will occupy in the resized table
        bucketIndex = indexFor(hash, table.length);
    }
    // createEntry() actually creates and inserts the entry
    createEntry(hash, key, value, bucketIndex);
}
/**
 * Rehashes the contents of this map into a new array with a
 * larger capacity.  This method is called automatically when the
 * number of keys in this map reaches its threshold.
 * Rehashes the contents of the map into a new, larger array; called automatically when the number of keys in the map reaches the threshold.
 *
 * If current capacity is MAXIMUM_CAPACITY, this method does not
 * resize the map, but sets threshold to Integer.MAX_VALUE.
 * This has the effect of preventing future calls.
 * If the current capacity is already MAXIMUM_CAPACITY, this method does not resize the map; it just sets the threshold to Integer.MAX_VALUE, which prevents any future resize.
 *
 * @param newCapacity the new capacity, MUST be a power of two;
 *        must be greater than current capacity unless current
 *        capacity is MAXIMUM_CAPACITY (in which case value
 *        is irrelevant).
 */
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    // MAXIMUM_CAPACITY = 2^30, i.e. oldCapacity can be at most 2^30
    if (oldCapacity == MAXIMUM_CAPACITY) {
        // Integer.MAX_VALUE = 2^31 - 1; no resize can ever happen after this
        threshold = Integer.MAX_VALUE;
        return;
    }
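A useful consequence of the power-of-two sizing: when the table doubles, an entry's index under indexFor either stays the same or moves up by exactly oldCapacity, decided by one extra low-order bit of the hash. A small illustration (the sample hash values are made up):

```java
public class ResizeIndexDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        int[] hashes = { 5, 21, 100, 0x7fffffff };  // arbitrary sample hashes
        for (int h : hashes) {
            int oldIdx = indexFor(h, oldCap);
            int newIdx = indexFor(h, newCap);
            // Doubling exposes one more bit of the hash (the oldCap bit),
            // so the entry lands either in the same slot or oldCap slots higher.
            if (newIdx != oldIdx && newIdx != oldIdx + oldCap)
                throw new AssertionError("unexpected index for hash " + h);
            System.out.println(h + ": " + oldIdx + " -> " + newIdx);
        }
    }
}
```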
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);
    return null == entry ? null : entry.getValue();
}
/**
 * Offloaded version of get() to look up null keys.  Null keys map
 * to index 0.  This null case is split out into separate methods
 * for the sake of performance in the two most commonly used
 * operations (get and put), but incorporated with conditionals in
 * others.
 * A null key maps to index 0 of the map's Entry array. For better performance, the null-key case is split out into separate methods for the two most commonly used operations (get and put), while other methods handle it inline with conditionals.
 */
private V getForNullKey() {
    if (size == 0) {
        return null;
    }
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}
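From the caller's side the null key behaves like any other unique key; internally it always lives in bucket 0. A quick check with the public API (keys and values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<String, String>();
        map.put(null, "first");
        map.put(null, "second");  // the null key is unique, so this overwrites
        System.out.println(map.get(null));  // second
        System.out.println(map.size());     // 1

        map.put("k", null);  // null values are allowed as well
        System.out.println(map.containsKey("k"));  // true
    }
}
```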
final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key);
    // Walk the singly linked list at the computed index
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}
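getEntry compares the cached hash first and then equals(), because distinct keys can collide into the same bucket. The strings "Aa" and "BB" are a classic pair with identical hash codes in Java:

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    public static void main(String[] args) {
        // "Aa" and "BB" hash to the same value (2112), so they can share a bucket.
        System.out.println("Aa".hashCode() == "BB".hashCode());  // true

        Map<String, Integer> m = new HashMap<String, Integer>();
        m.put("Aa", 1);
        m.put("BB", 2);
        // Within a chain, equals() on the keys disambiguates the entries.
        System.out.println(m.get("Aa"));  // 1
        System.out.println(m.get("BB"));  // 2
    }
}
```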
/**
 * Returns a {@link Set} view of the mappings contained in this map.
 * The set is backed by the map, so changes to the map are
 * reflected in the set, and vice-versa.  If the map is modified
 * while an iteration over the set is in progress (except through
 * the iterator's own <tt>remove</tt> operation, or through the
 * <tt>setValue</tt> operation on a map entry returned by the
 * iterator) the results of the iteration are undefined.  The set
 * supports element removal, which removes the corresponding
 * mapping from the map, via the <tt>Iterator.remove</tt>,
 * <tt>Set.remove</tt>, <tt>removeAll</tt>, <tt>retainAll</tt> and
 * <tt>clear</tt> operations.  It does not support the
 * <tt>add</tt> or <tt>addAll</tt> operations.
 * 👆Returns a Set view (not a snapshot) of the mappings contained in the map. The set is backed by the map, so changes to the map are reflected in the set, and vice versa.
 * If the map is modified while an iteration over the set is in progress (except through the iterator's own remove(), or through setValue() on a map entry returned by the iterator), the results of the iteration are undefined.
 * The set supports element removal, which removes the corresponding mapping from the map, via Iterator.remove(), Set.remove(), removeAll(), retainAll() and clear(). It does not support the add() and addAll() operations.
 * @return a set view of the mappings contained in this map
 */
public Set<Map.Entry<K,V>> entrySet() {
    return entrySet0();
}

(Tested: inside a for (Map.Entry<String,String> m : map.entrySet()) loop, calling map.put() with a new key throws a ConcurrentModificationException: 👇
Exception in thread "main" java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
    at java.util.HashMap$EntryIterator.next(HashMap.java:962)
    at java.util.HashMap$EntryIterator.next(HashMap.java:960)
    at Test.main(Test.java:12)
Calling map.put() with a key that already exists in the map does not throw a ConcurrentModificationException.)
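The behavior observed in the test above can be reproduced as a self-contained example (the class name is illustrative):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<String, String>();
        map.put("a", "1");
        map.put("b", "2");

        boolean thrown = false;
        try {
            for (Map.Entry<String, String> e : map.entrySet()) {
                map.put("c", "3");  // structural modification during iteration
            }
        } catch (ConcurrentModificationException ex) {
            thrown = true;  // the iterator fails fast on the next call to next()
        }
        System.out.println(thrown);  // true

        // Overwriting an existing key is not structural, so iteration survives it:
        for (Map.Entry<String, String> e : map.entrySet()) {
            map.put("a", "changed");
        }
        System.out.println(map.get("a"));  // changed
    }
}
```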
private Set<Map.Entry<K,V>> entrySet0() {
    Set<Map.Entry<K,V>> es = entrySet;
    return es != null ? es : (entrySet = new EntrySet());
}
private final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public Iterator<Map.Entry<K,V>> iterator() {
        return newEntryIterator();
    }
    public boolean contains(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry<K,V> e = (Map.Entry<K,V>) o;
        Entry<K,V> candidate = getEntry(e.getKey());
        return candidate != null && candidate.equals(e);
    }
    public boolean remove(Object o) {
        return removeMapping(o) != null;
    }
    public int size() {
        return size;
    }
    public void clear() {
        HashMap.this.clear();
    }
}

// EntrySet extends AbstractSet, which in turn implements the Set<E> interface, so the iterator()
// method in EntrySet is the one we are looking for. Follow newEntryIterator(): 👇
Iterator<Map.Entry<K,V>> newEntryIterator() {
    return new EntryIterator();
}

private final class EntryIterator extends HashIterator<Map.Entry<K,V>> {
    public Map.Entry<K,V> next() {
        return nextEntry();
    }
}

// The picture is getting clearer; now look at nextEntry().
private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;        // next entry to return
    int expectedModCount;   // For fast-fail
    int index;              // current slot
    Entry<K,V> current;     // current entry
    HashIterator() {
        expectedModCount = modCount;
        if (size > 0) { // advance to first entry
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }
    public final boolean hasNext() {
        return next != null;
    }
    final Entry<K,V> nextEntry() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();
        if ((next = e.next) == null) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
        current = e;
        return e;
    }