Map (三)：Hashtable 解析

写在前面

由于个人水平有限，故部分内容可能会出现错误，还请包涵

Hashtable

1
2
3

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable

Hashtable 是一个散列表，它存储的内容是键值对 (key-value) 映射

Hashtable 继承于 Dictionary，实现了 Map、Cloneable、java.io.Serializable 接口

Hashtable 的函数都是同步的，这意味着它是线程安全的。它的 key、value 都不可以为 null

构造函数

Hashtable 中的构造方法核心为

public Hashtable(int initialCapacity, float loadFactor) {

    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
                                           
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load: "+loadFactor);

    if (initialCapacity==0)
        initialCapacity = 1;
        
    this.loadFactor = loadFactor;
    
    table = new Entry<?,?>[initialCapacity];
    
    threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    
}

其中 initialCapacity 的默认初始值为 11，loadFactor 的默认值为 0.75，而 HashMap 的 initialCapacity 默认初始值为 16。那么为什么要这样设置呢？
因为 Hashtable 中的确定某个元素在 table 数组中 index 的方式为取模，而对质数取模的分散效果更好

1	int index = (hash & 0x7FFFFFFF) % tab.length;

而 HashMap 中确定 index 的方式是位或运算

1 2	// n 为 table 数组长度 i = (n - 1) & hash

而 hash 值的计算是需要进行二次 hash

static final int hash(Object key) {

    int h;
    
    // h = key.hashCode() 为第一次 hash
    // 再与 (h >>> 16) 异或运算为第二次 hash
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    
}

为什么 Hashtable 和 HashMap 方法不采用相同的求 index 的方法，原因暂时不清楚

rehash

Hashtable 中的 rehash() 方法相当于 HashMap 中的 resize() 方法，不同的是 Hashtable 每次扩容后新数组的容量是原数组容量的两倍加一

protected void rehash() {

    int oldCapacity = table.length;
    
    Entry<?,?>[] oldMap = table;

    // overflow-conscious code
    // 数组扩容
    int newCapacity = (oldCapacity << 1) + 1;
    
    if (newCapacity - MAX_ARRAY_SIZE > 0) {
    
        if (oldCapacity == MAX_ARRAY_SIZE)
            // Keep running with MAX_ARRAY_SIZE buckets
            return;
            
        newCapacity = MAX_ARRAY_SIZE;
        
    }
    
    Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

    modCount++;
    
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    
    table = newMap;

    // 将原数组中的 entry 映射到新数组中
    for (int i = oldCapacity ; i-- > 0 ;) {
    
        for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
        
            Entry<K,V> e = old;
            
            old = old.next;

            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            
            // 将新 entry 放在链表的头部
            e.next = (Entry<K,V>)newMap[index];
            
            newMap[index] = e;
            
        } 
        
    }
    
}

put

public synchronized V put(K key, V value) {

    // Make sure the value is not null
    if (value == null) {
    
        throw new NullPointerException();
        
    }

    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    
    int hash = key.hashCode();
    
    int index = (hash & 0x7FFFFFFF) % tab.length;
    
    @SuppressWarnings("unchecked")
    
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    
    for(; entry != null ; entry = entry.next) {
        
        // 若已存在相等的 key 值，则将新的 value 替换旧的 value
        if ((entry.hash == hash) && entry.key.equals(key)) {
        
            V old = entry.value;
            
            entry.value = value;
            
            return old;
            
        }
        
    }

    // 若不存在相等的 key 值，则加入新的 entry 元素
    addEntry(hash, key, value, index);
    
    return null;
    
}

addEntry 方法源码如下

private void addEntry(int hash, K key, V value, int index) {

    modCount++;

    Entry<?,?> tab[] = table;
    
    // count 相当于 HashMap 中的 size
    // 先扩容再加入新 entry
    // 对比 HashMap 先加新 entry 再扩容
    // 而 HashMap 对应的判断条件为 ++size > threshold，注意没有 = 号
    if (count >= threshold) {
    
        // Rehash the table if the threshold is exceeded
        rehash();

        tab = table;
        
        hash = key.hashCode();
        
        index = (hash & 0x7FFFFFFF) % tab.length;
        
    }

    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    
    // 使用头插法
    tab[index] = new Entry<>(hash, key, value, e);
    
    count++;
    
}

get

get 方法比较简单，就不做介绍了，只要知道这个方法有 synchronized 关键字修饰

public synchronized V get(Object key) {

    Entry<?,?> tab[] = table;
    
    int hash = key.hashCode();
    
    int index = (hash & 0x7FFFFFFF) % tab.length;
    
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
    
        if ((e.hash == hash) && e.key.equals(key)) {
        
            return (V)e.value;
            
        }
        
    }
    
    return null;
    
}

remove

remove 方法本质上也就是链表删除一个节点，比较好理解

public synchronized V remove(Object key) {

    Entry<?,?> tab[] = table;
    
    int hash = key.hashCode();
    
    int index = (hash & 0x7FFFFFFF) % tab.length;
    
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>)tab[index];
    
    for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
    
        if ((e.hash == hash) && e.key.equals(key)) {
        
            modCount++;
            
            if (prev != null) {
            
                prev.next = e.next;
                
            } else {
            
                tab[index] = e.next;
                
            }
            
            count--;
            
            V oldValue = e.value;
            
            e.value = null;
            
            return oldValue;
            
        }
        
    }
    
    return null;
    
}

同时，remove 有一个重载版本，其作用是只有当 key 和 value 都相等时才删除该 entry

遍历

除了 Map 接口中定义的用于遍历的三种方式：entrySet()、keySet()、values()，Hashtable 中还定义了 keys() 方法，返回值类型为 Enumeration

public synchronized Enumeration<K> keys() {

    return this.<K>getEnumeration(KEYS);
    
}

// Types of Enumerations/Iterations
private static final int KEYS = 0;
private static final int VALUES = 1;
private static final int ENTRIES = 2;

而查看 entrySet()、keySet()、values() 的源码会发现这三个方法内部都调用了 getIterator(int type) 方法，而不管是 getEnumeration(int type) 还是 getIterator(int type) 方法，只要 table 内的元素个数不为 0，最终都会返回一个 Enumerator 对象

// Enumrator 同时实现了 Enumration 接口和 Iterator 接口
// 故 Enumrator 对象既可以被看作 Enumeration 对象，也可以被看作 Iterator 对象
private class Enumerator<T> implements Enumeration<T>, Iterator<T> {

    Entry<?,?>[] table = Hashtable.this.table;
    
    int index = table.length;
    
    Entry<?,?> entry;
    
    Entry<?,?> lastReturned;
    
    int type;

    /**
     * Indicates whether this Enumerator is serving as an Iterator
     * or an Enumeration.  (true -> Iterator).
     */
    boolean iterator;

    /**
     * The modCount value that the iterator believes that the backing
     * Hashtable should have.  If this expectation is violated, the iterator
     * has detected concurrent modification.
     */
    protected int expectedModCount = modCount;

    Enumerator(int type, boolean iterator) {
    
        this.type = type;
        
        this.iterator = iterator;
        
    }

    public boolean hasMoreElements() {
    
        Entry<?,?> e = entry;
        
        int i = index;
        
        Entry<?,?>[] t = table;
        
        /* Use locals for faster loop iteration */
        // 从后往前遍历 table 数组
        while (e == null && i > 0) {
        
            e = t[--i];
            
        }
        
        // 若 e == null，则将 table 数组中前一个位置的元素赋给 entry
        // 若 e != null，则 entry 不发生改变
        // 而 entry 在链表中的遍历在 nextElement() 方法中实现
        entry = e;
        
        index = i;
        
        return e != null;
        
    }

    @SuppressWarnings("unchecked")
    public T nextElement() {
    
        Entry<?,?> et = entry;
        
        int i = index;
        
        Entry<?,?>[] t = table;
        
        /* Use locals for faster loop iteration */
        while (et == null && i > 0) {
        
            et = t[--i];
            
        }
        
        entry = et;
        
        index = i;
        
        if (et != null) {
        
            Entry<?,?> e = lastReturned = entry;
            
            // 将 entry 沿着链表往后移实现链表的遍历
            entry = e.next;
            
            return type == KEYS ? (T)e.key : (type == VALUES ? (T)e.value : (T)e);
            
        }
        
        throw new NoSuchElementException("Hashtable Enumerator");
        
    }

    // Iterator methods
    public boolean hasNext() {
    
        return hasMoreElements();
        
    }

    public T next() {
    
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
            
        return nextElement();
        
    }

    public void remove() {
        
        // Enumration 接口没有定义 remove() 方法
        if (!iterator)
            throw new UnsupportedOperationException();
            
        if (lastReturned == null)
            throw new IllegalStateException("Hashtable Enumerator");
            
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        synchronized(Hashtable.this) {
        
            Entry<?,?>[] tab = Hashtable.this.table;
            
            int index = (lastReturned.hash & 0x7FFFFFFF) % tab.length;

            @SuppressWarnings("unchecked")
            Entry<K,V> e = (Entry<K,V>)tab[index];
            
            for(Entry<K,V> prev = null; e != null; prev = e, e = e.next) {
            
                if (e == lastReturned) {
                
                    modCount++;
                    
                    expectedModCount++;
                    
                    if (prev == null)
                        tab[index] = e.next;
                        
                    else
                        prev.next = e.next;
                        
                    count--;
                    
                    lastReturned = null;
                    
                    return;
                    
                }
                
            }
            
            throw new ConcurrentModificationException();
            
        }
        
    }
    
}

除此之外，Hashtable 中还定义了 elements() 方法，其返回值为 Enumeration 对象，其中的元素为 hashtable 中的 values 值，核心内容依旧是调用 getEnumeration(int type) 方法。在之前的 Java List 解析中的 Vector 部分我们会发现 Vector 类也有相似的 elements() 方法，而 Vector 同样是线程安全的。

Hashtable 和 HashMap 的区别

Hashtable 继承自 Dictionary 类，而 HashMap 继承自 AbstractMap 类。Dictionary 类是任何可将键映射到相应值的类的抽象父类，而 AbstractMap 是基于 Map 接口的骨干实现，它以最大限度地减少实现此接口所需的工作

HashMap 可以允许存在一个为 null的的 key 和任意个为 null 的 value，但是 Hashtable 中的 key 和 value 都不允许为 null

HashMap 是单线程安全的，Hashtable 是多线程安全的

HashMap 仅支持 Iterator 的遍历方式，而 Hashtable 支持 Iterator 和 Enumeration 两种遍历方式

更细微的区别有：

HashMap 的底层数据结构是数组 + 链表 + 红黑树，而 Hashtable 的底层数据结构是数组 + 链表

HashMap 默认初始容量为 16，Hashtable 默认初始容量为 11

HashMap 是先加入新节点（新节点放在链表尾部）再扩容（扩容之后新数组容量为之前的两倍），而 Hashtable 是先扩容（扩容之后新数组为之前的两倍加一）再加入新节点（新节点放在链表头部）

HashMap 求某个 key 值在数组中的 index 采用的是

1	int index = hash(key) & (tab.length - 1);

而 Hashtable 采用的是

1	int index = (hash & 0x7FFFFFFF) % tab.length;

由于求取 index 的方法不同，HashMap 原数组中 i 位置的元素扩容之后只会映射到新数组中 i 位置或 i + oldTab.length 位置，而 Hashtable 原数组中 i 位置的元素扩容之后映射到新数组中的位置和 i 无确定关系