【Java集合类篇】HashMap的数据结构是怎样的?

HashMap的数据结构是怎样的?

✔️HashMap的数据结构
- [✔️ 数组](#✔️ 数组)
- [✔️ 链表](#✔️ 链表)

✔️HashMap的数据结构

在Java中，保存数据有两种比较简单的数据结构: 数组和链表（或红黑树）。

HashMap是 Java 中常用的数据结构，它实现了 Map 接口。HashMap通过键值对的形式存储数据，其中键是唯一的，而值可以是任何对象。HashMap底层使用数组和链表（或红黑树）来实现。

常用的哈希函数的冲突解决办法中有一种方法叫做链地址法，其实就是将数组和链表组合在一起，发挥了两者的优势，我们可以将其理解为链表的数组。在JDK 1.8之前，HashMap就是通过这种结构来存储数据的。

我们可以从上图看到，左边很明显是个数组，数组的每个成员是一个链表。该数据结构所容纳的所有元素均包含一个指针，用于元素间的链接。我们根据元素的自身特征把元素分配到不同的链表中去，反过来我们也正是通过这些特征找到正确的链表，再从链表中找出正确的元素。其中，根据元素特征计算元素数组下标的方法就是哈希算法，即本文的主角 hash() 函数 (当然，还包括indexOf()函数)。

✔️ 数组

数组：HashMap使用一个数组来存储键值对。数组的每个元素都是一个桶（bucket），桶中存储着一个链表（LinkedList）或红黑树（TreeMap）。桶的数量可以根据需要动态调整。数组的索引方式采用哈希算法，通过将键的哈希值对数组长度取模来得到对应的桶。

数组的特点是:寻址容易，插入和删除困难。

看一个如何使用数组实现HashMap的代码片段：

java 复制代码

public class MyHashMap<K, V> { 
	// 默认初始容量  
    private static final int DEFAULT_INITIAL_CAPACITY = 16;  
     // 默认加载因子  
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;  
     // 存储键值对的数组 
    private Entry<K, V>[] table;  
    // 当前容量 
    private int capacity; 
    // 实际存储的键值对数量   
    private int size;  
    
    public MyHashMap() {  
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);  
    }  
  
    public MyHashMap(int capacity) {  
        this(capacity, DEFAULT_LOAD_FACTOR);  
    }  
  
    public MyHashMap(int capacity, float loadFactor) {  
        this.capacity = capacity;  
        table = new Entry[capacity];  
        size = 0;  
    }  
  
    public V put(K key, V value) {  
        int hash = hash(key);  
        int index = indexFor(hash, table.length);  
        Entry<K, V> oldValue = table[index];  
        if (oldValue == null) {  
            table[index] = new Entry<>(key, value, null);  
            size++;  
            if (size > capacity * loadFactor) {  
                rehash();  
            }  
        } else {  
            Entry<K, V> newEntry = new Entry<>(key, value, oldValue);  
            table[index] = newEntry;  
        }  
        return oldValue == null ? null : oldValue.value;  
    }  
  
    public V get(K key) {  
        int hash = hash(key);  
        int index = indexFor(hash, table.length);  
        Entry<K, V> entry = table[index];  
        if (entry != null && Objects.equals(entry.key, key)) {  
            return entry.value;  
        } else {  
            return null;  
        }  
    }  
  
    public int size() {  
        return size;  
    }  
  
    private int hash(Object key) {  
        return Objects.hashCode(key);  
    }  
  
    private int indexFor(int hash, int length) {  
        return hash % length;  
    }  
  
    private void rehash() {  
        Entry<K, V>[] oldTable = table;  
        int oldCapacity = oldTable.length;  
        int newCapacity = oldCapacity * 2;  
        Entry<K, V>[] newTable = new Entry[newCapacity];  
        for (Entry<K, V> oldEntry : oldTable) {  
            while (oldEntry != null) {  
                Entry<K, V> next = oldEntry.next;  
                int hash = hash(oldEntry.key);  
                int index = indexFor(hash, newCapacity);  
                oldEntry.next = newTable[index];  
                newTable[index] = oldEntry;  
                oldEntry = next;  
            }  
        }  
        table = newTable;  
        capacity = newCapacity;  
    }  
}

✔️ 链表

链表：当多个键的哈希值映射到同一个桶时，它们会形成一个链表。链表中的每个节点包含一个键值对和指向下一个节点的指针。链表的作用是在插入、删除和查找操作时解决哈希冲突。

链表的特点是: 寻址困难，插入和删除容易

看一个如何使用链表实现HashMap的代码片段，是一个简单的HashMap实现，使用链表来处理哈希冲突：

java 复制代码

public class MyHashMap<K, V> {  
    private static class Entry<K, V> {  
        K key;  
        V value;  
        Entry<K, V> next;  
  
        Entry(K key, V value, Entry<K, V> next) {  
            this.key = key;  
            this.value = value;  
            this.next = next;  
        }  
    }  
  
    private Entry<K, V>[] table;  
    private int capacity;  
    private int size;  
    private float loadFactor;  
  
    public MyHashMap(int capacity, float loadFactor) {  
        if (capacity < 0) throw new IllegalArgumentException("Capacity must be non-negative");  
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Load factor must be positive");  
        this.capacity = capacity;  
        this.loadFactor = loadFactor;  
        table = new Entry[capacity];  
        size = 0;  
    }  
  
    public V put(K key, V value) {  
        if (key == null) return null; // HashMaps don't allow null keys  
  
        // If size exceeds load factor * capacity, rehash  
        if (size >= capacity * loadFactor) {  
            rehash();  
        }  
  
        int hash = hash(key);  
        int index = indexFor(hash, capacity);  
  
        Entry<K, V> entry = table[index];  
        if (entry == null) {  
            // No collision, create new entry  
            table[index] = new Entry<>(key, value, null);  
            size++;  
        } else {  
            // Collision occurred, handle it using chaining  
            while (entry != null && !entry.key.equals(key)) {  
                if (entry.next == null) {  
                    // End of chain, insert new entry  
                    entry.next = new Entry<>(key, value, null);  
                    size++;  
                    break;  
                }  
                entry = entry.next;  
            }  
  
            // If key already exists, update value  
            if (entry != null && entry.key.equals(key)) {  
                V oldValue = entry.value;  
                entry.value = value;  
                return oldValue;  
            }  
        }  
  
        return null; // If key was new or not found  
    }  
  
    public V get(K key) {  
        if (key == null) return null; // HashMaps don't allow null keys  
  
        int hash = hash(key);  
        int index = indexFor(hash, capacity);  
  
        Entry<K, V> entry = table[index];  
        while (entry != null && !entry.key.equals(key)) {  
            entry = entry.next;  
        }  
  
        return entry == null ? null : entry.value;  
    }  
  
    private void rehash() {  
        capacity *= 2;  
        Entry<K, V>[] oldTable = table;  
        table = new Entry[capacity];  
        size = 0;  
  
        for (Entry<K, V> entry : oldTable) {  
            while (entry != null) {  
                Entry<K, V> next = entry.next;  
                int hash = hash(entry.key);  
                int index = indexFor(hash, capacity);  
                entry.next = table[index];  
                table[index] = entry;  
                size++;  
                entry = next;  
            }  
        }  
    }  
  
    private int hash(K key) {  
        return Math.abs(key.hashCode()) % capacity;  
    }  
  
    private int indexFor(int hash, int length) {  
        return hash % length;  
    }  
  
    public static void main(String[] args) {  
        MyHashMap<String, Integer> map = new MyHashMap<>(16, 0.75f);  
        map.put("one", 1);  
        map.put("two", 2);  
        map.put("three", 3);  
  
        System.out.println(map.get("one"));    // Should print 1  
        System.out.println(map.get("two"));    // Should print 2  
        System.out.println(map.get("three"));  //Should print 3

在JDK 1.8中为了解决因hash冲突导致某个链表长度过长，影响 put 和 get 的效率，引入了红黑树。

关于红黑树，下一篇会作为单独的博文进行更新。