A Deep Dive into ConcurrentHashMap's High-Performance Concurrency Mechanisms

Preface

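ConcurrentHashMap is one of the core components of the java.util.concurrent package and is designed for highly concurrent workloads. HashMap is not thread-safe, and Hashtable serializes every operation behind a single lock; ConcurrentHashMap combines fine-grained locking with lock-free techniques to stay thread-safe while keeping throughput high. This article walks through its implementation at the source level.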

1. Design Philosophy and Evolution

1.1 JDK 1.7: Segment Locking

In JDK 1.7, ConcurrentHashMap uses segment locking: an array of Segments, each holding its own HashEntry array:

public class ConcurrentHashMap<K, V> extends AbstractMap<K, V>
        implements ConcurrentMap<K, V>, Serializable {

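    // Array of Segments; each Segment is a small hash table with its own lock
    final Segment<K,V>[] segments;

    // Each Segment extends ReentrantLock and serves as the lock for its own table
    static final class Segment<K,V> extends ReentrantLock implements Serializable {
        transient volatile HashEntry<K,V>[] table;
        transient int count;      // accessed under the lock or after a volatile read
        transient int modCount;
        transient int threshold;
        final float loadFactor;
    }

    // HashEntry node structure
    static final class HashEntry<K,V> {
        final int hash;
        final K key;
        volatile V value;
        volatile HashEntry<K,V> next;   // volatile in JDK 1.7 (it was final in JDK 1.6)
    }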
}

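Lock refinement: each Segment is locked independently, so up to concurrencyLevel threads (16 by default) can write at the same time, provided they hit different Segments.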
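The idea of one lock per segment can be sketched independently of the JDK source. The class below is a hypothetical illustration (StripedMap and stripeFor are names invented here, not JDK code): it guards each stripe with its own ReentrantLock, so writers that hash to different stripes never block each other. Unlike the real JDK 1.7 implementation, this sketch also locks on reads, purely to keep it short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of lock striping; not the JDK 1.7 source. */
final class StripedMap<K, V> {
    private final ReentrantLock[] locks;   // one lock per stripe, like one per Segment
    private final Map<K, V>[] stripes;     // each stripe owns a small private table

    @SuppressWarnings("unchecked")
    StripedMap(int concurrencyLevel) {
        locks = new ReentrantLock[concurrencyLevel];
        stripes = (Map<K, V>[]) new Map[concurrencyLevel];
        for (int i = 0; i < concurrencyLevel; i++) {
            locks[i] = new ReentrantLock();
            stripes[i] = new HashMap<>();
        }
    }

    private int stripeFor(Object key) {
        // Mask off the sign bit so the index is never negative
        return (key.hashCode() & 0x7fffffff) % locks.length;
    }

    V put(K key, V value) {
        int s = stripeFor(key);
        locks[s].lock();                   // writers to other stripes are not blocked
        try {
            return stripes[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    V get(Object key) {
        int s = stripeFor(key);
        locks[s].lock();                   // simplification: this sketch locks reads too
        try {
            return stripes[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }
}
```

With new StripedMap<String, Integer>(16), at most 16 writers can make progress at once, which mirrors the concurrencyLevel limit described above.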

1.2 JDK 1.8: CAS + synchronized

JDK 1.8 drops segment locking entirely and relies on CAS plus synchronized instead:

// Node structure in JDK 1.8
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
}

// The bin array
transient volatile Node<K,V>[] table;

// Base element counter, and the control field for table initialization and resizing
private transient volatile long baseCount;
private transient volatile int sizeCtl;

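Key improvements:

synchronized replaces ReentrantLock, and the lock granularity shrinks to a single bin (the head node of the bin is the monitor)
CounterCell spreads counter updates across cells, which keeps counting and size() cheap under contention
Long bins are converted to red-black trees, as in HashMap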

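2. Lock-Free Operations with CAS

2.1 How CAS Works

CAS (Compare-And-Swap) is an atomic instruction provided by the CPU; the hardware guarantees that the comparison and the write happen as one indivisible step: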

// Pseudocode for a CAS operation; the hardware executes the whole body atomically
boolean compareAndSwap(Object obj, long offset, Object expected, Object newValue) {
    if (obj[offset] == expected) {
        obj[offset] = newValue;
        return true;
    }
    return false;
}
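In application code the same primitive is available through the java.util.concurrent.atomic classes. The sketch below shows the usual retry loop around AtomicInteger.compareAndSet; CasRetryDemo and updateMax are illustrative names chosen here, not part of ConcurrentHashMap.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasRetryDemo {
    /** Raises the stored maximum to at least candidate and returns the resulting maximum. */
    static int updateMax(AtomicInteger max, int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return current;                      // nothing to update
            }
            if (max.compareAndSet(current, candidate)) {
                return candidate;                    // our write won
            }
            // Another thread changed the value in between: reread and retry
        }
    }
}
```

A failed CAS does not block the thread; it only means the expected value was stale, so the loop rereads and tries again.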

2.2 CAS in ConcurrentHashMap

// 1. Initializing the table
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab;
    int sc;
    while ((tab = table) == null || tab.length == 0) {
        // sizeCtl < 0: another thread is already initializing, so back off
        if ((sc = sizeCtl) < 0)
            Thread.yield();
        // CAS sizeCtl to -1 to claim the right to initialize
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2);  // resize threshold = 0.75 * n
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}

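// 2. Inserting the first node of a bin with CAS (excerpt from putVal)
if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
    // CAS the array slot from null to the new node; no lock is taken
    if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
        break;
}

// 3. Volatile and CAS access to table slots
// Node.val and Node.next are plain volatile fields that are only written while
// the bin is locked; the Unsafe-based helpers below operate on the array slots.
@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}

static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}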
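The division of labor between CAS and locking can be tried out without Unsafe. The following is a minimal sketch under stated assumptions: TinyBinMap is a hypothetical fixed-size table with no resizing, no removal and no tree bins, and AtomicReferenceArray stands in for the tabAt and casTabAt helpers. An empty bin is filled with a single CAS; a non-empty bin is modified while holding the monitor of its head node.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Hypothetical fixed-size sketch of the per-bin strategy; not the JDK source. */
final class TinyBinMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(K key, V val) { this.key = key; this.val = val; }
    }

    // Stand-in for the Unsafe-based volatile read and CAS on array slots
    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    private int indexFor(Object key) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (table.length() - 1);
    }

    V put(K key, V value) {
        int i = indexFor(key);
        while (true) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bin: publish the first node with one CAS, no lock taken
                if (table.compareAndSet(i, null, new Node<>(key, value))) {
                    return null;
                }
                // CAS lost a race: loop again and take the locked path
            } else {
                synchronized (head) {
                    if (table.get(i) != head) {
                        continue;          // head changed while we waited: retry
                    }
                    for (Node<K, V> e = head; ; e = e.next) {
                        if (e.key.equals(key)) {
                            V old = e.val;
                            e.val = value;
                            return old;
                        }
                        if (e.next == null) {
                            e.next = new Node<>(key, value);
                            return null;
                        }
                    }
                }
            }
        }
    }

    V get(Object key) {
        // Reads take no lock; volatile fields make completed writes visible
        for (Node<K, V> e = table.get(indexFor(key)); e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.val;
            }
        }
        return null;
    }
}
```

The strings "Aa" and "BB" share the hash code 2112, so putting both exercises the locked path on a non-empty bin.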

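3. Fine-Grained Locking with synchronized

3.1 Why synchronized Is Still Needed

CAS is efficient, but it only makes an update to a single variable atomic. The following cases touch several fields at once and therefore take the bin lock:

Inserting into or deleting from a linked list or red-black tree
Resolving hash collisions, that is, writing to a bin that already has a head node
Migrating a bin during a resize

3.2 How the Lock Is Applied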

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;

    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f;
        int n, i, fh;

        // 1. Table not yet initialized: initialize it first
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();

        // 2. Empty bin: insert with CAS
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                break;
        }

        // 3. A resize is in progress: help with the transfer
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);

        // 4. Bin already has nodes: lock it
        else {
            V oldVal = null;
            // Only this bin is locked; operations on other bins proceed concurrently
            synchronized (f) {
                // Recheck that f is still the head of the bin
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        // Linked-list bin
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash && 
                                ((ek = e.key) == key || 
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key, value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        // Red-black tree bin
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }

            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
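Two small expressions in putVal carry a lot of weight: spread(key.hashCode()) and (n - 1) & hash. The sketch below reproduces them in a standalone class (SpreadDemo and indexFor are names chosen here for illustration). Spreading folds the high 16 bits into the low 16 bits so that small tables still benefit from them, and masking with HASH_BITS keeps ordinary hashes non-negative, which is why the fh >= 0 test can identify a linked-list bin: negative hash values are reserved for special nodes such as MOVED.

```java
final class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff;   // usable bits of a normal node hash

    /** Same formula as spread() in the JDK 1.8 ConcurrentHashMap. */
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    /** Bin index for a table whose length is a power of two. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int raw = 0x12340000;                  // all entropy sits in the high bits
        System.out.println(indexFor(raw, 16));           // prints 0
        System.out.println(indexFor(spread(raw), 16));   // prints 4
    }
}
```

Without spreading, every key whose hash differs only in the high bits would land in bin 0 of a 16-slot table; after spreading, 0x12340000 becomes 0x12341234 and maps to bin 4.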

4. Resizing

4.1 How Concurrent Resizing Works

Resizing in ConcurrentHashMap is carried out by multiple threads cooperatively and coordinated through sizeCtl:

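// Meaning of sizeCtl:
// -1: the table is being initialized
// other negative values: a resize is in progress; the high 16 bits hold the
//     resize stamp and the low 16 bits hold (1 + number of resizing threads)
// 0 or positive: the initial capacity before initialization, the resize threshold afterwards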

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
    int n = tab.length, stride;

    // Number of bins each thread claims at a time; at least MIN_TRANSFER_STRIDE (16)
    if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
        stride = MIN_TRANSFER_STRIDE;

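    // The thread that initiates the resize allocates nextTab
    if (nextTab == null) {
        try {
            @SuppressWarnings("unchecked")
            Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n << 1];
            nextTab = nt;
        } catch (Throwable ex) {
            // Allocation failed (for example OutOfMemoryError): give up resizing
            sizeCtl = Integer.MAX_VALUE;
            return;
        }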
        nextTable = nextTab;
        transferIndex = n;
    }

    int nextn = nextTab.length;

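    // Placeholder installed in every old bin once it has been migrated
    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
    boolean advance = true;
    boolean finishing = false;

    // Walk the old table and move each bin into the new one
    for (int i = 0, bound = 0;;) {
        Node<K,V> f;
        int fh;

        // Each thread claims a stride of bins by CASing transferIndex downward
        while (advance) {
            // ... task-claiming logic
        }

        if (i < 0 || i >= n || i + n >= nextn) {
            // ... completion logic (uses finishing); the last thread publishes nextTab as table
        }
        else if ((f = tabAt(tab, i)) == null)
            advance = casTabAt(tab, i, null, fwd);  // empty bin: just mark it as moved
        else if ((fh = f.hash) == MOVED)
            advance = true;                         // already migrated
        else {
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    // Split the list or tree between bins i and i + n of the new table.
                    // As in HashMap, the bit (hash & n) decides which of the two.
                    if (fh >= 0) {
                        int runBit = fh & n;
                        Node<K,V> lastRun = f;
                        // ...
                    }
                    // ... afterwards: setTabAt(tab, i, fwd); advance = true;
                }
            }
        }
    }
}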

// Helping with a resize
final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
    Node<K,V>[] nextTab;
    int sc;
    if (tab != null && f instanceof ForwardingNode &&
        (nextTab = ((ForwardingNode<K,V>)f).nextTable) != null) {
        int rs = resizeStamp(tab.length);
        while (nextTab == nextTable && table == tab &&
               (sc = sizeCtl) < 0) {
            if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                sc == rs + MAX_RESIZERS || transferIndex <= 0)
                break;
            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                transfer(tab, nextTab);
                break;
            }
        }
        return nextTab;
    }
    return table;
}
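The split rule used during transfer, testing the bit hash & n, is easy to verify in isolation. In the sketch below, SplitDemo and newIndex are illustrative names: when a table of capacity oldCap doubles, a node at index i either stays at i or moves to i + oldCap, and the outcome depends on exactly one bit of its hash.

```java
final class SplitDemo {
    /** Index a hash occupies after the table doubles from oldCap to 2 * oldCap. */
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // One extra bit decides: stay in the low bin, or move up by oldCap
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        // 5, 21 and 37 all share bin 5 in a 16-slot table
        System.out.println(newIndex(5, 16));    // prints 5
        System.out.println(newIndex(21, 16));   // prints 21
        System.out.println(newIndex(37, 16));   // prints 5
    }
}
```

Because each old bin feeds only two new bins, different old bins can be migrated by different threads without interfering with each other, which is what makes the stride-based cooperation possible.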

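4.2 Reads and Writes During a Resize

In JDK 1.8, ConcurrentHashMap keeps serving reads and writes while a resize is in progress:

Reads: proceed without locking; on reaching a ForwardingNode, the lookup continues in nextTable
Writes: on reaching a ForwardingNode, the thread helps with the transfer and then retries the write on the new table
Removals: handled the same way as writes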
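This behavior can be observed from the outside with the public API. The sketch below (ResizeReadDemo and countMissedReads are hypothetical names) starts from a deliberately tiny map so that parallel inserts trigger many resizes, and checks that each thread can immediately read back the key it just wrote, even if other threads are migrating bins at that moment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

final class ResizeReadDemo {
    /** Inserts keys in parallel into a tiny map; returns how many read-backs failed. */
    static int countMissedReads(int keys) {
        // Small initial capacity so the table has to grow many times
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>(2);
        AtomicInteger misses = new AtomicInteger();
        IntStream.range(0, keys).parallel().forEach(k -> {
            map.put(k, k);
            Integer seen = map.get(k);      // may run while another thread is resizing
            if (seen == null || seen != k) {
                misses.incrementAndGet();
            }
        });
        return misses.get();
    }
}
```

The expected result of countMissedReads is 0 for any input, since a completed put is visible to a later get of the same key regardless of resize activity.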

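5. Concurrency Optimizations Behind size()

5.1 baseCount + CounterCell

A volatile baseCount alone cannot make a read-modify-write such as an increment atomic, and updating a single field with CAS turns it into a contention hotspot. JDK 1.8 therefore adds CounterCell: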

// Element count: base value plus striped cells
private transient volatile long baseCount;
private transient volatile CounterCell[] counterCells;

// CounterCell: one stripe of the distributed count
@sun.misc.Contended
static final class CounterCell {
    volatile long value;
    CounterCell(long x) { value = x; }
}

// Adding to the count
private final void addCount(long x, int check) {
    CounterCell[] as;
    long b, s;

    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        // Cells already exist, or the CAS on baseCount lost a race: try a cell instead
        CounterCell a;
        long v;
        int m;
        boolean uncontended = true;

        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended = U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            // Cell missing or contended: fall back to the slow path
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }

    if (check >= 0) {
        // ... check whether a resize is needed
    }
}

5.2 Implementation of size()

public int size() {
    long n = sumCount();
    return (n < 0) ? 0 : (n > Integer.MAX_VALUE) ? Integer.MAX_VALUE : (int)n;
}

final long sumCount() {
    CounterCell[] as = counterCells;
    long sum = baseCount;
    if (as != null) {
        for (CounterCell a : as) {
            if (a != null)   // cells are created lazily, so slots may be null
                sum += a.value;
        }
    }
    return sum;
}
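The same base-plus-cells technique is available to application code as java.util.concurrent.atomic.LongAdder, from which the CounterCell logic is adapted. The sketch below (StripedCountDemo is an illustrative name) increments a LongAdder from a parallel stream and then sums it, which corresponds to what sumCount() does with baseCount and the cells.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

final class StripedCountDemo {
    static long countInParallel(int increments) {
        LongAdder adder = new LongAdder();   // base value plus lazily created cells
        IntStream.range(0, increments).parallel().forEach(i -> adder.increment());
        return adder.sum();                  // base + every cell, like sumCount()
    }
}
```

The sum is exact here because it is taken after all increments have finished. While updates are still running, sum() is not an atomic snapshot, and for the same reason size() on a map under concurrent modification should be read as an estimate.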

6. Use Cases and Best Practices

6.1 Typical Use Cases

// 1. Caching
ConcurrentHashMap<String, User> userCache = new ConcurrentHashMap<>();

// 2. Counters (computeIfAbsent creates the counter atomically)
ConcurrentHashMap<String, AtomicLong> adCounter = new ConcurrentHashMap<>();
adCounter.computeIfAbsent("click", k -> new AtomicLong()).incrementAndGet();

// 3. Bulk operations (each entry is added atomically, the batch as a whole is not)
concurrentMap.putAll(anotherMap);

// 4. Atomic conditional replace (succeeds only if the key currently maps to oldValue)
concurrentMap.replace("key", oldValue, newValue);
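For counters, merge is an alternative that avoids the extra AtomicLong object. The sketch below is a self-contained illustration (MergeCounterDemo, countEvents and the event names are made up for this example): many threads count events by key, and each merge call is an atomic read-modify-write for that key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

final class MergeCounterDemo {
    static ConcurrentHashMap<String, Long> countEvents(int events) {
        ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();
        IntStream.range(0, events).parallel().forEach(i -> {
            String key = (i % 2 == 0) ? "click" : "view";
            counts.merge(key, 1L, Long::sum);   // atomic per key: insert 1 or add 1
        });
        return counts;
    }
}
```

Calling countEvents(10_000) yields 5000 for "click" and 5000 for "view", with no lost updates.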

6.2 Caveats

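// ❌ Avoid: check-then-act sequences are not atomic
if (map.containsKey(key)) {
    map.remove(key);  // another thread may have removed or replaced the entry in between
}

// ✅ Prefer the atomic methods
map.remove(key, expectedValue);  // removes only if the key still maps to expectedValue
map.computeIfAbsent(key, k -> createExpensiveValue());  // atomic lazy initialization
map.putIfAbsent(key, value);  // inserts only if the key is absent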

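// ❌ Do not treat iteration as a snapshot: iterators are weakly consistent. They never throw
// ConcurrentModificationException, but may or may not reflect concurrent updates
for (Map.Entry<String, String> entry : map.entrySet()) {
    // do not assume this loop sees a consistent view of the whole map
}

// ✅ For in-place updates, use the built-in bulk method
map.replaceAll((k, v) -> v.toUpperCase());  // each entry is replaced atomically; the pass as a whole is not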
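The iteration semantics of ConcurrentHashMap are worth checking by hand, because they differ from HashMap. In the sketch below (WeakIterationDemo is an illustrative name), entries are removed from the map while its entry set is being iterated. A HashMap would throw ConcurrentModificationException on the next step of such a loop; a ConcurrentHashMap iterator is weakly consistent and simply carries on.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WeakIterationDemo {
    /** Removes even keys during iteration and returns the remaining size. */
    static int removeEvenKeysWhileIterating() {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            if (e.getKey() % 2 == 0) {
                map.remove(e.getKey());   // safe here; no exception is thrown
            }
        }
        return map.size();
    }
}
```

In this single-threaded run every entry is visited and the five even keys are removed, so the method returns 5. With other threads modifying the map at the same time, the loop would still complete without an exception, but which concurrent changes it observes is not guaranteed.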

6.3 Performance Compared with Other Maps

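// A rough timing test. It is not a rigorous benchmark: there is no warm-up, and
// the unsynchronized HashMap run may lose updates or fail outright under contention.
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MapPerformanceTest {
    public static void main(String[] args) throws Exception {
        int size = 1_000_000;
        int threads = 16;

        // HashMap: not thread-safe
        testMap(new HashMap<>(), size, threads, "HashMap");

        // Hashtable: one global lock
        testMap(new Hashtable<>(), size, threads, "Hashtable");

        // ConcurrentHashMap: CAS + per-bin synchronized (segment locks in JDK 1.7)
        testMap(new ConcurrentHashMap<>(), size, threads, "ConcurrentHashMap");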
    }

    private static void testMap(Map<String, Integer> map, 
                                 int size, int threads, String name) 
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        long start = System.currentTimeMillis();

        for (int i = 0; i < threads; i++) {
            executor.submit(() -> {
                for (int j = 0; j < size / threads; j++) {
                    map.put("key" + j, j);
                }
            });
        }

        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);

        System.out.println(name + ": " + (System.currentTimeMillis() - start) + "ms");
    }
}

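Summary

ConcurrentHashMap is a workhorse of concurrent Java programming. The essence of its design:

JDK 1.7 segment locking: locks are refined to the Segment level, allowing more concurrent writers than a single global lock
JDK 1.8 CAS + synchronized: lock-free fast paths combined with per-bin locking
Concurrent resizing: multiple threads cooperate on a resize, and reads and writes continue almost unaffected
CounterCell counting: spreads the hot counter across cells so that counting and size() scale under contention

In practice, whenever a HashMap would be shared and modified by multiple threads, use ConcurrentHashMap instead to avoid data races that are hard to diagnose in production.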