JDK7 ConcurrentHashMap principle

ConcurrentHashMap in JDK 7 (Segment-Based)

For the Java 8+ version (Node[] + per-bin locking), see concurrenthashmap-internals.md.

Architecture

ConcurrentHashMap
  └── Segment[16]                          (default concurrencyLevel = 16)
        ├── Segment[0] extends ReentrantLock
        │     └── volatile HashEntry[] table
        │           ├── [0] → Entry → Entry → null    (linked list only)
        │           ├── [1] → null
        │           └── [n] → Entry → null
        ├── Segment[1] extends ReentrantLock
        │     └── volatile HashEntry[] table
        │           └── ...
        └── Segment[15]
              └── volatile HashEntry[] table
                    └── ...

Two-level hashing: upper bits select the Segment, lower bits select the bin within that Segment.

Key Classes

Segment

static final class Segment<K,V> extends ReentrantLock {
    int count;                             // number of entries in this segment (plain int in JDK 7; volatile in JDK 6)
    int modCount;                          // structural modification count
    int threshold;                         // resize threshold for this segment
    volatile HashEntry<K,V>[] table;       // the hash table for this segment
    final float loadFactor;
}

Each Segment is an independent mini-HashMap with its own lock, count, and resize threshold.

HashEntry

static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;              // volatile for lock-free reads
    volatile HashEntry<K,V> next;  // volatile for lock-free reads (JDK 7; final in JDK 5/6)
}

key and hash are final --- once created, a HashEntry's identity never changes. Only value and next can be updated.

get() --- Lock-Free

get(key)
     |
     |--- hash = hash(key.hashCode())
     |--- segmentIndex = (hash >>> segmentShift) & segmentMask
     |--- segment = segments[segmentIndex]          // volatile read
     |--- tab = segment.table                       // volatile read
     |--- e = tab[(tab.length - 1) & hash]          // volatile read
     |
     |--- while (e != null) {                       // linked list traversal
     |        if (e.hash == hash && key.equals(e.key))
     |            return e.value;                    // volatile read
     |        e = e.next;                            // volatile read
     |    }
     |--- return null

No lock acquired. Relies on volatile reads of the segment reference, the table, the bin head, value, and next.

put() --- Segment-Level Lock

put(key, value)
     |
     |--- hash = hash(key.hashCode())
     |--- segment = segmentFor(hash)
     |
     |--- segment.lock()                    // ReentrantLock on this segment only
     |    |
     |    |--- tab = segment.table
     |    |--- index = (tab.length - 1) & hash
     |    |--- first = tab[index]
     |    |
     |    |--- traverse linked list:
     |    |    for (e = first; e != null; e = e.next)
     |    |        if (e.hash == hash && key.equals(e.key))
     |    |            oldValue = e.value
     |    |            e.value = value      // update existing
     |    |            return oldValue
     |    |
     |    |--- [key not found → insert]
     |    |    tab[index] = new HashEntry(hash, key, value, first)
     |    |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |    |    HEAD PREPEND (new node.next = old first)
     |    |
     |    |--- count++
     |    |--- [count > threshold?] → rehash() (per-segment resize)
     |    |
     |--- segment.unlock()
     |
     |--- return null (new key)

Key difference from Java 8+: new nodes are HEAD-PREPENDED (not tail-appended). This is safe because get() reads tab[index] via volatile --- it sees either the old head or the new head, both consistent.
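The get() and put() flows above can be condensed into a small runnable sketch. This is not the JDK source: MiniSegment, Entry, and headKey are hypothetical names, and an AtomicReferenceArray stands in for the Unsafe-based volatile array access the real class uses. Resizing is left out.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative stand-in for one JDK 7 Segment (no resize). */
final class MiniSegment<K, V> extends ReentrantLock {
    static final class Entry<K, V> {
        final int hash;
        final K key;
        volatile V value;
        volatile Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Assumption: AtomicReferenceArray gives volatile semantics per bin.
    private final AtomicReferenceArray<Entry<K, V>> table;
    private int count; // only touched while holding the lock

    MiniSegment(int capacity) { // capacity must be a power of two
        table = new AtomicReferenceArray<>(capacity);
    }

    /** Lock-free read: volatile bin head, then volatile next/value. */
    V get(K key, int hash) {
        for (Entry<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                return e.value;
            }
        }
        return null;
    }

    /** Locked write: update in place, or prepend a new head node. */
    V put(K key, int hash, V value) {
        lock();
        try {
            int index = (table.length() - 1) & hash;
            Entry<K, V> first = table.get(index);
            for (Entry<K, V> e = first; e != null; e = e.next) {
                if (e.hash == hash && key.equals(e.key)) {
                    V old = e.value;
                    e.value = value;
                    return old;
                }
            }
            // Head prepend: the new node points at the old head, then is published.
            table.set(index, new Entry<>(hash, key, value, first));
            count++;
            return null;
        } finally {
            unlock();
        }
    }

    /** Key at the head of the bin for this hash, or null if the bin is empty. */
    K headKey(int hash) {
        Entry<K, V> e = table.get((table.length() - 1) & hash);
        return e == null ? null : e.key;
    }

    int entryCount() {
        lock();
        try {
            return count;
        } finally {
            unlock();
        }
    }
}
```

Because the new node is fully built before it is stored in the bin, a concurrent get() sees either the old chain or the new one, never a half-linked state.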

Resize --- Per-Segment

Each Segment resizes independently. Only the Segment being resized is locked.

rehash() --- called inside segment.lock()
     |
     |--- oldTable = segment.table
     |--- newTable = new HashEntry[oldTable.length * 2]
     |
     |--- for each bin in oldTable:
     |    |--- e = oldTable[i]
     |    |--- if (e == null) → skip
     |    |--- if (e.next == null) → newTable[e.hash & newMask] = e  (reuse single node)
     |    |--- else:
     |    |    |--- find lastRun (tail of same-destination nodes) → REUSE
     |    |    |--- create NEW nodes for everything before lastRun
     |    |    |    (same optimization as Java 8+)
     |
     |--- segment.table = newTable          // volatile write

Other Segments are completely unaffected --- no cooperative transfer, no ForwardingNode.

Segment[0]: resizing (locked)     → only threads hitting Segment[0] are blocked
Segment[1]: normal operation      → fully concurrent
Segment[2]: normal operation      → fully concurrent
...
Segment[15]: normal operation     → fully concurrent
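The lastRun step in rehash() is easier to follow in code. The sketch below is a simplified illustration, not the JDK source: LastRunRehash and Node are hypothetical names, keys are plain strings, and locking plus the volatile table write are omitted.

```java
/** Illustrative per-segment rehash with the lastRun reuse trick. */
final class LastRunRehash {
    static final class Node {
        final int hash;
        final String key;
        final Node next; // never mutated here, so old chains stay readable

        Node(int hash, String key, Node next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    /** Returns a table of twice the size; the old table is left untouched. */
    static Node[] rehash(Node[] oldTable) {
        int newMask = (oldTable.length << 1) - 1;
        Node[] newTable = new Node[oldTable.length << 1];
        for (Node e : oldTable) {
            if (e == null) {
                continue;
            }
            int idx = e.hash & newMask;
            if (e.next == null) {
                newTable[idx] = e; // single node: reuse as is
                continue;
            }
            // Find the trailing run of nodes that all land in the same new bin.
            Node lastRun = e;
            int lastIdx = idx;
            for (Node p = e.next; p != null; p = p.next) {
                int k = p.hash & newMask;
                if (k != lastIdx) {
                    lastIdx = k;
                    lastRun = p;
                }
            }
            newTable[lastIdx] = lastRun; // reuse the whole tail
            // Clone every node before lastRun into its new bin (head prepend).
            for (Node p = e; p != lastRun; p = p.next) {
                int k = p.hash & newMask;
                newTable[k] = new Node(p.hash, p.key, newTable[k]);
            }
        }
        return newTable;
    }
}
```

Readers still walking the old table keep seeing intact chains, because no existing node is modified; only the nodes ahead of lastRun are copied.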

remove() --- Segment-Level Lock

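remove(key)
     |
     |--- segment.lock()
     |    |
     |    |--- walk the bin, tracking pred (previous node) and e (current node)
     |    |--- [found?]
     |    |    |--- JDK 7: unlink in place
     |    |    |    pred == null → tab[index] = e.next
     |    |    |    pred != null → pred.next = e.next
     |    |    |--- JDK 5/6 (next was final): rebuild chain WITHOUT the removed node
     |    |    |    (creates new nodes from head to removed node)
     |    |    |    (reuses nodes after removed node)
     |    |    |    tab[index] = new chain head
     |    |    |--- modCount++, count--
     |    |
     |--- segment.unlock()

The chain rebuild belongs to JDK 5/6, where HashEntry.next was final and an existing node could not be relinked. JDK 7 made next volatile, so remove() unlinks the node in place under the segment lock. A concurrent reader already positioned on the removed node still follows its next pointer to the rest of the chain.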
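The copy-prefix, reuse-suffix rebuild mentioned above is the technique a chain needs when its next pointers are final. A minimal sketch, with hypothetical ChainRebuild and Node names standing in for the real classes and with locking omitted:

```java
/** Illustrative removal from a chain whose next pointers are immutable. */
final class ChainRebuild {
    static final class Node {
        final String key;
        final int value;
        final Node next; // final: existing nodes can never be relinked

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** Returns the head of a chain without the given key; the input chain is not modified. */
    static Node remove(Node first, String key) {
        Node target = first;
        while (target != null && !target.key.equals(key)) {
            target = target.next;
        }
        if (target == null) {
            return first; // key not present: keep the chain as is
        }
        Node newFirst = target.next; // suffix after the removed node is reused
        for (Node p = first; p != target; p = p.next) {
            // Each prefix node is cloned and prepended, so the prefix ends up reversed.
            newFirst = new Node(p.key, p.value, newFirst);
        }
        return newFirst;
    }
}
```

Removing C from A → B → C → D yields B' → A' → D: two fresh prefix nodes in reversed order, followed by the original D.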

size() --- Optimistic Passes, Then Lock All

size()
     |
     |--- Unlocked passes (optimistic, up to 3 with RETRIES_BEFORE_LOCK = 2)
     |    |--- sum all segment.count values
     |    |--- sum all segment.modCount values
     |    |--- if modCount sum equals the previous pass's sum → return count sum
     |    |    (no modification was recorded between the two passes)
     |
     |--- Locked pass (pessimistic fallback)
     |    |--- lock every segment
     |    |--- sum all counts
     |    |--- unlock every segment
     |    |--- return exact count

The optimistic approach avoids locking in the common case (no concurrent modifications between two consecutive passes).

Concurrency Level

new ConcurrentHashMap<>(initialCapacity, loadFactor, concurrencyLevel);
//                                                    ^^^^^^^^^^^^^^^^
//                                                    default = 16

concurrencyLevel determines the number of Segments. It's rounded up to the nearest power of 2. Maximum theoretical concurrency = number of Segments (16 by default).

concurrencyLevel = 16 → 16 Segments → max 16 concurrent writers
concurrencyLevel = 1  → 1 Segment  → all writers serialized, like a synchronized HashMap (reads stay lock-free)
concurrencyLevel = 64 → 64 Segments → max 64 concurrent writers (more memory)

Once created, the number of Segments is FIXED --- it never changes. Only the HashEntry[] table within each Segment grows.
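The rounding of concurrencyLevel, and the segmentShift and segmentMask values derived from it, can be sketched as below. The loop follows the shape of the JDK 7 constructor as I understand it; SegmentSizing and forConcurrencyLevel are hypothetical names, and the MAX_SEGMENTS cap of 1 << 16 should be treated as an assumption.

```java
/** Illustrative derivation of segment count, shift, and mask. */
final class SegmentSizing {
    static final int MAX_SEGMENTS = 1 << 16; // assumed upper bound

    /** Returns {segmentCount, segmentShift, segmentMask}. */
    static int[] forConcurrencyLevel(int concurrencyLevel) {
        if (concurrencyLevel <= 0) {
            throw new IllegalArgumentException("concurrencyLevel must be positive");
        }
        if (concurrencyLevel > MAX_SEGMENTS) {
            concurrencyLevel = MAX_SEGMENTS;
        }
        int sshift = 0; // number of hash bits used for the segment index
        int ssize = 1;  // segment count: smallest power of two >= concurrencyLevel
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        return new int[] { ssize, 32 - sshift, ssize - 1 };
    }
}
```

For the default of 16 this gives 16 segments, segmentShift = 28, and segmentMask = 15; a request for 17 is rounded up to 32 segments.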

Head Prepend vs Tail Append

JDK 7 prepends new nodes at the head. JDK 8+ appends at the tail.

JDK 7 put("D"):
  Before: tab[i] → A → B → C
  After:  tab[i] → D → A → B → C    (head prepend)

JDK 8+ put("D"):
  Before: tab[i] → A → B → C
  After:  tab[i] → A → B → C → D    (tail append)

Why JDK 7 prepends:

  • In JDK 5/6, HashEntry.next was final --- you couldn't append by mutating tail.next
  • The only option: create a new head node with next = oldFirst, then swap tab[index]
  • This guaranteed readers always saw a complete, immutable chain --- no partial state
  • JDK 7 made next volatile but kept the head prepend
  • The prepend itself is O(1), but the full put() is still O(n) because it traverses the bin to check for a duplicate key first

Why JDK 8+ changed to append:

  • Must traverse anyway to check for duplicate keys
  • Already at the tail when no match found --- append is free
  • Keeps each bin in insertion order (a side effect of the traversal, not a guarantee the API exposes)

Limitations (Why JDK 8 Replaced This Design)

| Limitation | Impact |
|---|---|
| Fixed segment count | Can't adapt to workload; too few = contention, too many = memory waste |
| Linked list only | O(n) worst case per bin (hash collision attacks) |
| Per-segment resize | Each segment resizes alone; no cooperative multi-thread transfer |
| No computeIfAbsent | Must use error-prone putIfAbsent + get pattern |
| size() may lock all | Pessimistic fallback locks entire map |
| Memory overhead | Segment objects + ReentrantLock per segment |
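For the computeIfAbsent row, the usual workaround on JDK 7 looks roughly like this. It is a sketch of a common idiom rather than anything prescribed by the JDK: PutIfAbsentIdiom and getOrInsert are hypothetical names, while get and putIfAbsent are the standard ConcurrentMap methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative get-then-putIfAbsent idiom for maps without computeIfAbsent. */
final class PutIfAbsentIdiom {
    static <K, V> V getOrInsert(ConcurrentMap<K, V> map, K key, V candidate) {
        V existing = map.get(key); // lock-free fast path
        if (existing != null) {
            return existing;
        }
        // Another thread may have inserted between get() and here,
        // so the return value of putIfAbsent decides who won.
        V raced = map.putIfAbsent(key, candidate);
        return raced != null ? raced : candidate;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        System.out.println(getOrInsert(map, "a", 1)); // prints 1
        System.out.println(getOrInsert(map, "a", 2)); // prints 1
    }
}
```

The error-prone part is that the candidate value is built before the call even when it ends up discarded, and forgetting to check the return value of putIfAbsent leaves the caller holding an object that is not in the map.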

Connection to Other Docs

  • concurrenthashmap-internals.md --- Java 8+ version (Node[], per-bin CAS + synchronized, TreeBin)
  • concurrenthashmap-reference.md --- Java 7 vs 8+ comparison table
  • java-thread-internals.md --- ReentrantLock details

size() Deep Dive

modCount --- Structural Modification Counter

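Each Segment has a modCount that size() and isEmpty() use to detect concurrent changes. In JDK 6 it counted only structural changes (entries added or removed); in JDK 7 it counts every mutating operation, including replacing the value of an existing key:

static final class Segment<K,V> extends ReentrantLock {
    int modCount;          // NOT volatile --- written under lock, read without lock
    int count;             // number of entries; plain int in JDK 7 (volatile in JDK 6)
}

| Operation | modCount incremented? | Why |
|---|---|---|
| put(newKey, v) | Yes | New entry added |
| put(existingKey, v2) | Yes in JDK 7, no in JDK 6 | JDK 7 counts the value replacement as a mutation; JDK 6 counted structural changes only |
| remove(key) | Yes | Entry removed |
| rehash() | No | Same entries, different layout; the put() that triggered it does the increment |

Neither field is volatile in JDK 7. Writers update both while holding the segment lock, and the unlocked passes in size() rely on the surrounding volatile reads (such as the volatile read of the segment reference) for visibility. In JDK 6, count was volatile and modCount piggybacked on it: the writer bumped modCount before the volatile write of count, so a reader that read count first was guaranteed to see a modCount at least that recent:

JDK 6 writer: modCount++ → count = newCount (volatile write, publishes modCount too)
JDK 6 reader: count = seg.count (volatile read) → modCount = seg.modCount (piggybacks, sees latest)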

size() Algorithm --- Unlocked Retries, Then Lock

// Simplified from the JDK 7 logic: ignores lazily created (null) segments
static final int RETRIES_BEFORE_LOCK = 2;

public int size() {
    long last = 0L;              // modCount sum from the previous pass
    boolean locked = false;
    try {
        for (int retries = -1; ; retries++) {
            // Fallback: after 3 unlocked passes, lock every segment
            if (retries == RETRIES_BEFORE_LOCK) {
                for (Segment<K,V> seg : segments) seg.lock();
                locked = true;
            }

            long sum = 0L, modSum = 0L;
            for (Segment<K,V> seg : segments) {
                sum += seg.count;
                modSum += seg.modCount;
            }

            // Stable? modCount sum unchanged since the previous pass
            if (modSum == last)
                return (int) Math.min(sum, Integer.MAX_VALUE);
            last = modSum;
        }
    } finally {
        // Release even if the first locked pass had to be repeated
        if (locked)
            for (Segment<K,V> seg : segments) seg.unlock();
    }
}

Execution Flow

Example with 4 segments:

retries=-1 (Pass 1, no locks):
  sum = 5+3+7+2 = 17
  modSum = 10+4+8+1 = 23
  last(0) != modSum(23) → retry
  last = 23

retries=0 (Pass 2, no locks):
  sum = 5+3+7+2 = 17
  modSum = 10+4+8+1 = 23
  last(23) == modSum(23) → STABLE → return 17 ✓

--- OR if put() happened between Pass 1 and Pass 2 ---

retries=0 (Pass 2, no locks):
  sum = 5+4+7+2 = 18          ← Segment[1] got a new entry
  modSum = 10+5+8+1 = 24      ← Segment[1].modCount incremented
  last(23) != modSum(24) → retry
  last = 24

retries=1 (Pass 3, no locks):
  sum = 5+4+7+2 = 18
  modSum = 10+5+8+1 = 24
  last(24) == modSum(24) → STABLE → return 18 ✓

--- OR if modSum changed again before Pass 3 ---

retries=2 (Pass 4, ALL SEGMENTS LOCKED):
  lock every segment           ← no one can modify anything
  recompute sum and modSum
  modSum == last → return sum ✓
  modSum != last → last = modSum; one more locked pass must match → return sum ✓
  unlock all

Why This Works

  • If modSum is identical across two consecutive passes, no modification was recorded in between (put, remove, and in JDK 7 also value replacement)
  • The count values read in the later pass are therefore consistent with each other
  • Avoids locking in the common case (low-contention reads)
  • Falls back to locking all segments only after the initial pass and 2 retries have all failed (RETRIES_BEFORE_LOCK = 2)
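To try the numbers from the execution flow, the retry loop can be wrapped in a small self-contained harness. SizeSketch and Seg are hypothetical stand-ins for the map and its segments; the plain int fields are fine for this single-threaded demonstration but say nothing about cross-thread visibility.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative harness for the optimistic-then-locked size() loop. */
final class SizeSketch {
    static final int RETRIES_BEFORE_LOCK = 2;

    /** Stand-in for a Segment: a lock plus the two counters size() reads. */
    static final class Seg extends ReentrantLock {
        int count;
        int modCount;

        Seg(int count, int modCount) {
            this.count = count;
            this.modCount = modCount;
        }
    }

    static int size(Seg[] segments) {
        long last = 0L; // modCount sum seen in the previous pass
        boolean locked = false;
        try {
            for (int retries = -1; ; retries++) {
                if (retries == RETRIES_BEFORE_LOCK) {
                    for (Seg s : segments) {
                        s.lock();
                    }
                    locked = true;
                }
                long sum = 0L;
                long modSum = 0L;
                for (Seg s : segments) {
                    modSum += s.modCount;
                    sum += s.count;
                }
                if (modSum == last) {
                    return (int) Math.min(sum, Integer.MAX_VALUE);
                }
                last = modSum;
            }
        } finally {
            if (locked) {
                for (Seg s : segments) {
                    s.unlock();
                }
            }
        }
    }

    public static void main(String[] args) {
        Seg[] segs = { new Seg(5, 10), new Seg(3, 4), new Seg(7, 8), new Seg(2, 1) };
        System.out.println(size(segs)); // prints 17
    }
}
```

With no concurrent writers the second pass matches the first (modSum 23 both times), so the method returns 17 without taking any lock.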

Segment Selection --- Upper Bits

Segment index uses the UPPER bits of the hash, bin index uses the LOWER bits:

int segmentShift = 28;    // 32 - 4 (for 16 segments)
int segmentMask  = 15;    // 16 - 1

int segmentIndex = (hash >>> segmentShift) & segmentMask;  // upper 4 bits
int binIndex     = (tab.length - 1) & hash;                // lower bits

hash (32 bits):
  SSSS_XXXX_XXXX_XXXX_XXXX_XXXX_XXXX_BBBB
  ^^^^                                ^^^^
  segment (fixed, upper 4 bits)       bin (variable, depends on tab.length)

Upper and lower bits are independent --- resizing a segment's table (changing bin count) never affects which segment a key belongs to. The bin mask is computed at read time from segment.table.length, not stored globally.

Segment[3] resizes 8 → 16 bins:
  Segment index: unchanged (upper bits)
  Bin index: recomputed with new mask (tab.length-1)
  
  Before: hash & 0x7 = 5  → tab[5]
  After:  hash & 0xF = 13 → tab[13]
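The same split can be checked with a few lines of Java. HashSplit is a hypothetical helper, and 0x3000000D is an example hash chosen so that the upper four bits are 3 and the low bits reproduce the 5 → 13 move shown above.

```java
/** Illustrative split of one hash into a segment index and a bin index. */
final class HashSplit {
    static int segmentIndex(int hash, int segmentShift, int segmentMask) {
        return (hash >>> segmentShift) & segmentMask; // upper bits
    }

    static int binIndex(int hash, int tableLength) {
        return (tableLength - 1) & hash; // lower bits, tableLength is a power of two
    }

    public static void main(String[] args) {
        int hash = 0x3000000D;
        System.out.println(segmentIndex(hash, 28, 15)); // prints 3
        System.out.println(binIndex(hash, 8));          // prints 5
        System.out.println(binIndex(hash, 16));         // prints 13
    }
}
```

Doubling the table changes only the bin index; the segment index comes from bits the bin mask does not reach until a table grows very large.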