StampedLock Mechanism

StampedLock (Java 8+)

Designed for extremely read-heavy workloads. Optimistic reads acquire no lock, only a "stamp" (a version number). StampedLock is not built on AQS; it keeps its own 64-bit long state with custom spin/park logic.

Key characteristics:

Not AQS-based --- standalone class with its own volatile long state, WNode wait queue, and spin/park logic. AQS's 32-bit int can't hold the 64-bit stamp+readers+WBIT layout, and AQS has no concept of optimistic reads.
Not reentrant --- no owner tracking, no per-thread hold count. Reentrant writeLock() deadlocks. Reentrant readLock() deadlocks if a writer is waiting.
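No fair mode --- there is no fairness option. The Javadoc states that the scheduling policy does not consistently prefer readers over writers or vice versa, and the try* methods are best-effort and may barge past queued threads. Nothing guarantees FIFO ordering, so a writer can be delayed by a steady stream of readers.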
Not a subclass of Lock --- doesn't implement java.util.concurrent.locks.Lock. Uses stamp-based API (long writeLock() / unlockWrite(stamp)) instead of lock() / unlock().

AQS-based:
├── ReentrantLock
├── ReentrantReadWriteLock
├── Semaphore
├── CountDownLatch
└── CyclicBarrier (uses ReentrantLock)

Not AQS-based:
└── StampedLock (custom 64-bit state)

Document Structure:

Three Modes --- Optimistic read, read lock, write lock
State Layout --- 64-bit state, constants, stamp semantics
Optimistic Read Internals --- tryOptimisticRead, validate, VarHandle.acquireFence()
Write Lock Internals --- writeLock, unlockWrite, stamp versioning
Read Lock Internals --- readLock, unlockRead, overflow handling
Lock Conversion --- tryConvertToWriteLock, tryConvertToReadLock
Memory Ordering --- How stamps provide happens-before guarantees
StampedLock vs ReadWriteLock --- Feature and performance comparison
Pitfalls --- Non-reentrant, validation rules, not AutoCloseable
When to Use StampedLock --- Decision guide

Three Modes

StampedLock sl = new StampedLock();

// 1. Optimistic Read --- NO lock, just a stamp
long stamp = sl.tryOptimisticRead();
int x = this.x;
int y = this.y;
if (sl.validate(stamp)) {
    // NO writer → data is valid (zero lock cost)
} else {
    stamp = sl.readLock();
    try { x = this.x; y = this.y; }
    finally { sl.unlockRead(stamp); }
}

// 2. Read Lock --- shared (like ReadWriteLock)
stamp = sl.readLock();                 // reuses the stamp variable declared above
try { /* read data */ }
finally { sl.unlockRead(stamp); }

// 3. Write Lock --- exclusive
stamp = sl.writeLock();
try { /* modify data */ }
finally { sl.unlockWrite(stamp); }
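
The three modes can be combined in one small class. The following is a minimal sketch modelled on the usual coordinate example; the class name Point and its methods are illustrative assumptions, not part of StampedLock.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical example class: names and fields are illustrative only.
final class Point {
    private final StampedLock sl = new StampedLock();
    private double x, y;

    // Write lock: exclusive access while both fields change together.
    void move(double dx, double dy) {
        long stamp = sl.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    // Optimistic read with a read-lock fallback.
    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead();
        double cx = x, cy = y;            // copy into locals before validating
        if (!sl.validate(stamp)) {        // a writer intervened: re-read under a read lock
            stamp = sl.readLock();
            try {
                cx = x;
                cy = y;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }

    // Plain read lock: writers are excluded for the whole read.
    double sum() {
        long stamp = sl.readLock();
        try {
            return x + y;
        } finally {
            sl.unlockRead(stamp);
        }
    }

    public static void main(String[] args) {
        Point p = new Point();
        p.move(3.0, 4.0);
        System.out.println(p.distanceFromOrigin() + " " + p.sum());
    }
}
```

Note that the locals cx and cy are only used after validate() succeeds or after they have been re-read under the read lock.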

State Layout (64-bit long)

Unlike AQS which uses a 32-bit int state, StampedLock uses its own 64-bit long state:

state (64 bits):
┌──────────────────────────────┬──────┬─────────┐
│ bits 8-63: version stamp     │bit 7 │bits 0-6 │
│ (incremented on write unlock)│ WBIT │ readers │
└──────────────────────────────┴──────┴─────────┘
                                 0 = no writer
                                 1 = write-locked

Constants

private static final int  LG_READERS = 7;
private static final long RUNIT = 1L;                  // one reader unit (value = 1)
private static final long WBIT  = 1L << LG_READERS;    // bit 7: write lock flag (0x80 = 128)
private static final long RBITS = WBIT - 1L;            // bits 0-6: reader count mask (0x7F = 127)
private static final long RFULL = RBITS - 1L;           // max readers in state bits = 126 (0x7E)
private static final long ABITS = RBITS | WBIT;         // all lock bits (0xFF = bits 0-7)
private static final long SBITS = ~RBITS;               // bits 7-63: version plus WBIT (note overlap with ABITS)

private static final long ORIGIN = WBIT << 1;           // initial state = 256 (0x100)
                                                         // bit 8 set --- avoids 0 as valid stamp

Stamp Semantics

A stamp is a snapshot of the state's version bits. It serves two purposes:

Validation token --- optimistic reads compare their stamp against the current state to detect writer interference
Unlock token --- unlockRead(stamp) and unlockWrite(stamp) verify the stamp matches before releasing, catching misuse (wrong stamp = IllegalMonitorStateException)

Stamp value 0 always means failure --- tryOptimisticRead() returns 0 if a writer holds the lock, tryConvertToWriteLock() returns 0 if conversion fails. This is why the initial state is ORIGIN (not 0) --- so the first valid stamp is non-zero.

State Transitions

Initial:        state = ORIGIN (0x100)              stamp=1, readers=0, write=0
Write locked:   state = ORIGIN | WBIT (0x180)       stamp=1, readers=0, write=1
1 reader:       state = ORIGIN + RUNIT (0x101)      stamp=1, readers=1, write=0
3 readers:      state = ORIGIN + 3*RUNIT (0x103)    stamp=1, readers=3, write=0
After 1 write:  state = 0x200                       stamp=2, readers=0, write=0
After 2 writes: state = 0x300                       stamp=3, readers=0, write=0

Lock conversions (single CAS):
  Read→Write (sole reader):  0x101 → 0x180          CAS(s, s - RUNIT + WBIT)
  Write→Read:                0x180 → 0x201           state = (s + WBIT) | RUNIT
                                                     (stamp advances + 1 reader added)
  Read→Write (2 readers):    0x102 → FAILS           s & ABITS ≠ RUNIT → return 0
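
To check the table above by hand, the state word can be decoded with a few bit operations. This is a sketch under the assumption that the constants match the values listed in this article; they are redeclared here because the real ones are private to StampedLock, and the class name StampLayout is hypothetical.

```java
// Hypothetical helper that decodes a StampedLock-style state word.
final class StampLayout {
    static final long RUNIT  = 1L;
    static final long WBIT   = 1L << 7;
    static final long RBITS  = WBIT - 1L;
    static final long ORIGIN = WBIT << 1;

    // Version counter as shown in the table (bits 8 and up).
    static long version(long state) { return state >>> 8; }

    // Reader count held in bits 0-6.
    static long readers(long state) { return state & RBITS; }

    // True when bit 7 (WBIT) is set.
    static boolean writeLocked(long state) { return (state & WBIT) != 0L; }

    static String describe(long state) {
        return String.format("0x%X: stamp=%d, readers=%d, write=%d",
                state, version(state), readers(state), writeLocked(state) ? 1 : 0);
    }

    public static void main(String[] args) {
        long[] states = {
            ORIGIN, ORIGIN | WBIT, ORIGIN + RUNIT, ORIGIN + 3 * RUNIT, 0x200L, 0x300L
        };
        for (long s : states)
            System.out.println(describe(s));
    }
}
```

The two conversion formulas can be verified the same way: 0x101 - RUNIT + WBIT gives 0x180, and (0x180 + WBIT) | RUNIT gives 0x201.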

Reader Overflow

Bits 0-6 hold up to 126 readers (RFULL). Seven bits can represent 0-127, but 127 (RBITS = 0x7F) is reserved as a transient marker, explained below. Once the reader bits reach 126, additional readers are counted in a separate int readerOverflow field:

// In readLock acquisition, when reader bits are full:
if ((s & ABITS) < RFULL) {
    // Normal path: CAS state += RUNIT
} else {
    // Overflow path: state reader bits stay at RFULL;
    // tryIncReaderOverflow(s) bumps the readerOverflow counter instead
    tryIncReaderOverflow(s);
}

On unlock, when the reader bits are at RFULL and readerOverflow > 0, the overflow counter is decremented instead of the state bits. This is rare in practice: more than 126 simultaneous read holds is unusual.

The overflow CAS trick: readerOverflow is a plain int (not volatile). Access is serialized by a CAS-based spinlock on the state bits:

private long tryIncReaderOverflow(long s) {
    if ((s & ABITS) == RFULL) {                    // reader bits == 0x7E (126)
        if (CAS(state, s, s | RBITS)) {            // CAS: 0x7E → 0x7F (set to 127)
            ++readerOverflow;                       // safe: only CAS winner reaches here
            state = s;                              // restore back to 0x7E
            return s;
        }
    }
    return 0L;
}

The value 0x7F (127 = RBITS) acts as a temporary lock --- while reader bits are 0x7F, any other thread's CAS fails (they expect 0x7E but see 0x7F). Once the winner restores to 0x7E, the next thread can succeed.
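
The same idea can be tried outside StampedLock. The sketch below is not the JDK code; it uses an AtomicLong in place of the state field, and the names SentinelCounter, FULL and BUSY are assumptions made for illustration. It shows a plain int being updated safely because only the thread that wins the CAS to the marker value touches it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: class and member names are hypothetical.
final class SentinelCounter {
    private static final long FULL = 0x7EL;   // stands in for RFULL
    private static final long BUSY = 0x7FL;   // stands in for RBITS, the temporary marker

    private final AtomicLong state = new AtomicLong(FULL);
    private int overflow;                      // plain int, touched only while state == BUSY

    // One attempt, like tryIncReaderOverflow: false means "lost the race, retry".
    boolean tryIncrement() {
        if (!state.compareAndSet(FULL, BUSY))
            return false;
        ++overflow;                            // safe: only the CAS winner reaches here
        state.set(FULL);                       // volatile write restores FULL and publishes overflow
        return true;
    }

    // Reads the counter under the same marker so the read is race-free.
    int get() {
        while (!state.compareAndSet(FULL, BUSY))
            Thread.onSpinWait();
        int value = overflow;
        state.set(FULL);
        return value;
    }

    // Runs 'threads' workers that each perform 'perThread' increments.
    static int runDemo(int threads, int perThread) {
        SentinelCounter counter = new SentinelCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++)
                    while (!counter.tryIncrement())
                        Thread.onSpinWait();
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers)
                t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));   // prints 40000
    }
}
```

No increment is lost even though overflow is neither volatile nor atomic, because the CAS to BUSY admits one thread at a time and the volatile write back to FULL publishes the update.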

Stamp Only Changes on Write Unlock

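Read operations (readLock, unlockRead) only modify bits 0-6 (reader count). The version bits (8-63) are unchanged; only unlockWrite() advances them, via the stamp += WBIT carry. A writer also sets WBIT (bit 7) while it holds the lock, and validate() compares that bit too, so an optimistic stamp stops validating as soon as a writer acquires the lock.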

This is why optimistic reads work alongside concurrent readers --- validate(stamp) checks state & SBITS, which readers don't touch:

Thread-0 (optimistic):              Thread-1 (reader):
  stamp = tryOptimisticRead()
    → stamp = state & SBITS = 0x100
  x = this.x
                                      readLock()   → state += RUNIT (bits 0-6 change)
                                      unlockRead() → state -= RUNIT (bits 0-6 change)
  validate(0x100)
    → state & SBITS still = 0x100
    → VALID ✓  (reader didn't change stamp)
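
This behaviour can be observed with the real class on a single thread. The sketch below (class name StampValidityDemo is an assumption) takes one optimistic stamp and validates it after a read lock cycle, while a write lock is held, and after the write lock is released.

```java
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class: checks when an optimistic stamp stays valid.
final class StampValidityDemo {
    // Returns { validAfterReader, validWhileWriteLocked, validAfterWrite }.
    static boolean[] run() {
        StampedLock sl = new StampedLock();
        long optimistic = sl.tryOptimisticRead();

        // A full read-lock cycle leaves the optimistic stamp valid.
        long r = sl.readLock();
        sl.unlockRead(r);
        boolean afterReader = sl.validate(optimistic);

        // Acquiring the write lock invalidates it immediately...
        long w = sl.writeLock();
        boolean whileWriting = sl.validate(optimistic);
        sl.unlockWrite(w);

        // ...and it stays invalid after the writer releases.
        boolean afterWrite = sl.validate(optimistic);
        return new boolean[] { afterReader, whileWriting, afterWrite };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // prints [true, false, false]
    }
}
```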

Optimistic Read Internals

The key insight: optimistic read does NO atomic write operations. It's just two volatile reads:

public long tryOptimisticRead() {
    long s = state;                              // volatile read
    return (s & WBIT) == 0L ? (s & SBITS) : 0L; // return stamp if not write-locked, else 0
}

public boolean validate(long stamp) {
    VarHandle.acquireFence();                    // memory barrier (see below)
    return (stamp & SBITS) == (state & SBITS);   // volatile read --- version and WBIT unchanged?
}

VarHandle.acquireFence() --- Why It's Needed

Without the fence, the CPU or compiler could reorder the data reads after the validate() check:

// Without fence --- BROKEN (possible reordering)
stamp = tryOptimisticRead();   // volatile read of state
x = this.x;                    // ← CPU might move this AFTER validate
y = this.y;                    // ← CPU might move this AFTER validate
validate(stamp);               // volatile read of state
// If x,y reads moved after validate, they could see a writer's partial update
// even though validate returned true

VarHandle.acquireFence() prevents this: loads issued before the fence cannot be reordered with loads or stores issued after it. This guarantees that this.x and this.y were read before the state check in validate().

Note: This is an acquire fence, not a full fence. It orders earlier loads before later loads and stores (LoadLoad + LoadStore), which is sufficient here. On x86 (TSO memory model), the hardware already preserves load-load order, so the fence only constrains the compiler. On ARM/AArch64 (weaker memory model), it compiles to an actual barrier instruction.

Timeline --- Success Case (No Writer)

Thread-0 (optimistic read):          Thread-1 (idle):
  stamp = tryOptimisticRead()
    → state = 0x102, stamp = 0x100
  x = this.x;  → 5
  y = this.y;  → 10
  validate(0x100)
    → acquireFence()
    → state & SBITS = 0x100
    → 0x100 == 0x100 → VALID ✓
  use(x, y);  // safe --- no writer intervened

Important: validate() == true guarantees the data was a consistent snapshot at read time: no writer acquired the write lock between tryOptimisticRead() and validate(). It does NOT guarantee the data is still current after validate() returns. A writer could start writing the instant after. This is fine because you already copied the values into local variables before validate(). If you need the data to remain current (not just consistent), use readLock() instead.

Timeline --- Failure Case (Writer Intervened)

Thread-0 (optimistic read):          Thread-1 (writer):
  stamp = tryOptimisticRead()
    → state = 0x100, stamp = 0x100
  x = this.x;  → 5
  y = this.y;  → 10                  writeLock() → state += WBIT → 0x180
                                      this.x = 99; this.y = 200;
                                      unlockWrite() → state = 0x200 (stamp incremented)
  validate(0x100)
    → acquireFence()
    → state & SBITS = 0x200
    → 0x200 != 0x100 → INVALID ✗
  // Fallback to read lock
  stamp = sl.readLock();
  try { x = this.x; y = this.y; }    // re-read under lock --- guaranteed fresh
  finally { sl.unlockRead(stamp); }
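
The "try optimistic, fall back to a read lock" sequence in this timeline is easy to factor into a helper. This is a sketch, not a JDK facility: the class OptimisticReads and its read method are hypothetical, and the supplied reader is assumed to do nothing but copy fields into a result, since its first invocation may observe inconsistent values.

```java
import java.util.concurrent.locks.StampedLock;
import java.util.function.Supplier;

// Hypothetical utility: wraps the optimistic-read-with-fallback pattern.
final class OptimisticReads {
    private OptimisticReads() {}

    // 'reader' must only copy state into a result and must not act on it.
    static <T> T read(StampedLock sl, Supplier<T> reader) {
        long stamp = sl.tryOptimisticRead();
        if (stamp != 0L) {                 // 0 means a writer currently holds the lock
            T snapshot = reader.get();
            if (sl.validate(stamp))
                return snapshot;           // no writer intervened: snapshot is consistent
        }
        stamp = sl.readLock();             // fallback: re-read under a real read lock
        try {
            return reader.get();
        } finally {
            sl.unlockRead(stamp);
        }
    }

    public static void main(String[] args) {
        StampedLock sl = new StampedLock();
        int[] data = { 5, 10 };
        System.out.println(read(sl, () -> data[0] + data[1]));   // prints 15
    }
}
```

The helper allocates a lambda and possibly a boxed result, so it trades a little of the raw speed for not having to repeat the pattern at every call site.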

Cost Comparison

Operation	Mechanism	Approx. cost (uncontended, hardware-dependent)
`ReentrantReadWriteLock.readLock().lock()`	CAS(state, c, c + SHARED_UNIT)	~20-40 ns
`StampedLock.readLock()`	CAS(state, s, s + RUNIT)	~15-30 ns
`StampedLock.tryOptimisticRead()`	volatile read (no CAS)	~2-4 ns
`StampedLock.validate()`	acquire fence + volatile read	~1-2 ns

These figures are rough orders of magnitude, not measurements to rely on; benchmark on the target hardware. In the common case (no concurrent writer), an optimistic read avoids the CAS entirely, which makes it roughly 10x cheaper than a CAS-based read lock. And because it doesn't write to the state field, it doesn't cause cache line bouncing across cores (see read-write-lock.md --- Cache Line Bouncing).

Write Lock Internals

writeLock()

public long writeLock() {
    long s, next;
    // Fast path: no locks held at all → CAS to set WBIT
    if (((s = state) & ABITS) == 0L &&
        CAS(state, s, next = s + WBIT))
        return next;                    // stamp = state with WBIT set
    // Slow path: spin, then park
    return acquireWrite(false, 0L);
}

The fast path succeeds only when state & ABITS == 0 --- no readers and no writer. A single CAS sets bit 7 (WBIT). The returned stamp includes the WBIT, which unlockWrite will check.

The slow path (acquireWrite) spins for a while (adaptive spin count), then parks the thread. Unlike AQS, StampedLock uses its own wait queue (a linked list of WNode objects, not AQS Node).

unlockWrite(stamp)

public void unlockWrite(long stamp) {
    if (state != stamp || (stamp & WBIT) == 0L)
        throw new IllegalMonitorStateException();
    // Increment version: clear WBIT, advance stamp bits
    // state = (stamp += WBIT) == 0L ? ORIGIN : stamp
    long next = (stamp += WBIT) == 0L ? ORIGIN : stamp;
    state = next;                       // volatile write --- no CAS needed (exclusive)
    // Wake next waiter
    releaseWaiter(next);
}

Key detail: stamp += WBIT adds 0x80 to a state where bit 7 (WBIT) is set. The carry propagates into bit 8 (stamp region), simultaneously clearing the WBIT and incrementing the version. If the addition overflows to 0, it resets to ORIGIN to avoid 0 as a valid stamp.

Before unlockWrite: state = 0x180  (stamp=0x100, WBIT=0x80, readers=0)
stamp += WBIT:      0x180 + 0x80 = 0x200
                    → bit 7 (WBIT) cleared by carry into bit 8
                    → stamp bits advanced from 0x100 to 0x200
After:              state = 0x200  (new stamp, write=0, readers=0)

No CAS needed --- the current thread holds the exclusive write lock.
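
The carry arithmetic, including the wrap-around case, can be reproduced with plain long math. This is a worked sketch; the class UnlockWriteArithmetic is hypothetical and redeclares WBIT and ORIGIN with the values given earlier.

```java
// Hypothetical helper reproducing the unlockWrite state computation.
final class UnlockWriteArithmetic {
    static final long WBIT   = 1L << 7;
    static final long ORIGIN = WBIT << 1;

    // Adds WBIT to a write-locked state; falls back to ORIGIN if the
    // 64-bit addition wraps around to zero.
    static long nextState(long writeLockedState) {
        long next = writeLockedState + WBIT;
        return next == 0L ? ORIGIN : next;
    }

    public static void main(String[] args) {
        // Normal case: 0x180 + 0x80 = 0x200 (WBIT cleared, version advanced).
        System.out.println(Long.toHexString(nextState(0x180L)));
        // Wrap-around: all version bits set plus WBIT; adding WBIT overflows to 0.
        System.out.println(Long.toHexString(nextState(0xFFFFFFFFFFFFFF80L)));
    }
}
```

The second call shows why the reset exists: without it the state would become 0, and 0 is reserved to mean "failure" for stamps.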

Read Lock Internals

readLock()

public long readLock() {
    long s, next;
    // Fast path: wait queue empty, no writer, reader count not full → CAS to add RUNIT
    if (whead == wtail &&
        ((s = state) & ABITS) < RFULL &&
        CAS(state, s, next = s + RUNIT))
        return next;
    // Slow path: spin, then park
    return acquireRead(false, 0L);
}

The fast path CAS-es state += RUNIT (adds 1, incrementing the reader count in bits 0-6). It is skipped when the wait queue is non-empty (whead != wtail), and it fails if a writer holds the lock (WBIT set) or the reader count is at RFULL (126).

The check (s & ABITS) < RFULL rejects both cases in one comparison: since WBIT = 0x80 = 128 > RFULL = 126, any write-locked state automatically exceeds the threshold. No separate WBIT check needed.

The returned stamp includes the reader bits --- unlockRead will verify it.

unlockRead(stamp)

public void unlockRead(long stamp) {
    long s, m;
    // Loop while the stamp's version matches the state, the stamp is a
    // read stamp, and the state still shows at least one reader
    while (((s = state) & SBITS) == (stamp & SBITS) &&
           (stamp & RBITS) > 0L &&
           (m = s & RBITS) > 0L) {
        if (m < RFULL) {
            // Normal path: CAS state -= RUNIT
            if (CAS(state, s, s - RUNIT)) {
                if (m == RUNIT)          // was last reader
                    releaseWaiter(s);    // wake waiting writer
                return;
            }
        } else if (tryDecReaderOverflow(s) != 0L) {
            // Overflow path: the overflow counter (or state) was decremented
            return;
        }
    }
    throw new IllegalMonitorStateException();
}

Uses a CAS loop (multiple readers may be unlocking concurrently). When the last reader releases (m == RUNIT), it wakes the next waiting writer.
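
The reader count in the state bits can be watched from outside through getReadLockCount(). The sketch below (class name ReaderCountDemo is an assumption) takes three read locks on one thread, which is only safe here because no writer can be waiting, and shows that a writer cannot get in until the count drops to zero.

```java
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class: observes the reader count during lock/unlock.
final class ReaderCountDemo {
    // Returns { readersHeld, writeStampWhileReadersHeld, readersAfterRelease }.
    static long[] run() {
        StampedLock sl = new StampedLock();
        long r1 = sl.readLock();
        long r2 = sl.readLock();   // no writer is queued, so this cannot deadlock
        long r3 = sl.readLock();
        long held = sl.getReadLockCount();

        long w = sl.tryWriteLock();   // non-blocking attempt: 0 while readers hold the lock

        sl.unlockRead(r1);
        sl.unlockRead(r2);
        sl.unlockRead(r3);
        long after = sl.getReadLockCount();
        return new long[] { held, w, after };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // prints [3, 0, 0]
    }
}
```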

Lock Conversion

StampedLock supports atomic lock conversion --- upgrade or downgrade without releasing:

long stamp = sl.readLock();
try {
    if (needsUpdate()) {
        long ws = sl.tryConvertToWriteLock(stamp);
        if (ws != 0) {
            stamp = ws;       // upgrade succeeded
            updateData();
        } else {
            sl.unlockRead(stamp);
            stamp = sl.writeLock();  // fallback: release + reacquire
            if (needsUpdate())       // re-check: another writer may have run while unlocked
                updateData();
        }
    }
} finally {
    sl.unlock(stamp);  // generic unlock --- works for any mode
}

tryConvertToWriteLock Internals

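public long tryConvertToWriteLock(long stamp) {
    long a = stamp & ABITS, m, s, next;
    // Retry while the version (and WBIT) in state still matches the stamp
    while (((s = state) & SBITS) == (stamp & SBITS)) {
        if ((m = s & ABITS) == 0L) {           // lock is free
            if (a != 0L) break;                // stamp claims a lock that isn't held
            // Optimistic stamp → CAS to write lock
            if (CAS(state, s, next = s + WBIT))
                return next;
        } else if (m == WBIT) {                // write-locked
            if (a != m) break;                 // ...but not with this stamp
            return stamp;                      // already a write stamp
        } else if (m == RUNIT && a != 0L) {    // exactly one reader, stamp is a read stamp
            // Sole reader → CAS to write lock
            if (CAS(state, s, next = s - RUNIT + WBIT))
                return next;
        } else {
            break;                             // other readers present
        }
    }
    return 0L;  // conversion failed
}

A read-to-write conversion succeeds only if this thread is the sole reader. If other readers exist, it fails (returns 0) and you must release and reacquire. An optimistic stamp converts only if it is still valid and nothing holds the lock.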

tryConvertToReadLock --- Downgrade

public long tryConvertToReadLock(long stamp) {
    long a = stamp & ABITS;
    long s = state;
    if ((s & SBITS) == (stamp & SBITS)) {
        if (a == WBIT) {
            // Write → Read: clear WBIT, add RUNIT, advance stamp
            long next = (s += WBIT) | RUNIT;  // increment version + set 1 reader
            state = next;                       // no CAS --- we hold write lock
            return next;
        }
        if (a != 0L) return stamp;             // already read-locked --- return same stamp
        // Optimistic → Read: CAS to add RUNIT
        if ((s & ABITS) < RFULL && CAS(state, s, s + RUNIT))
            return s + RUNIT;
    }
    return 0L;  // conversion failed
}

Write-to-read conversion always succeeds (you hold exclusive access). Optimistic-to-read conversion requires a CAS (you don't hold any lock). Read-to-read is a no-op.

Conversion Matrix

From	To	Method	Succeeds when
Optimistic	Read	`tryConvertToReadLock(stamp)`	Stamp valid + CAS succeeds
Optimistic	Write	`tryConvertToWriteLock(stamp)`	Stamp valid + no locks held + CAS
Read	Write	`tryConvertToWriteLock(stamp)`	Sole reader (no other readers)
Write	Read	`tryConvertToReadLock(stamp)`	Always (already exclusive)
Read	Read	`tryConvertToReadLock(stamp)`	Always (no-op, returns same stamp)
Write	Write	`tryConvertToWriteLock(stamp)`	Always (no-op, returns same stamp)
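
Several rows of this matrix can be exercised on a single thread with the real class. The following sketch (class name ConversionDemo is an assumption) walks through optimistic to write, write to read, a failed upgrade with two readers, and a successful upgrade as sole reader.

```java
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class: exercises rows of the conversion matrix.
final class ConversionDemo {
    // Returns { optimisticToWrite, writeToRead, upgradeWithTwoReaders, upgradeAsSoleReader }.
    static boolean[] run() {
        StampedLock sl = new StampedLock();

        // Optimistic -> write: succeeds because nothing holds the lock.
        long w = sl.tryConvertToWriteLock(sl.tryOptimisticRead());
        boolean optimisticToWrite = w != 0L;

        // Write -> read: downgrade without releasing.
        long r1 = sl.tryConvertToReadLock(w);
        boolean writeToRead = r1 != 0L;

        // Read -> write while a second read lock is held: fails with 0.
        long r2 = sl.readLock();
        boolean upgradeWithTwoReaders = sl.tryConvertToWriteLock(r1) != 0L;
        sl.unlockRead(r2);

        // Read -> write as the sole reader: succeeds.
        long w2 = sl.tryConvertToWriteLock(r1);
        boolean upgradeAsSoleReader = w2 != 0L;
        sl.unlockWrite(w2);

        return new boolean[] {
            optimisticToWrite, writeToRead, upgradeWithTwoReaders, upgradeAsSoleReader
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // prints [true, true, false, true]
    }
}
```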

Memory Ordering

StampedLock provides happens-before guarantees through volatile reads/writes of the state field:

unlockWrite() performs a volatile write to state → happens-before any subsequent readLock(), writeLock(), or successful validate() by another thread
unlockRead() performs a CAS on state (volatile read+write) → happens-before a subsequent writeLock() by another thread
validate() performs a volatile read of state preceded by VarHandle.acquireFence() → ensures all data reads before validate() are ordered before the state check

Optimistic reads do NOT provide mutual exclusion, but they do provide visibility: if validate() returns true, all reads between tryOptimisticRead() and validate() saw a consistent snapshot --- no writer modified the data during that window.

StampedLock vs ReadWriteLock

Feature	ReadWriteLock	StampedLock
Based on	AQS (32-bit int)	Custom (64-bit long)
Read cost (no contention)	~20-40 ns (CAS)	~2-4 ns (optimistic)
Optimistic reads	No	Yes
Reentrant	Yes	No
Condition support	Yes (write lock)	No
Lock upgrade (read → write)	No (deadlocks)	`tryConvertToWriteLock` (sole reader only)
Lock downgrade (write → read)	Yes	`tryConvertToReadLock` (always succeeds)
Max readers	65535 (16 bits)	126 in state + overflow counter
Fair mode	Yes	No
Wait queue	AQS CLH queue (Node)	Custom linked list (WNode)
Stamp validation	N/A	`validate(stamp)`
Cache line bouncing on read	Yes (CAS writes state)	No (optimistic = volatile read only)

Pitfalls

1. Not Reentrant --- Deadlock Risk

StampedLock has no owner tracking (exclusiveOwnerThread) and no per-thread hold count. It doesn't know who holds the lock --- just how many readers and whether a writer is active.

Write lock reentry --- always deadlocks:

long stamp = sl.writeLock();
long stamp2 = sl.writeLock();  // DEADLOCK --- blocks forever
// writeLock() sees WBIT set → CAS fails → slow path → parks → no one will release

Read lock reentry --- deadlocks if a writer is waiting:

long stamp = sl.readLock();
// A writer arrives and parks (waiting for readers to release)
long stamp2 = sl.readLock();  // DEADLOCK
// slow path parks behind writer → writer waits for us → we wait for writer

The second readLock() may succeed if no writer is waiting (the fast path CAS just increments the reader count). But if a writer is queued, the slow path parks the reader behind the writer --- deadlock.

Why no reentrancy? Adding owner tracking would require a ThreadLocal or exclusiveOwnerThread field --- overhead that conflicts with StampedLock's "maximum performance" design goal. If you need reentrancy, use ReentrantReadWriteLock.
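
Non-reentrancy can be demonstrated without actually deadlocking by using the non-blocking try methods. A minimal sketch (class name NonReentrantDemo is an assumption): the thread that already holds the write lock is refused both a second write lock and a read lock.

```java
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class: shows that the owner gets no special treatment.
final class NonReentrantDemo {
    // Returns { secondWriteRefused, readRefused, writeGrantedAfterRelease }.
    static boolean[] run() {
        StampedLock sl = new StampedLock();
        long w = sl.writeLock();

        long again = sl.tryWriteLock();   // 0: the lock does not know we already own it
        long read  = sl.tryReadLock();    // 0: our own write lock blocks our reads too

        sl.unlockWrite(w);

        long afterRelease = sl.tryWriteLock();   // succeeds once the lock is free
        boolean granted = afterRelease != 0L;
        if (granted)
            sl.unlockWrite(afterRelease);

        return new boolean[] { again == 0L, read == 0L, granted };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // prints [true, true, true]
    }
}
```

Replacing tryWriteLock() with writeLock() in the second acquisition would block this thread forever, which is the deadlock described above.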

2. Always Validate + Fallback for Optimistic Reads

Values read before validate() may be garbage (partially written by a concurrent writer):

long stamp = sl.tryOptimisticRead();
int x = this.x;  // might be stale or torn
int y = this.y;  // might be inconsistent with x
if (!sl.validate(stamp)) {
    // x and y are UNRELIABLE --- must re-read under lock
    stamp = sl.readLock();
    try { x = this.x; y = this.y; }
    finally { sl.unlockRead(stamp); }
}
// Only use x, y AFTER successful validate or under lock

3. Don't Use Values Until validate() Succeeds

// WRONG --- using x before validation
long stamp = sl.tryOptimisticRead();
int x = this.x;
doSomethingWith(x);  // x might be garbage!
sl.validate(stamp);

// RIGHT --- validate first, then use
long stamp = sl.tryOptimisticRead();
int x = this.x;
int y = this.y;
if (sl.validate(stamp)) {
    doSomethingWith(x, y);  // safe
}

4. Not AutoCloseable --- Must Use try-finally

Stamps are long values, not objects. Can't use try-with-resources:

// Can't do this:
try (var lock = sl.readLock()) { }  // WRONG --- readLock() returns long

// Must do this:
long stamp = sl.readLock();
try { /* ... */ }
finally { sl.unlockRead(stamp); }
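
If try-with-resources ergonomics matter more than avoiding an allocation, one option is a small wrapper object that remembers the stamp. This is a sketch of that idea, not something StampedLock provides; the class ReadGuard and its acquire method are hypothetical.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical wrapper: pairs a read stamp with its lock so it can be closed.
final class ReadGuard implements AutoCloseable {
    private final StampedLock lock;
    private final long stamp;

    private ReadGuard(StampedLock lock, long stamp) {
        this.lock = lock;
        this.stamp = stamp;
    }

    // Acquires the read lock and returns a guard that releases it on close().
    static ReadGuard acquire(StampedLock lock) {
        return new ReadGuard(lock, lock.readLock());
    }

    @Override
    public void close() {
        lock.unlockRead(stamp);
    }

    public static void main(String[] args) {
        StampedLock sl = new StampedLock();
        try (ReadGuard guard = ReadGuard.acquire(sl)) {
            System.out.println(sl.getReadLockCount());   // prints 1
        }
        System.out.println(sl.getReadLockCount());       // prints 0
    }
}
```

Each acquisition allocates a guard object, which works against the low-overhead goal of StampedLock, so plain try-finally remains the better fit on hot paths.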

5. Wrong Stamp = IllegalMonitorStateException

Each unlock method validates the stamp. Using the wrong stamp (e.g., a read stamp with unlockWrite) throws:

long stamp = sl.readLock();
sl.unlockWrite(stamp);  // IllegalMonitorStateException --- stamp doesn't have WBIT

// Use sl.unlock(stamp) if you're unsure which mode --- it checks the stamp bits
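
A runnable version of this pitfall, as a minimal sketch (class name WrongStampDemo is an assumption). It also shows that the read lock is still held after the failed unlockWrite, so it must still be released.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class: unlocking with the wrong mode throws.
final class WrongStampDemo {
    // Returns true if unlockWrite rejected a read stamp.
    static boolean unlockWriteRejectsReadStamp() {
        StampedLock sl = new StampedLock();
        long stamp = sl.readLock();
        try {
            sl.unlockWrite(stamp);      // wrong mode: stamp has no WBIT
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } finally {
            sl.unlock(stamp);           // mode-agnostic release; the read lock was still held
        }
    }

    public static void main(String[] args) {
        System.out.println(unlockWriteRejectsReadStamp());   // prints true
    }
}
```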

6. Optimistic Reads Can See Torn Values

Non-volatile long and double fields may be read and written as two separate 32-bit halves (the Java Language Specification permits this, and it shows up mainly on 32-bit JVMs). An optimistic read could therefore see a partially written 64-bit value. validate() catches this (the stamp will have changed), but the garbage value could cause problems if used before validation (e.g., as an array index → ArrayIndexOutOfBoundsException). Always validate before using any optimistically read value.

When to Use StampedLock

Use StampedLock when:
├── Read-heavy workload (>90% reads)
├── Reads are short (few field accesses)
├── Don't need reentrancy
├── Don't need conditions
├── Don't need fairness (writer starvation is acceptable)
└── Performance is critical

Use ReadWriteLock when:
├── Need reentrancy
├── Need condition variables
├── Need fairness (writer starvation must be prevented)
├── Reads are long (complex computation under lock)
└── Simpler API preferred

Fairness Comparison

                    StampedLock    RRWL (non-fair)     RRWL (fair)
Reader fairness     None           Partial             Full FIFO
Writer starvation   Possible       Mitigated           Impossible
Read throughput     Highest        High                Lower

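StampedLock has no fairness setting, and its Javadoc says the scheduling policy does not consistently prefer readers over writers or vice versa. tryReadLock() CAS-es without checking the wait queue, so a reader using it can barge in even if a writer is queued; in the WNode-based implementation described here, the blocking readLock() takes its fast path only when the queue is empty and otherwise goes through acquireRead. ReentrantReadWriteLock non-fair mode checks apparentlyFirstQueuedIsExclusive() so that new readers yield to a writer at the head of the queue. Fair mode enforces FIFO ordering.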