StampedLock (Java 8+)
Designed for extremely read-heavy workloads. An optimistic read acquires no lock: it only takes a "stamp" (a version number) and validates it afterwards. Not based on AQS --- uses its own 64-bit long state with custom spin/park logic.
Key characteristics:
- Not AQS-based --- standalone class with its own volatile long state, WNode wait queue, and spin/park logic. AQS's 32-bit int can't hold the 64-bit stamp+readers+WBIT layout, and AQS has no concept of optimistic reads.
- Not reentrant --- no owner tracking, no per-thread hold count. Reentrant writeLock() deadlocks. Reentrant readLock() deadlocks if a writer is waiting.
- No fair mode --- always unfair. Readers can barge via CAS without checking the wait queue. Writers can be starved by a continuous stream of readers.
- Not a Lock --- doesn't implement the java.util.concurrent.locks.Lock interface. Uses a stamp-based API (long writeLock() / unlockWrite(stamp)) instead of lock() / unlock(). Lock views are available through asReadLock() and asWriteLock() when an API requires one.
AQS-based:                      Not AQS-based:
├── ReentrantLock               ├── StampedLock (custom 64-bit state)
├── ReentrantReadWriteLock
├── Semaphore
├── CountDownLatch
└── CyclicBarrier (via ReentrantLock)
Document Structure:
- [Three Modes](#Three Modes) --- Optimistic read, read lock, write lock
- [State Layout](#State Layout) --- 64-bit state, constants, stamp semantics
- [Optimistic Read Internals](#Optimistic Read Internals) --- tryOptimisticRead, validate, VarHandle.acquireFence()
- [Write Lock Internals](#Write Lock Internals) --- writeLock, unlockWrite, stamp versioning
- [Read Lock Internals](#Read Lock Internals) --- readLock, unlockRead, overflow handling
- [Lock Conversion](#Lock Conversion) --- tryConvertToWriteLock, tryConvertToReadLock
- [Memory Ordering](#Memory Ordering) --- How stamps provide happens-before guarantees
- [StampedLock vs ReadWriteLock](#StampedLock vs ReadWriteLock) --- Feature and performance comparison
- Pitfalls --- Non-reentrant, validation rules, not AutoCloseable
- [When to Use StampedLock](#When to Use StampedLock) --- Decision guide
Three Modes
java
StampedLock sl = new StampedLock();
// 1. Optimistic Read --- NO lock, just a stamp
long stamp = sl.tryOptimisticRead();
int x = this.x;
int y = this.y;
if (sl.validate(stamp)) {
// NO writer → data is valid (zero lock cost)
} else {
stamp = sl.readLock();
try { x = this.x; y = this.y; }
finally { sl.unlockRead(stamp); }
}
// 2. Read Lock --- shared (like ReadWriteLock)
long stamp = sl.readLock();
try { /* read data */ }
finally { sl.unlockRead(stamp); }
// 3. Write Lock --- exclusive
long stamp = sl.writeLock();
try { /* modify data */ }
finally { sl.unlockWrite(stamp); }
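Put together, the modes might look like this in a small class. This is a sketch modelled on the Point example in the StampedLock Javadoc; the class name StampedPoint and its methods are illustrative, not part of the JDK.

```java
import java.util.concurrent.locks.StampedLock;

// Illustrative class (an assumption for this sketch): a 2D point guarded by a StampedLock.
class StampedPoint {
    private final StampedLock sl = new StampedLock();
    private double x, y;

    // Write lock: exclusive access while both fields change.
    void move(double dx, double dy) {
        long stamp = sl.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    // Optimistic read with a read-lock fallback.
    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead();
        double curX = x, curY = y;      // copy the fields into locals first
        if (!sl.validate(stamp)) {      // a writer was active: the locals may be inconsistent
            stamp = sl.readLock();
            try {
                curX = x;
                curY = y;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return Math.hypot(curX, curY);  // only validated (or locked) values are used
    }

    public static void main(String[] args) {
        StampedPoint p = new StampedPoint();
        p.move(3, 4);
        System.out.println(p.distanceFromOrigin()); // prints 5.0
    }
}
```

Note that the fields are only copied between tryOptimisticRead() and validate(); all computation happens after validation.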
State Layout (64-bit long)
Unlike AQS, which uses a 32-bit int state, StampedLock keeps its own 64-bit long state:
state (64 bits):
┌──────────────────────────────┬──────┬─────────┐
│ bits 8-63: version stamp │bit 7 │bits 0-6 │
│ (incremented on write unlock)│ WBIT │ readers │
└──────────────────────────────┴──────┴─────────┘
0 = no writer
1 = write-locked
Constants
java
private static final int LG_READERS = 7;
private static final long RUNIT = 1L; // one reader unit (value = 1)
private static final long WBIT = 1L << LG_READERS; // bit 7: write lock flag (0x80 = 128)
private static final long RBITS = WBIT - 1L; // bits 0-6: reader count mask (0x7F = 127)
private static final long RFULL = RBITS - 1L; // max readers in state bits = 126 (0x7E)
private static final long ABITS = RBITS | WBIT; // all lock bits (0xFF = bits 0-7)
private static final long SBITS = ~RBITS; // bits 7-63: version bits plus WBIT (overlaps ABITS at bit 7),
// so validate() also fails while a writer currently holds the lock
private static final long ORIGIN = WBIT << 1; // initial state = 256 (0x100)
// bit 8 set --- avoids 0 as valid stamp
Stamp Semantics
A stamp is a snapshot of the state taken when a mode is entered: an optimistic stamp carries only the version bits, while a lock stamp also carries the mode bits at the time of acquisition. It serves two purposes:
- Validation token --- optimistic reads compare their stamp against the current state to detect writer interference
- Unlock token --- unlockRead(stamp) and unlockWrite(stamp) verify the stamp matches before releasing, catching misuse (wrong stamp = IllegalMonitorStateException)
Stamp value 0 always means failure --- tryOptimisticRead() returns 0 if a writer holds the lock, tryConvertToWriteLock() returns 0 if conversion fails. This is why the initial state is ORIGIN (not 0) --- so the first valid stamp is non-zero.
State Transitions
Initial: state = ORIGIN (0x100) stamp=1, readers=0, write=0
Write locked: state = ORIGIN | WBIT (0x180) stamp=1, readers=0, write=1
1 reader: state = ORIGIN + RUNIT (0x101) stamp=1, readers=1, write=0
3 readers: state = ORIGIN + 3*RUNIT (0x103) stamp=1, readers=3, write=0
After 1 write: state = 0x200 stamp=2, readers=0, write=0
After 2 writes: state = 0x300 stamp=3, readers=0, write=0
Lock conversions (single CAS):
Read→Write (sole reader): 0x101 → 0x180 CAS(s, s - RUNIT + WBIT)
Write→Read: 0x180 → 0x201 state = (s + WBIT) | RUNIT
(stamp advances + 1 reader added)
Read→Write (2 readers): 0x102 → FAILS s & ABITS ≠ RUNIT → return 0
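The transitions above are plain arithmetic on a long, so they can be checked directly. The sketch below uses local copies of the constants (the real ones are private to StampedLock) and hypothetical helper names; it models the state values only, not the actual locking.

```java
// Local copies of the constants listed earlier; names of the helper methods are assumptions.
class StampBits {
    static final long RUNIT = 1L;
    static final long WBIT = 1L << 7;
    static final long RBITS = WBIT - 1L;
    static final long ABITS = RBITS | WBIT;
    static final long ORIGIN = WBIT << 1;

    // Each method models one transition as pure arithmetic on a state value.
    static long writeLock(long s)   { return s + WBIT; }   // sets bit 7
    static long unlockWrite(long s) { return s + WBIT; }   // carry clears bit 7, bumps the version
    static long readToWrite(long s) { return (s & ABITS) == RUNIT ? s - RUNIT + WBIT : 0L; }
    static long writeToRead(long s) { return (s + WBIT) | RUNIT; }

    public static void main(String[] args) {
        long locked = writeLock(ORIGIN);
        System.out.printf("0x%x%n", locked);                          // prints 0x180
        System.out.printf("0x%x%n", unlockWrite(locked));             // prints 0x200
        System.out.printf("0x%x%n", readToWrite(ORIGIN + RUNIT));     // prints 0x180
        System.out.printf("0x%x%n", readToWrite(ORIGIN + 2 * RUNIT)); // prints 0x0 (two readers: fails)
        System.out.printf("0x%x%n", writeToRead(locked));             // prints 0x201
    }
}
```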
Reader Overflow
Bits 0-6 hold up to 126 readers (RFULL). 7 bits can represent 0-127, but 127 (RBITS = 0x7F) is reserved as a transient marker used while the overflow counter is being updated. Once the reader bits reach 126, further readers are counted in a separate int readerOverflow field and the state bits stay at RFULL:
java
// In readLock acquisition, when reader bits are full:
if ((s & ABITS) < RFULL) {
// Normal path: CAS state += RUNIT
} else {
// Overflow path: state reader bits stay at RFULL
// Increment readerOverflow counter instead
readerOverflow++;
}
On unlock, if readerOverflow > 0, the overflow counter is decremented instead of the state bits. This is rare in practice --- more than 126 simultaneous read holds is unusual.
The overflow CAS trick: readerOverflow is a plain int (not volatile). Access is serialized by a CAS-based spinlock on the state bits:
java
private long tryIncReaderOverflow(long s) {
if ((s & ABITS) == RFULL) { // reader bits == 0x7E (126)
if (CAS(state, s, s | RBITS)) { // CAS: 0x7E → 0x7F (set to 127)
++readerOverflow; // safe: only CAS winner reaches here
state = s; // restore back to 0x7E
return s;
}
}
return 0L;
}
The value 0x7F (127 = RBITS) acts as a temporary lock --- while reader bits are 0x7F, any other thread's CAS fails (they expect 0x7E but see 0x7F). Once the winner restores to 0x7E, the next thread can succeed.
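The overflow path can be observed from outside through getReadLockCount(). The following is a small sketch with hypothetical class and method names; it takes 200 read holds from a single thread, which is only safe here because no writer is ever waiting on this lock.

```java
import java.util.concurrent.locks.StampedLock;

class ReaderOverflowDemo {
    // Fills `stamps` with read stamps and returns the reader count the lock reports.
    static int acquireAndCount(StampedLock sl, long[] stamps) {
        for (int i = 0; i < stamps.length; i++) {
            stamps[i] = sl.readLock();   // no writer is queued, so this does not block
        }
        return sl.getReadLockCount();    // state bits plus the overflow counter
    }

    // Releases every read hold recorded in `stamps`.
    static void releaseAll(StampedLock sl, long[] stamps) {
        for (long stamp : stamps) {
            sl.unlockRead(stamp);
        }
    }

    public static void main(String[] args) {
        StampedLock sl = new StampedLock();
        long[] stamps = new long[200];                    // more than RFULL (126)
        System.out.println(acquireAndCount(sl, stamps)); // prints 200
        releaseAll(sl, stamps);
        System.out.println(sl.isReadLocked());           // prints false
    }
}
```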
Stamp Only Changes on Write Unlock
Read operations (readLock, unlockRead) only modify bits 0-6 (reader count); the version bits (8-63) are unchanged. The version advances only when a write hold ends: unlockWrite() (or a write-to-read conversion) adds WBIT, and the carry out of bit 7 increments bit 8.
This is why optimistic reads work alongside concurrent readers --- validate(stamp) checks state & SBITS, which readers don't touch:
Thread-0 (optimistic):               Thread-1 (reader):
stamp = tryOptimisticRead()
→ stamp = state & SBITS = 0x100
x = this.x
                                     readLock() → state += RUNIT (bits 0-6 change)
                                     unlockRead() → state -= RUNIT (bits 0-6 change)
validate(0x100)
→ state & SBITS still = 0x100
→ VALID ✓ (reader didn't change stamp)
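The same behavior can be reproduced on a single thread. In this sketch (class and method names are illustrative), one optimistic stamp survives a complete reader cycle but not a writer cycle.

```java
import java.util.concurrent.locks.StampedLock;

class StampStabilityDemo {
    // Returns {validAfterReader, validAfterWriter} for one optimistic stamp.
    static boolean[] run() {
        StampedLock sl = new StampedLock();
        long optimistic = sl.tryOptimisticRead();

        long r = sl.readLock();          // a reader comes and goes
        sl.unlockRead(r);
        boolean afterReader = sl.validate(optimistic);

        long w = sl.writeLock();         // a writer comes and goes
        sl.unlockWrite(w);
        boolean afterWriter = sl.validate(optimistic);

        return new boolean[] { afterReader, afterWriter };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println(result[0]);   // prints true
        System.out.println(result[1]);   // prints false
    }
}
```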
Optimistic Read Internals
The key insight: optimistic read does NO atomic write operations. It's just two volatile reads:
java
public long tryOptimisticRead() {
long s = state; // volatile read
return (s & WBIT) == 0L ? (s & SBITS) : 0L; // return stamp if not write-locked, else 0
}
public boolean validate(long stamp) {
VarHandle.acquireFence(); // memory barrier (see below)
return (stamp & SBITS) == (state & SBITS); // volatile read --- version unchanged? (reader bits in either value are ignored)
}
VarHandle.acquireFence() --- Why It's Needed
Without the fence, the CPU or compiler could reorder the data reads after the validate() check:
// Without fence --- BROKEN (possible reordering)
stamp = tryOptimisticRead(); // volatile read of state
x = this.x; // ← CPU might move this AFTER validate
y = this.y; // ← CPU might move this AFTER validate
validate(stamp); // volatile read of state
// If x,y reads moved after validate, they could see a writer's partial update
// even though validate returned true
VarHandle.acquireFence() prevents this: loads issued before the fence cannot be reordered with loads or stores issued after it. This guarantees that this.x and this.y were read before the state check in validate().
Note: This is an acquire fence (LoadLoad + LoadStore), not a full fence, which is sufficient here. On x86 (TSO memory model) it emits no instruction because the hardware does not reorder loads with other loads, but it still stops the JIT compiler from moving them. On ARM/AArch64 (weaker memory model) it compiles to an actual barrier instruction. Java 8, which predates VarHandle, used Unsafe.loadFence() for the same purpose.
Timeline --- Success Case (No Writer)
Thread-0 (optimistic read): Thread-1 (idle):
stamp = tryOptimisticRead()
→ state = 0x102, stamp = 0x100
x = this.x; → 5
y = this.y; → 10
validate(0x100)
→ acquireFence()
→ state & SBITS = 0x100
→ 0x100 == 0x100 → VALID ✓
use(x, y); // safe --- no writer intervened
Important: validate() == true guarantees the data was a consistent snapshot at read time --- no writer acquired the lock during the window. It does NOT guarantee the data is still current after validate() returns. A writer could start writing the instant after. This is fine because you already copied the values into local variables before validate(). If you need the data to remain current (not just consistent), use readLock() instead.
Timeline --- Failure Case (Writer Intervened)
Thread-0 (optimistic read):          Thread-1 (writer):
stamp = tryOptimisticRead()
→ state = 0x100, stamp = 0x100
x = this.x; → 5
y = this.y; → 10                     writeLock() → state += WBIT → 0x180
                                     this.x = 99; this.y = 200;
                                     unlockWrite() → state = 0x200 (version advanced)
validate(0x100)
→ acquireFence()
→ state & SBITS = 0x200
→ 0x200 != 0x100 → INVALID ✗
// Fallback to read lock
stamp = sl.readLock();
try { x = this.x; y = this.y; } // re-read under lock --- guaranteed fresh
finally { sl.unlockRead(stamp); }
Cost Comparison
| Operation | Mechanism | Cost |
|---|---|---|
| ReadWriteLock.readLock() | CAS(state, c, c + SHARED_UNIT) | ~20-40 ns |
| StampedLock.readLock() | CAS(state, s, s + RUNIT) | ~15-30 ns |
| StampedLock.tryOptimisticRead() | volatile read (no CAS) | ~2-4 ns |
| StampedLock.validate() | acquire fence + volatile read | ~1-2 ns |
Optimistic read is ~10x faster than any CAS-based lock for the common case (no concurrent writer). And because it doesn't write to the state field, it doesn't cause cache line bouncing across cores (see read-write-lock.md --- Cache Line Bouncing).
Write Lock Internals
writeLock()
java
public long writeLock() {
long s, next;
// Fast path: no locks held at all → CAS to set WBIT
if (((s = state) & ABITS) == 0L &&
CAS(state, s, next = s + WBIT))
return next; // stamp = state with WBIT set
// Slow path: spin, then park
return acquireWrite(false, 0L);
}
The fast path succeeds only when state & ABITS == 0 --- no readers and no writer. A single CAS sets bit 7 (WBIT). The returned stamp includes the WBIT, which unlockWrite will check.
The slow path (acquireWrite) spins for a while (adaptive spin count), then parks the thread. Unlike AQS, StampedLock uses its own wait queue (a linked list of WNode objects, not AQS Node).
unlockWrite(stamp)
java
public void unlockWrite(long stamp) {
if (state != stamp || (stamp & WBIT) == 0L)
throw new IllegalMonitorStateException();
// Increment version: clear WBIT, advance stamp bits
// state = (stamp += WBIT) == 0L ? ORIGIN : stamp
long next = (stamp += WBIT) == 0L ? ORIGIN : stamp;
state = next; // volatile write --- no CAS needed (exclusive)
// Wake next waiter
releaseWaiter(next);
}
Key detail: stamp += WBIT adds 0x80 to a state where bit 7 (WBIT) is set. The carry propagates into bit 8 (stamp region), simultaneously clearing the WBIT and incrementing the version. If the addition overflows to 0, it resets to ORIGIN to avoid 0 as a valid stamp.
Before unlockWrite: state = 0x180 (stamp=0x100, WBIT=0x80, readers=0)
stamp += WBIT: 0x180 + 0x80 = 0x200
→ bit 7 (WBIT) cleared by carry into bit 8
→ stamp bits advanced from 0x100 to 0x200
After: state = 0x200 (new stamp, write=0, readers=0)
No CAS needed --- the current thread holds the exclusive write lock.
Read Lock Internals
readLock()
java
public long readLock() {
long s, next;
// Fast path: no writer, reader count not full → CAS to add RUNIT
if (((s = state) & ABITS) < RFULL &&
CAS(state, s, next = s + RUNIT))
return next;
// Slow path: spin, then park
return acquireRead(false, 0L);
}
The fast path CAS-es state += RUNIT (adds 1, incrementing the reader count in bits 0-6). Fails if a writer holds the lock (WBIT set) or reader count is at RFULL (126).
The check (s & ABITS) < RFULL rejects both cases in one comparison: since WBIT = 0x80 = 128 > RFULL = 126, any write-locked state automatically exceeds the threshold. No separate WBIT check needed.
The returned stamp includes the reader bits --- unlockRead will verify it.
unlockRead(stamp)
java
public void unlockRead(long stamp) {
    long s, m;
    // Loop while the version still matches, the stamp is a read stamp, and readers are held
    while (((s = state) & SBITS) == (stamp & SBITS)
            && (stamp & RBITS) > 0L
            && (m = s & RBITS) > 0L) {
        if (m < RFULL) {
            // Normal path: CAS state -= RUNIT
            if (CAS(state, s, s - RUNIT)) {
                if (m == RUNIT) // was last reader
                    releaseWaiter(s); // wake waiting writer
                return;
            }
        } else if (tryDecReaderOverflow(s) != 0L) {
            // Overflow path: readerOverflow was decremented (a 0 result means "retry")
            return;
        }
    }
    throw new IllegalMonitorStateException();
}
Uses a CAS loop (multiple readers may be unlocking concurrently). When the last reader releases (m == RUNIT), it wakes the next waiting writer.
Lock Conversion
StampedLock supports atomic lock conversion --- upgrade or downgrade without releasing:
java
long stamp = sl.readLock();
try {
if (needsUpdate()) {
long ws = sl.tryConvertToWriteLock(stamp);
if (ws != 0) {
stamp = ws; // upgrade succeeded
updateData();
} else {
sl.unlockRead(stamp);
stamp = sl.writeLock(); // fallback: release + reacquire (not atomic)
if (needsUpdate()) // another writer may have run in the gap, so re-check
updateData();
}
}
} finally {
sl.unlock(stamp); // generic unlock --- works for any mode
}
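A loop-based variant re-checks the condition after every change of lock mode, which covers the window opened by the release-and-reacquire fallback. This sketch follows the moveIfAtOrigin example in the StampedLock Javadoc; the class name UpgradablePoint is an assumption.

```java
import java.util.concurrent.locks.StampedLock;

class UpgradablePoint {
    private final StampedLock sl = new StampedLock();
    private double x, y;

    // Moves the point only if it is currently at the origin; returns true if it moved.
    boolean moveIfAtOrigin(double newX, double newY) {
        long stamp = sl.readLock();
        try {
            while (x == 0.0 && y == 0.0) {          // re-evaluated after each mode change
                long ws = sl.tryConvertToWriteLock(stamp);
                if (ws != 0L) {                     // upgraded (or already holding the write lock)
                    stamp = ws;
                    x = newX;
                    y = newY;
                    return true;
                }
                sl.unlockRead(stamp);               // upgrade failed: release the read lock,
                stamp = sl.writeLock();             // block for the write lock, then loop to re-check
            }
            return false;
        } finally {
            sl.unlock(stamp);                       // releases whichever mode is held
        }
    }

    public static void main(String[] args) {
        UpgradablePoint p = new UpgradablePoint();
        System.out.println(p.moveIfAtOrigin(1, 2)); // prints true
        System.out.println(p.moveIfAtOrigin(5, 6)); // prints false
    }
}
```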
tryConvertToWriteLock Internals
java
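public long tryConvertToWriteLock(long stamp) {
    long a = stamp & ABITS; // mode bits of the caller's stamp
    long s = state;
    long m = s & ABITS;     // mode bits of the current state
    // Simplified single attempt; the JDK version retries in a loop when a CAS fails
    if ((s & SBITS) == (stamp & SBITS)) {       // version must still match
        if (m == WBIT)                          // write-locked: succeed only for a write stamp
            return (a == WBIT) ? stamp : 0L;
        if (m == 0L && a == 0L) {               // optimistic stamp, lock free → CAS to write lock
            if (CAS(state, s, s + WBIT))
                return s + WBIT;
        } else if (m == RUNIT && a != 0L) {     // read stamp, exactly one reader → CAS to write lock
            if (CAS(state, s, s - RUNIT + WBIT))
                return s - RUNIT + WBIT;
        }
    }
    return 0L; // conversion failed
}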
Conversion succeeds only if this thread is the sole reader. If other readers exist, it fails (returns 0) --- you must release and reacquire.
tryConvertToReadLock --- Downgrade
java
public long tryConvertToReadLock(long stamp) {
long a = stamp & ABITS;
long s = state;
if ((s & SBITS) == (stamp & SBITS)) {
if (a == WBIT) {
// Write → Read: clear WBIT, add RUNIT, advance stamp
long next = (s += WBIT) | RUNIT; // increment version + set 1 reader
state = next; // no CAS --- we hold write lock
return next;
}
if (a != 0L) return stamp; // already read-locked --- return same stamp
// Optimistic → Read: CAS to add RUNIT
if ((s & ABITS) < RFULL && CAS(state, s, s + RUNIT))
return s + RUNIT;
}
return 0L; // conversion failed
}
Write-to-read conversion always succeeds (you hold exclusive access). Optimistic-to-read conversion requires a CAS (you don't hold any lock). Read-to-read is a no-op.
Conversion Matrix
| From | To | Method | Succeeds when |
|---|---|---|---|
| Optimistic | Read | tryConvertToReadLock(stamp) | Stamp valid + CAS succeeds |
| Optimistic | Write | tryConvertToWriteLock(stamp) | Stamp valid + no locks held + CAS |
| Read | Write | tryConvertToWriteLock(stamp) | Sole reader (no other readers) |
| Write | Read | tryConvertToReadLock(stamp) | Always (already exclusive) |
| Read | Read | tryConvertToReadLock(stamp) | Always (no-op, returns same stamp) |
| Write | Write | tryConvertToWriteLock(stamp) | Always (no-op, returns same stamp) |
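Several rows of the matrix can be exercised on one thread. The sketch below uses hypothetical method names; the second case takes two read holds from the same thread, which is safe only because no writer is waiting.

```java
import java.util.concurrent.locks.StampedLock;

class ConversionDemo {
    // Read → Write with a sole reader: expected to succeed.
    static boolean soleReaderUpgrades() {
        StampedLock sl = new StampedLock();
        long r = sl.readLock();
        long w = sl.tryConvertToWriteLock(r);
        boolean ok = w != 0L && sl.isWriteLocked();
        sl.unlock(ok ? w : r);
        return ok;
    }

    // Read → Write with two read holds: expected to return 0 and leave both holds intact.
    static boolean secondReaderBlocksUpgrade() {
        StampedLock sl = new StampedLock();
        long r1 = sl.readLock();
        long r2 = sl.readLock();
        long w = sl.tryConvertToWriteLock(r1);
        sl.unlockRead(r2);
        sl.unlockRead(r1);
        return w == 0L;
    }

    // Write → Read: expected to succeed.
    static boolean writerDowngrades() {
        StampedLock sl = new StampedLock();
        long w = sl.writeLock();
        long r = sl.tryConvertToReadLock(w);
        boolean ok = r != 0L && sl.isReadLocked() && !sl.isWriteLocked();
        sl.unlock(ok ? r : w);
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(soleReaderUpgrades());        // prints true
        System.out.println(secondReaderBlocksUpgrade()); // prints true
        System.out.println(writerDowngrades());          // prints true
    }
}
```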
Memory Ordering
StampedLock provides happens-before guarantees through volatile reads/writes of the state field:
- unlockWrite() performs a volatile write to state → happens-before any subsequent readLock(), writeLock(), or successful validate() by another thread
- unlockRead() performs a CAS on state (volatile read+write) → happens-before a subsequent writeLock() by another thread
- validate() performs a volatile read of state preceded by VarHandle.acquireFence() → ensures all data reads before validate() are ordered before the state check
Optimistic reads do NOT provide mutual exclusion, but they do provide visibility: if validate() returns true, all reads between tryOptimisticRead() and validate() saw a consistent snapshot --- no writer modified the data during that window.
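The consistency guarantee can be stress-tested. In this sketch (all names are assumptions), a writer thread keeps two plain int fields equal while the calling thread reads them optimistically; a validated or locked snapshot should never observe them unequal.

```java
import java.util.concurrent.locks.StampedLock;

class SnapshotConsistencyDemo {
    private final StampedLock sl = new StampedLock();
    private int a, b;                       // writer invariant: a == b outside the write lock

    void increment() {
        long stamp = sl.writeLock();
        try { a++; b++; } finally { sl.unlockWrite(stamp); }
    }

    // Returns true if the snapshot finally used satisfies a == b.
    boolean readIsConsistent() {
        long stamp = sl.tryOptimisticRead();
        int curA = a, curB = b;
        if (!sl.validate(stamp)) {          // writer interfered: re-read under a read lock
            stamp = sl.readLock();
            try { curA = a; curB = b; } finally { sl.unlockRead(stamp); }
        }
        return curA == curB;
    }

    // Runs one writer thread against the reading caller and counts inconsistent snapshots.
    static int countViolations(int writes) {
        SnapshotConsistencyDemo d = new SnapshotConsistencyDemo();
        Thread writer = new Thread(() -> { for (int i = 0; i < writes; i++) d.increment(); });
        writer.start();
        int violations = 0;
        while (writer.isAlive()) {
            if (!d.readIsConsistent()) violations++;
        }
        try { writer.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(countViolations(100_000)); // expected to print 0
    }
}
```

Removing the validate() check (and using curA and curB unconditionally) would allow the reader to observe a != b while the writer is between its two increments.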
StampedLock vs ReadWriteLock
| Feature | ReadWriteLock | StampedLock |
|---|---|---|
| Based on | AQS (32-bit int) | Custom (64-bit long) |
| Read cost (no contention) | ~20-40 ns (CAS) | ~2-4 ns (optimistic) |
| Optimistic reads | No | Yes |
| Reentrant | Yes | No |
| Condition support | Yes (write lock) | No |
| Lock upgrade (read → write) | No (deadlocks) | tryConvertToWriteLock (sole reader only) |
| Lock downgrade (write → read) | Yes | tryConvertToReadLock (always succeeds) |
| Max readers | 65535 (16 bits) | 126 in state + overflow counter |
| Fair mode | Yes | No |
| Wait queue | AQS CLH queue (Node) | Custom linked list (WNode) |
| Stamp validation | N/A | validate(stamp) |
| Cache line bouncing on read | Yes (CAS writes state) | No (optimistic = volatile read only) |
Pitfalls
1. Not Reentrant --- Deadlock Risk
StampedLock has no owner tracking (exclusiveOwnerThread) and no per-thread hold count. It doesn't know who holds the lock --- just how many readers and whether a writer is active.
Write lock reentry --- always deadlocks:
java
long stamp = sl.writeLock();
long stamp2 = sl.writeLock(); // DEADLOCK --- blocks forever
// writeLock() sees WBIT set → CAS fails → slow path → parks → no one will release
Read lock reentry --- deadlocks if a writer is waiting:
java
long stamp = sl.readLock();
// A writer arrives and parks (waiting for readers to release)
long stamp2 = sl.readLock(); // DEADLOCK
// slow path parks behind writer → writer waits for us → we wait for writer
The second readLock() may succeed if no writer is waiting (the fast path CAS just increments the reader count). But if a writer is queued, the slow path parks the reader behind the writer --- deadlock.
Why no reentrancy? Adding owner tracking would require a ThreadLocal or exclusiveOwnerThread field --- overhead that conflicts with StampedLock's "maximum performance" design goal. If you need reentrancy, use ReentrantReadWriteLock.
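Non-reentrancy can be demonstrated without hanging a thread by using the non-blocking try methods, which return 0 instead of parking. A minimal sketch with illustrative names:

```java
import java.util.concurrent.locks.StampedLock;

class ReentrancyProbe {
    // Returns {writeRefused, readRefused} while the same thread already holds the write lock.
    static boolean[] probe() {
        StampedLock sl = new StampedLock();
        long w = sl.writeLock();
        try {
            // The blocking writeLock()/readLock() calls would park forever at this point.
            boolean writeRefused = sl.tryWriteLock() == 0L;
            boolean readRefused = sl.tryReadLock() == 0L;
            return new boolean[] { writeRefused, readRefused };
        } finally {
            sl.unlockWrite(w);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        System.out.println(result[0]); // prints true
        System.out.println(result[1]); // prints true
    }
}
```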
2. Always Validate + Fallback for Optimistic Reads
Values read before validate() may be garbage (partially written by a concurrent writer):
java
long stamp = sl.tryOptimisticRead();
int x = this.x; // might be stale or torn
int y = this.y; // might be inconsistent with x
if (!sl.validate(stamp)) {
// x and y are UNRELIABLE --- must re-read under lock
stamp = sl.readLock();
try { x = this.x; y = this.y; }
finally { sl.unlockRead(stamp); }
}
// Only use x, y AFTER successful validate or under lock
3. Don't Use Values Until validate() Succeeds
java
// WRONG --- using x before validation
long stamp = sl.tryOptimisticRead();
int x = this.x;
doSomethingWith(x); // x might be garbage!
sl.validate(stamp);
// RIGHT --- validate first, then use
long stamp = sl.tryOptimisticRead();
int x = this.x;
int y = this.y;
if (sl.validate(stamp)) {
doSomethingWith(x, y); // safe
}
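One way to make the validate-then-use rule hard to break is to route every optimistic read through a single helper. OptimisticReads below is a hypothetical utility, not part of the JDK; it assumes the supplied reader has no side effects and tolerates inconsistent data, because the reader may run once without any lock held.

```java
import java.util.concurrent.locks.StampedLock;
import java.util.function.Supplier;

// Hypothetical helper (an assumption for this sketch).
final class OptimisticReads {
    private OptimisticReads() {}

    // Tries one optimistic read; falls back to a read lock if a writer interfered.
    static <T> T read(StampedLock sl, Supplier<T> reader) {
        long stamp = sl.tryOptimisticRead();
        if (stamp != 0L) {                  // 0 means a writer currently holds the lock
            T result = reader.get();        // unvalidated at this point
            if (sl.validate(stamp)) {
                return result;              // safe to hand out only after validation
            }
        }
        stamp = sl.readLock();
        try {
            return reader.get();            // re-read under the read lock
        } finally {
            sl.unlockRead(stamp);
        }
    }

    public static void main(String[] args) {
        StampedLock sl = new StampedLock();
        int[] pair = { 1, 2 };
        int sum = read(sl, () -> pair[0] + pair[1]);
        System.out.println(sum);            // prints 3
    }
}
```

The caller never sees a value that has not been validated or read under the lock, at the cost of one lambda invocation per read.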
4. Not AutoCloseable --- Must Use try-finally
Stamps are long values, not objects. Can't use try-with-resources:
java
// Can't do this:
try (var lock = sl.readLock()) { } // WRONG --- readLock() returns long
// Must do this:
long stamp = sl.readLock();
try { /* ... */ }
finally { sl.unlockRead(stamp); }
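If try-with-resources is preferred, the stamp can be wrapped in a small AutoCloseable. HeldLock is a hypothetical wrapper written for this sketch; it allocates one object per acquisition and does not support lock conversion, so it trades some of StampedLock's performance for convenience.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical wrapper (not a JDK class): pairs a lock with the stamp that releases it.
final class HeldLock implements AutoCloseable {
    private final StampedLock lock;
    private final long stamp;

    private HeldLock(StampedLock lock, long stamp) {
        this.lock = lock;
        this.stamp = stamp;
    }

    static HeldLock read(StampedLock lock)  { return new HeldLock(lock, lock.readLock()); }
    static HeldLock write(StampedLock lock) { return new HeldLock(lock, lock.writeLock()); }

    @Override
    public void close() {
        lock.unlock(stamp);                 // unlock(stamp) releases either mode
    }

    public static void main(String[] args) {
        StampedLock sl = new StampedLock();
        try (HeldLock held = HeldLock.write(sl)) {
            System.out.println(sl.isWriteLocked()); // prints true
        }
        System.out.println(sl.isWriteLocked());     // prints false
    }
}
```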
5. Wrong Stamp = IllegalMonitorStateException
Each unlock method validates the stamp. Using the wrong stamp (e.g., a read stamp with unlockWrite) throws:
java
long stamp = sl.readLock();
sl.unlockWrite(stamp); // IllegalMonitorStateException --- stamp doesn't have WBIT
// Use sl.unlock(stamp) if you're unsure which mode --- it checks the stamp bits
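The failure mode is easy to reproduce. In this sketch (names are illustrative), a read stamp is passed to unlockWrite, the exception is caught, and the still-held read lock is then released with the generic unlock.

```java
import java.util.concurrent.locks.StampedLock;

class WrongStampDemo {
    // Returns true if unlockWrite rejects a read stamp.
    static boolean readStampRejectedByUnlockWrite() {
        StampedLock sl = new StampedLock();
        long stamp = sl.readLock();
        try {
            sl.unlockWrite(stamp);          // wrong mode: the stamp has no WBIT
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } finally {
            sl.unlock(stamp);               // the read lock is still held, so release it properly
        }
    }

    public static void main(String[] args) {
        System.out.println(readStampRejectedByUnlockWrite()); // prints true
    }
}
```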
6. Optimistic Reads Can See Torn Values
The Java Language Specification does not guarantee atomicity for non-volatile long and double accesses, and on 32-bit JVMs they may be performed as two 32-bit halves. An optimistic read could therefore see a partially written 64-bit value. validate() catches this (the stamp will have changed), but the garbage value could cause issues if used before validation (e.g., as an array index → ArrayIndexOutOfBoundsException). Always validate before using any optimistically-read value.
When to Use StampedLock
Use StampedLock when:
├── Read-heavy workload (>90% reads)
├── Reads are short (few field accesses)
├── Don't need reentrancy
├── Don't need conditions
├── Don't need fairness (writer starvation is acceptable)
└── Performance is critical
Use ReadWriteLock when:
├── Need reentrancy
├── Need condition variables
├── Need fairness (writer starvation must be prevented)
├── Reads are long (complex computation under lock)
└── Simpler API preferred
Fairness Comparison
StampedLock RRWL (non-fair) RRWL (fair)
Reader fairness None Partial Full FIFO
Writer starvation Possible Mitigated Impossible
Read throughput Highest High Lower
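StampedLock makes no fairness promise: its class documentation only says that the scheduling policy does not consistently prefer readers over writers or vice versa, and the exact barging behavior differs between JDK releases. In the simplified fast path shown earlier, readLock() CAS-es without checking the wait queue, so a new reader can barge in even if a writer is queued. ReentrantReadWriteLock non-fair mode at least checks apparentlyFirstQueuedIsExclusive() to yield to a writer at the head. Fair mode enforces strict FIFO.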