Guava Cache Deep Dive

1. Overview

Guava Cache is Google Guava's in-process local cache. Its design is modeled on the segment-lock scheme of the JDK 7 ConcurrentHashMap. It provides auto-loading, size limits, expiration, and reference-based eviction. It is no longer actively evolving; the Guava documentation recommends Caffeine as its successor.

Core positioning: In-process synchronous local cache, favoring feature completeness over extreme performance.

Architecture (one sentence) : LoadingCache → LocalCache → Segment[] (each a ReentrantLock with its own hash table, write queue, access queue, and recency queue).

┌─────────────────────────────────────────────────┐
│                  LoadingCache                     │
├─────────────────────────────────────────────────┤
│  CacheBuilder (Configuration)                    │
│    ├── maximumSize / maximumWeight               │
│    ├── expireAfterWrite / expireAfterAccess      │
│    ├── refreshAfterWrite                         │
│    ├── concurrencyLevel (segments, default 4)    │
│    ├── weakKeys / weakValues / softValues        │
│    └── removalListener                           │
├─────────────────────────────────────────────────┤
│  LocalCache (Core impl, extends AbstractMap)     │
│    ├── Segment[] segments (2^n segments)         │
│    │     ├── AtomicReferenceArray<E> table       │
│    │     ├── writeQueue (DLL, write-order)       │
│    │     ├── accessQueue (DLL, access-order/LRU) │
│    │     ├── recencyQueue (lock-free CAS buffer) │
│    │     └── ReferenceEntry (linked list node)   │
│    ├── CacheLoader (auto-load on miss)           │
│    └── RemovalListener (eviction callback)       │
└─────────────────────────────────────────────────┘

Part I: API & Usage

2. CacheLoader Pattern

CacheLoader is the core abstraction for auto-populating a LoadingCache. On cache miss, the loader fetches the value automatically --- only one thread loads per key.

LoadingCache<String, List<WidgetInfo>> cache = CacheBuilder.newBuilder()
    .maximumSize(1000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build(new CacheLoader<String, List<WidgetInfo>>() {
        @Override
        public List<WidgetInfo> load(String key) {
            return dao.queryWidgets(key); // Called on cache miss
        }
    });

// Usage --- automatically calls load() on miss
List<WidgetInfo> widgets = cache.get(templateKey);

`get(key)` → `load(key)` Flow

caller calls cache.get("templateKey")
        │
        ▼
   ┌─────────────┐
   │ Cache lookup │
   └─────┬───────┘
         │
    ┌────┴────┐
    │ Hit?    │
    ├── Yes ──┼──► Return cached value immediately
    │         │
    └── No ───┘
         │
         ▼
   ┌──────────────────┐
   │ CacheLoader.load │  ← Called automatically on miss
   │   (key)          │    Only ONE thread loads per key
   └────────┬─────────┘    (other threads wait)
            │
            ▼
   ┌──────────────────┐
   │ Store in cache   │
   │ Return to caller │
   └──────────────────┘

`get` vs `getUnchecked`

Method	Checked Exception	Use When
`get(key)`	Throws `ExecutionException` --- caller must handle	`load()` can throw checked exceptions
`getUnchecked(key)`	Wraps in `UncheckedExecutionException`	`load()` only throws unchecked exceptions

// get() --- must handle ExecutionException
try {
    List<WidgetInfo> widgets = cache.get(templateKey);
} catch (ExecutionException e) {
    log.error("Cache load failed for {}", templateKey, e);
}

// getUnchecked() --- cleaner when load() won't throw checked exceptions
List<WidgetInfo> widgets = cache.getUnchecked(templateKey);

Key behaviors:

load(key) is called once per missing key (thread-safe, no thundering herd)
loadAll(keys) for batch loading (override CacheLoader.loadAll())
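refresh(key) reloads the value while readers keep getting the old one; the reload is asynchronous only if CacheLoader.reload() is overridden to run on an executor (the default reload() calls load() on the refreshing thread)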

Writing Data to Cache

Three ways to store data:

// 1. Automatic --- load() return value is cached on miss (most common)
cache.get("key");  // calls load("key") if absent, stores result

// 2. Manual put --- bypass CacheLoader, directly insert
cache.put("key", value);

// 3. Invalidate or refresh: force a reload
cache.invalidate("key");       // remove single entry; next get() blocks on load()
cache.refresh("key");          // reload via reload(); old value served meanwhile (async only if reload() is overridden)
cache.invalidateAll();         // clear entire cache

Concurrency: One Thread Per Key

LoadingCache guarantees only one thread calls load() per key. This is built-in --- no extra config needed.

Thread A: get("key1") → miss → takes segment lock → installs loading placeholder → releases lock → calls load("key1")
Thread B: get("key1") → finds the placeholder (isLoading) → BLOCKS in waitForValue(), not on the segment lock
Thread C: get("key2") → miss → installs its own placeholder → calls load("key2") concurrently, even in the same segment

Thread A: load() returns → stores value → completes the placeholder
Thread B: wakes up → receives A's value → returns (no load() call)

Internally, two-level locking prevents thundering herd:

// Simplified pseudocode of LocalCache.get() (see Section 10 for full details).
// Helper names such as getOrCreateEntry() are illustrative, not Guava methods.
V get(K key) {
    Segment segment = segmentFor(hash(key));

    V value = segment.getLiveValue(key);     // lock-free fast path
    if (value != null) return value;

    ReferenceEntry entry;
    LoadingValueReference loading;
    segment.lock();                          // cache miss: acquire segment lock
    try {
        value = segment.getLiveValue(key);   // double-check after lock
        if (value != null) return value;

        // (Omitted: if another thread's placeholder is already installed,
        //  unlock and wait on it instead of loading.)
        entry = segment.getOrCreateEntry(key);
        loading = new LoadingValueReference<>();
        entry.setValueReference(loading);    // placeholder marks "I'm the loader"
    } finally {
        segment.unlock();                    // RELEASE segment lock BEFORE loading
    }

    // Loading runs under the per-entry monitor, NOT the segment lock
    synchronized (entry) {
        value = loader.load(key);            // only this thread loads
        loading.set(value);                  // publish the value and wake same-key waiters
        return value;
    }
}

Why two levels? If load() ran under the segment lock, ALL other keys in the same segment would be blocked during loading (potentially hundreds of ms for a DB call). By releasing the segment lock first, other keys remain accessible. Only threads requesting the same key block (via LoadingValueReference.waitForValue()).
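The placeholder idea can be tried out with JDK classes alone. The following is a minimal sketch, not Guava's implementation: `SingleFlightCache` and its methods are hypothetical names, a `CompletableFuture` stands in for LoadingValueReference, and `putIfAbsent` stands in for the brief segment-lock section. The slow loader call runs outside any map-wide lock, and only same-key callers wait.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration of the placeholder technique; not Guava's LocalCache. */
final class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> slots = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> existing = slots.get(key);   // fast path: no placeholder allocated
        if (existing == null) {
            CompletableFuture<V> mine = new CompletableFuture<>();
            existing = slots.putIfAbsent(key, mine);      // short critical section: install placeholder only
            if (existing == null) {
                return loadInto(key, mine);               // this thread won the race: it is the loader
            }
        }
        return existing.join();                           // another thread loads (or has loaded): wait for it
    }

    private V loadInto(K key, CompletableFuture<V> mine) {
        try {
            V value = loader.apply(key);                  // slow call runs outside any map lock
            mine.complete(value);                         // publish and wake same-key waiters
            return value;
        } catch (RuntimeException e) {
            slots.remove(key, mine);                      // let a later call retry
            mine.completeExceptionally(e);                // waiters see the failure too
            throw e;
        }
    }

    /** Demo: many threads request one key; returns how many times the loader ran. */
    static int loadsForConcurrentGets(int threads) {
        AtomicInteger loads = new AtomicInteger();
        SingleFlightCache<String, String> cache = new SingleFlightCache<>(key -> {
            loads.incrementAndGet();
            try {
                Thread.sleep(50);                         // simulate a slow DB call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "value-of-" + key;
        });
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                cache.get("key1");
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return loads.get();
    }
}
```

Calling `SingleFlightCache.loadsForConcurrentGets(8)` should report a single loader invocation however the threads interleave, because only the thread whose `putIfAbsent` succeeds runs the loader.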

Tuning: CacheBuilder.newBuilder().concurrencyLevel(N) controls segment count (default 4). More segments = less contention, more memory.

3. Configuration Quick Reference

CacheBuilder.newBuilder()
    .maximumSize(10_000)              // Size cap (mutually exclusive with maximumWeight)
    .expireAfterWrite(10, TimeUnit.MINUTES)   // Hard TTL
    .expireAfterAccess(5, TimeUnit.MINUTES)   // Idle TTL
    .refreshAfterWrite(1, TimeUnit.MINUTES)   // Refresh on read after 1 min (override reload() to make it async)
    .concurrencyLevel(16)             // Segment count (power of 2)
    .recordStats()                    // Enable hit/miss metrics
    .removalListener(RemovalListeners.asynchronous(listener, executor))
    .build(loader);

4. Refresh vs Expire

Dimension	expireAfterWrite	expireAfterAccess	refreshAfterWrite
Timer resets on	Write only	Any access (read or write)	Write only
Purpose	Data freshness (hard TTL)	Memory reclamation (idle eviction)	Non-blocking data freshness
Old value handling	Deleted, blocks on reload	Deleted, blocks on reload	Retained, async refresh
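Concurrency	One thread loads, others wait	One thread loads, others wait	One thread triggers the refresh, others return the old value
Blocking	Yes	Yes	Not for other readers; the triggering thread blocks unless reload() is asynchronous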

expireAfterWrite = data freshness guarantee. Entry dies after N minutes regardless of reads.

expireAfterAccess = memory strategy. Entry dies after N minutes of no access. Reads reset the timer.

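refreshAfterWrite = non-blocking freshness. The old value is served while the refresh runs. The refresh is triggered lazily by the first read after the interval; with the default reload(), that reading thread performs the load itself, so override reload() to move the work to a background executor.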

Both expire policies cause entry removal. With LoadingCache, removal triggers a reload on next get(). The difference is what resets the countdown.

Golden rule : Always configure both, with expireAfterWrite > refreshAfterWrite.

.refreshAfterWrite(1, TimeUnit.MINUTES)    // Non-blocking refresh
.expireAfterWrite(10, TimeUnit.MINUTES)    // Hard expiry safety net

Common misuse : Configuring only refreshAfterWrite without expireAfterWrite means values are never deleted (only refreshed). If the loader consistently returns null or throws exceptions, old values persist forever.
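To make the interaction of the two timers concrete, here is a small JDK-only sketch of the decision a read makes from an entry's age. `FreshnessPolicy` and `Action` are hypothetical names for illustration; the comparisons mirror the `>=` expiry check and the `>` refresh check shown in the internals sections, but this is not Guava code.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of how refreshAfterWrite and expireAfterWrite combine on a read. */
final class FreshnessPolicy {
    enum Action { SERVE, SERVE_STALE_AND_REFRESH, BLOCK_AND_RELOAD }

    private final long refreshNanos;
    private final long expireNanos;

    FreshnessPolicy(long refreshAfterWrite, long expireAfterWrite, TimeUnit unit) {
        if (expireAfterWrite <= refreshAfterWrite) {
            // Otherwise entries expire before a refresh can ever be triggered.
            throw new IllegalArgumentException("expireAfterWrite should exceed refreshAfterWrite");
        }
        this.refreshNanos = unit.toNanos(refreshAfterWrite);
        this.expireNanos = unit.toNanos(expireAfterWrite);
    }

    Action onRead(long writeTimeNanos, long nowNanos) {
        long age = nowNanos - writeTimeNanos;
        if (age >= expireNanos) {
            return Action.BLOCK_AND_RELOAD;          // entry is gone: caller waits for load()
        }
        if (age > refreshNanos) {
            return Action.SERVE_STALE_AND_REFRESH;   // old value returned, reload kicked off
        }
        return Action.SERVE;                         // fresh enough
    }

    public static void main(String[] args) {
        FreshnessPolicy policy = new FreshnessPolicy(1, 10, TimeUnit.MINUTES);
        long minute = TimeUnit.MINUTES.toNanos(1);
        System.out.println(policy.onRead(0, 5 * minute));
    }
}
```

With a 1-minute refresh and a 10-minute expiry, an entry read at age 5 minutes is served stale while a refresh starts; only an entry that nobody read (or that failed to refresh) for 10 minutes forces a blocking reload.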

When to Use `refreshAfterWrite`

Use when ALL of these are true:

Latency-sensitive hot path (P99 matters)
High-QPS keys (blocking would cause thread pile-up)
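Slight staleness is acceptable (a refresh is only triggered by a read, so a rarely read entry can be older than the refreshAfterWrite duration, bounded by expireAfterWrite if configured)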
Load is expensive (>50ms from DB/remote service)

When NOT to Use

Data must be strictly fresh (security tokens, permissions)
Low-QPS keys (blocking one thread is fine)
Loader can fail persistently (stale value persists forever without expire)

`expireAfterWrite` vs `expireAfterAccess`

	`expireAfterWrite`	`expireAfterAccess`
Timer resets on	Write (put/load) only	Any access (read or write)
Purpose	Data freshness guarantee	Memory reclamation for idle keys
Behavior after expiry	Entry removed, next `get()` blocks on `load()`	Same

expireAfterAccess is primarily a memory strategy (evict keys nobody reads). expireAfterWrite is a freshness strategy (guarantee max data age). Both cause entry removal; with LoadingCache, removal triggers reload on next access.

Async Refresh Example

LoadingCache<K, V> cache = CacheBuilder.newBuilder()
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    .build(new CacheLoader<K, V>() {
        @Override
        public V load(K key) { return loadFromDB(key); }

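        // Override reload() for async refresh. The default reload() calls load()
        // synchronously on the thread that triggered the refresh.
        // `executor` must be a ListeningExecutorService, e.g. created with
        // MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(4)).
        @Override
        public ListenableFuture<V> reload(K key, V oldValue) {
            return executor.submit(() -> loadFromDB(key));
        }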
    });

5. Reference Types

Configuration	Key	Value	Reclamation Timing	Semantic Change
Default	Strong ref	Strong ref	Only via eviction policy	-
`weakKeys()`	Weak ref	Strong ref	When key has no strong refs, GC reclaims	Keys compared with `==`
`weakValues()`	Strong ref	Weak ref	When value has no strong refs, GC reclaims	Values compared with `==`
`softValues()`	Strong ref	Soft ref	When JVM is low on memory	Values compared with `==`

Critical pitfall : With weakKeys(), Guava compares keys by identity (==) instead of equals(), so cache.get(new String("key")) may not find an entry stored under a string literal. weakValues() and softValues() switch value comparison to identity; key lookup still uses equals(), but value-based operations such as asMap().remove(key, value) are affected.
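The effect of identity-based key comparison is easy to reproduce without Guava. The sketch below uses the JDK's `IdentityHashMap` as an analogy for an identity-keyed cache; `IdentityLookupDemo` is a hypothetical name, and the example does not exercise Guava itself.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Analogy only: IdentityHashMap compares keys with ==, like an identity-keyed cache. */
final class IdentityLookupDemo {
    static String lookup(Map<String, String> map) {
        String stored = new String("key");   // the instance used when writing
        map.put(stored, "value");
        String probe = new String("key");    // equal content, different object
        return map.get(probe);
    }

    public static void main(String[] args) {
        System.out.println(lookup(new HashMap<>()));          // prints "value": equals()-based lookup
        System.out.println(lookup(new IdentityHashMap<>()));  // prints "null": ==-based lookup misses
    }
}
```

The practical consequence is that an identity-keyed cache only works when callers look entries up with the very same key instances they stored, for example canonical objects held elsewhere in the application.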

Production recommendation : Prefer deterministic eviction (maximumSize + expireAfterWrite) over GC-driven eviction (softValues).

// Preferred: deterministic eviction
CacheBuilder.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build(loader);

// Avoid: unpredictable GC-driven eviction
CacheBuilder.newBuilder()
    .softValues()  // When does it evict? Depends on heap pressure.
    .build(loader);

6. RemovalListener

Cache<K, V> cache = CacheBuilder.newBuilder()
    .removalListener(notification -> {
        // cause: SIZE / EXPIRED / EXPLICIT / REPLACED / COLLECTED
        log.info("Removed: {} cause={}", notification.getKey(), notification.getCause());
    })
    .build();

RemovalCause descriptions:

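EXPLICIT: User manually removed the entry (invalidate(), invalidateAll(), or asMap().remove())
REPLACED: put() or a reload replaced the value of an existing key (the listener receives the old value)
COLLECTED: Weak/soft reference reclaimed by GC
EXPIRED: expireAfterWrite or expireAfterAccess elapsed
SIZE: Evicted because maximumSize or maximumWeight was exceeded (approximate LRU order)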

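Execution timing pitfall : By default the listener runs synchronously on whichever thread happens to perform cache maintenance (a put(), or a get() that triggers cleanup). Notifications are queued while the Segment lock is held and delivered after it is released, so a slow listener does not stall the whole segment, but it does add its latency to that caller's cache operation. If the listener is slow or blocks (e.g., an HTTP request), wrap it with RemovalListeners.asynchronous():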

.removalListener(RemovalListeners.asynchronous(myListener, executor))

7. Statistics

Cache<K, V> cache = CacheBuilder.newBuilder()
    .recordStats()  // Must be enabled
    .build();

CacheStats stats = cache.stats();
stats.hitRate();            // Hit rate
stats.missRate();           // Miss rate
stats.hitCount();           // Hit count
stats.missCount();          // Miss count
stats.loadSuccessCount();   // Successful load count
stats.loadExceptionCount(); // Load exception count
stats.totalLoadTime();      // Total load time (ns)
stats.averageLoadPenalty(); // Average load time (ns)
stats.evictionCount();      // Eviction count

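Note : recordStats() adds some overhead because every operation updates shared counters. The roughly 5-10% figure quoted for it is workload-dependent, so measure on your own hot path; it is usually acceptable in production.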

Where to check metrics : Guava has no built-in dashboard. Access via cache.stats() programmatically, then bridge to your observability stack (periodic logging, CloudWatch, JMX). Stats are cumulative since cache creation --- use current.minus(previous) for rate-based metrics over an interval.
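Because the counters are cumulative, an interval hit rate has to be computed from the difference of two snapshots. The sketch below models that arithmetic with a hypothetical `StatsSnapshot` class that only tracks hits and misses; with Guava you would take two `cache.stats()` snapshots and call `minus()` on the real `CacheStats` objects instead.

```java
/** Hypothetical two-counter stand-in for CacheStats, to show the interval arithmetic. */
final class StatsSnapshot {
    final long hits;
    final long misses;

    StatsSnapshot(long hits, long misses) {
        this.hits = hits;
        this.misses = misses;
    }

    /** Difference between this (later) snapshot and an earlier one, floored at zero. */
    StatsSnapshot minus(StatsSnapshot earlier) {
        return new StatsSnapshot(
                Math.max(0, hits - earlier.hits),
                Math.max(0, misses - earlier.misses));
    }

    /** Hit ratio; defined as 1.0 when there were no requests. */
    double hitRate() {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        StatsSnapshot previous = new StatsSnapshot(900, 100);  // at the start of the interval
        StatsSnapshot current = new StatsSnapshot(950, 150);   // at the end of the interval
        System.out.println("cumulative: " + current.hitRate());
        System.out.println("interval:   " + current.minus(previous).hitRate());
    }
}
```

In this made-up example the cumulative rate still looks healthy (about 0.86) while the last interval dropped to 0.5, which is the kind of regression a cumulative number hides.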

8. Caching Alternatives Comparison

Approach	TTL	Max Size	Auto-Load	Thread-Safe	Eviction
`ConcurrentHashMap`	❌	❌	❌	✅	❌
`ConcurrentHashMap` + manual TTL	Manual	Manual	Manual	✅	Manual
Guava `Cache` (no loader)	✅	✅	❌	✅	LRU
Guava `LoadingCache`	✅	✅	✅	✅	LRU
Caffeine	✅	✅	✅	✅	W-TinyLFU

ConcurrentHashMap Comparison (Java 8+)

ConcurrentHashMap also guarantees one-thread-per-key via computeIfAbsent, but lacks TTL and eviction.

Since Java 8, ConcurrentHashMap replaced the old Segment[] lock-striping with a new structure:

Java 7 (old):  Segment[16] → each Segment has its own HashEntry[] → linked list
Java 8+ (new): Node[] array → per-bin CAS + synchronized → linked list OR red-black tree

Java 8+ internal structure:

ConcurrentHashMap
  └── Node[] table (flat array, no segments)
        ├── bin[0]: null
        ├── bin[1]: Node → Node → Node (linked list, ≤8 nodes)
        ├── bin[2]: TreeBin → TreeNode (red-black tree, >8 nodes)
        └── bin[N]: ...

Key changes from Java 7 → 8:

No more Segment[] --- locking is per-bin (synchronized on first node of each bin)
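Linked list treeifies to a red-black tree when a bin grows past 8 nodes and the table has at least 64 bins (smaller tables are resized instead); a tree bin converts back to a list when it shrinks to 6 nodes or fewer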
get() is lock-free using volatile reads
computeIfAbsent() locks only the specific bin, not a segment

ConcurrentHashMap<String, List<WidgetInfo>> map = new ConcurrentHashMap<>();
map.computeIfAbsent("key", k -> dao.queryWidgets(k));  // one thread loads per key

Note: Guava's LoadingCache still uses the older segment-based design internally (its LocalCache predates Java 8). But the API behavior is the same --- one thread loads per key, others wait.

Part II: Internals

9. Data Structures

9.1 Segment

// Similar to JDK7 ConcurrentHashMap's segment-lock design
// Each Segment itself is a ReentrantLock
static class Segment<K, V> extends ReentrantLock {
    // Hash table (hash slot array, each slot is a linked list head)
    volatile AtomicReferenceArray<ReferenceEntry<K, V>> table;

    // Element count in this segment
    volatile int count;

    // Write-order queue (for expireAfterWrite)
    final Queue<ReferenceEntry<K, V>> writeQueue;

    // Access-order queue (for expireAfterAccess and LRU eviction)
    final Queue<ReferenceEntry<K, V>> accessQueue;

    // Batch access event recording (recorded outside lock, batch-processed later)
    final Queue<ReferenceEntry<K, V>> recencyQueue;

    // Read operation counter (triggers cleanUp)
    final AtomicInteger readCount = new AtomicInteger();

    // Total weight (for maximumWeight)
    long totalWeight;
}

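Segment count calculation : concurrencyLevel rounded up to the next power of 2, default 4 (with size-based eviction, Guava may use fewer segments so that each one is not too small). More segments = higher write concurrency but more memory overhead. Rule of thumb: the expected number of threads that update the cache concurrently.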
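The rounding and the hash-to-segment mapping can be sketched in a few lines. This is an illustration under assumptions: `SegmentMath` is a hypothetical name, the weight-based cap on the segment count is ignored, and the index is taken from the high bits of the hash in the style of the JDK 7 ConcurrentHashMap.

```java
/** Hypothetical helper showing power-of-two segment sizing and segment selection. */
final class SegmentMath {
    /** Smallest power of two that is >= concurrencyLevel. */
    static int segmentCount(int concurrencyLevel) {
        int count = 1;
        while (count < concurrencyLevel) {
            count <<= 1;
        }
        return count;
    }

    /** Chooses a segment from the high bits of the hash; low bits stay free for the bucket index. */
    static int segmentIndex(int hash, int segmentCount) {
        int bits = Integer.numberOfTrailingZeros(segmentCount);  // log2 of a power of two
        if (bits == 0) {
            return 0;                                            // single segment
        }
        int shift = 32 - bits;
        return (hash >>> shift) & (segmentCount - 1);
    }

    public static void main(String[] args) {
        System.out.println(segmentCount(5));                 // 8
        System.out.println(segmentIndex(0xC0000000, 4));     // 3 (top two bits are 11)
    }
}
```

Using the high bits for the segment and the low bits for the bucket keeps the two choices independent, so keys that land in the same segment still spread across its table.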

Why AtomicReferenceArray for the table? A plain ReferenceEntry[] has no per-element visibility guarantees across threads. AtomicReferenceArray gives per-slot volatile semantics --- table.get(i) always sees the latest table.set(i) from another thread. This enables lock-free reads on the hot path. Note: a volatile ReferenceEntry[] only makes the array reference volatile, not individual elements --- insufficient for a hash table.

9.2 ReferenceEntry Node

interface ReferenceEntry<K, V> {
    ValueReference<K, V> getValueReference();
    K getKey();                      // may return null once a weak key has been collected
    int getHash();
    ReferenceEntry<K, V> getNext();  // linked list within hash bucket

    // Access time (expireAfterAccess)
    long getAccessTime();
    void setAccessTime(long time);

    // Write time (expireAfterWrite)
    long getWriteTime();
    void setWriteTime(long time);

    // accessQueue doubly-linked list pointers
    ReferenceEntry<K, V> getNextInAccessQueue();
    ReferenceEntry<K, V> getPreviousInAccessQueue();
    // writeQueue doubly-linked list pointers
    ReferenceEntry<K, V> getNextInWriteQueue();
    ReferenceEntry<K, V> getPreviousInWriteQueue();
}

Note : Guava generates different Entry subclasses for different configuration combinations (whether access/write expiry is enabled, whether weak references are used, etc.) such as StrongAccessEntry, WeakWriteEntry, etc., to save memory on unused fields.

9.3 ValueReference

interface ValueReference<K, V> {
    V get();                    // Get value (may be null)
    V waitForValue();           // Block waiting for load to complete
    int getWeight();            // Weight
    ReferenceEntry<K, V> getEntry();
    boolean isLoading();        // Whether currently loading
    boolean isActive();         // Whether still an active reference
}

// Implementations:
// - StrongValueReference     Strong reference (default)
// - SoftValueReference       Soft reference (reclaimed before OOM)
// - WeakValueReference       Weak reference (reclaimed on GC)
// - LoadingValueReference    Placeholder during loading (for synchronous waiting)

Structure	Purpose
`Segment`	Lock granularity unit; holds hash table + LRU queues
`ReferenceEntry`	Node in hash bucket linked list + doubly-linked in access/write queues
`ValueReference`	Wraps value; supports strong/weak/soft refs + loading placeholder
`recencyQueue`	Lock-free buffer for read access events (drained in batch to accessQueue)
`accessQueue`	Doubly-linked list ordered by access time (LRU eviction source)
`writeQueue`	Doubly-linked list ordered by write time (expireAfterWrite source)

10. Cache Loading & Stampede Prevention

// Entry point: LocalCache.get()
V get(K key, CacheLoader<K, V> loader) {
    int hash = hash(key);
    Segment<K, V> segment = segmentFor(hash);
    return segment.get(key, hash, loader);
}

// Segment.get() simplified logic
V get(K key, int hash, CacheLoader<K, V> loader) {
    try {
        if (count != 0) {
            ReferenceEntry<K, V> e = getEntry(key, hash);
            if (e != null) {
                long now = ticker.read();
                V value = getLiveValue(e, now);  // Check expiration
                if (value != null) {
                    recordRead(e, now);           // Record access (into recencyQueue)
                    return scheduleRefresh(e, key, hash, value, now, loader);
                }
                // Hit but expired → if loading, wait
                ValueReference<K, V> valueRef = e.getValueReference();
                if (valueRef.isLoading()) {
                    return waitForLoadingValue(e, key, valueRef);
                }
            }
        }
        // Cache miss → lock and load
        return lockedGetOrLoad(key, hash, loader);
    } finally {
        postReadCleanup();  // Trigger cleanUp every 64 reads
    }
}

Concurrency Control (lockedGetOrLoad)

V lockedGetOrLoad(K key, int hash, CacheLoader<K, V> loader) {
    ReferenceEntry<K, V> e;
    ValueReference<K, V> loadingValueReference = null;
    boolean createNewEntry = true;  // Default: this thread will be the loader

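    lock();
    try {
        long now = map.ticker.read();
        AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
        int index = hash & (table.length() - 1);
        ReferenceEntry<K, V> first = table.get(index);   // head of the bucket chain

        // Double-check: has another thread already loaded?
        e = getEntry(key, hash);
        if (e != null) {
            V value = getLiveValue(e, now);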
            if (value != null) return value;

            // If another thread is loading, current thread waits
            ValueReference<K, V> valueRef = e.getValueReference();
            if (valueRef.isLoading()) {
                loadingValueReference = valueRef;
                createNewEntry = false;  // Another thread is already loading
            }
        }
        // createNewEntry remains true when:
        //   1. Entry doesn't exist (e == null) --- key never cached
        //   2. Entry exists but expired/dead AND no other thread is loading it

        if (createNewEntry) {
            loadingValueReference = new LoadingValueReference<>();
            if (e == null) {
                // Brand new key: create entry and insert into hash table
                e = newEntry(key, hash, first);
                e.setValueReference(loadingValueReference);
                table.set(index, e);  // Insert into hash table
            } else {
                // Existing expired entry: replace its value reference with loading placeholder
                e.setValueReference(loadingValueReference);
            }
        }
    } finally {
        unlock();
        postWriteCleanup();
    }

    // === KEY: Loading executes OUTSIDE the lock to avoid long lock holding ===
    if (createNewEntry) {
        // This thread is the designated loader
        synchronized (e) {
            return loadSync(key, hash, loadingValueReference, loader);
        }
    }
    // Other threads (createNewEntry == false): wait for LoadingValueReference to complete
    return waitForLoadingValue(e, key, loadingValueReference);
}

Key points:

Only one thread executes loader.load(); other threads block via LoadingValueReference.waitForValue(), preventing cache stampede.
Loading executes outside the lock to avoid blocking other threads for extended periods.
If loading fails, LoadingValueReference notifies all waiters to throw an exception.

11. Expiration Policy (Lazy Expiration)

// Check if expired
boolean isExpired(ReferenceEntry<K, V> entry, long now) {
    if (expiresAfterAccess()
        && now - entry.getAccessTime() >= expireAfterAccessNanos) {
        return true;
    }
    if (expiresAfterWrite()
        && now - entry.getWriteTime() >= expireAfterWriteNanos) {
        return true;
    }
    return false;
}

Lazy expiration semantics:

Guava Cache has no background thread to clean expired entries.
Expired entries are cleaned on the next access or write operation (triggering cleanUp()).
If the cache has no access for a long time, expired entries will continue occupying memory.

For timely cleanup in production:

scheduler.scheduleWithFixedDelay(cache::cleanUp, 1, 1, TimeUnit.MINUTES);
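The gap between "logically expired" and "physically removed" is the part that tends to surprise people, so here is a tiny single-threaded model of it. `LazyExpiringMap` is a hypothetical class with a manually advanced clock; it is not thread-safe and is not how Guava stores entries, it only shows why a periodic cleanUp() call matters for an idle cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-threaded model of lazy expiration with a manual clock. */
final class LazyExpiringMap<K, V> {
    private static final class Timed<V> {
        final V value;
        final long writeTime;

        Timed(V value, long writeTime) {
            this.value = value;
            this.writeTime = writeTime;
        }
    }

    private final Map<K, Timed<V>> entries = new HashMap<>();
    private final long ttlNanos;
    private long now;                        // manual ticker for the demo

    LazyExpiringMap(long ttlNanos) {
        this.ttlNanos = ttlNanos;
    }

    void advance(long nanos) {
        now += nanos;
    }

    void put(K key, V value) {
        entries.put(key, new Timed<>(value, now));
    }

    V get(K key) {
        Timed<V> timed = entries.get(key);
        if (timed == null || isExpired(timed)) {
            return null;                     // expired entries are invisible but still stored
        }
        return timed.value;
    }

    int physicalSize() {
        return entries.size();               // counts expired-but-not-yet-removed entries too
    }

    void cleanUp() {
        entries.values().removeIf(this::isExpired);
    }

    private boolean isExpired(Timed<V> timed) {
        return now - timed.writeTime >= ttlNanos;
    }
}
```

After the TTL passes, `get()` already returns null while `physicalSize()` still counts the entry; only `cleanUp()` releases the memory, which is the job the scheduled task performs.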

12. Eviction Strategy (Approximate LRU)

// Size-based eviction
void evictEntries(ReferenceEntry<K, V> newest) {
    if (!evictsBySize()) return;

    drainRecencyQueue();  // First process accumulated access records

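    // If the newest entry alone outweighs the segment, evict just that entry
    if (newest.getValueReference().getWeight() > maxSegmentWeight) {
        if (!removeEntry(newest, newest.getHash(), RemovalCause.SIZE)) {
            throw new AssertionError();
        }
    }

    while (totalWeight > maxSegmentWeight) {
        ReferenceEntry<K, V> e = getNextEvictable();  // First positive-weight entry from the head of accessQueue
        if (!removeEntry(e, e.getHash(), RemovalCause.SIZE)) {
            throw new AssertionError();
        }
    }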
}

LRU implementation details:

accessQueue is a doubly-linked list: nodes move to the tail on access, eviction removes from the head.
Approximate LRU : Read operations first put access records into recencyQueue (lock-free), then batch-drain to accessQueue (requires lock).
Does not support LFU, FIFO, or other eviction strategies.

Why recencyQueue exists --- avoiding lock on every read:

Moving a node in the accessQueue (doubly-linked list) requires updating 4 pointers and is not thread-safe without a lock. If every get() acquired the Segment lock just to update LRU position, read throughput would collapse. The recencyQueue (ConcurrentLinkedQueue) acts as a lock-free buffer:

Read hot path (NO lock):
  1. Get value from hash table (volatile read, lock-free)
  2. recencyQueue.add(entry)  ← ConcurrentLinkedQueue.offer(), lock-free
  3. Return value

Later (WITH lock, batched during drainRecencyQueue()):
  while (entry = recencyQueue.poll()) {
      if (accessQueue.contains(entry))    // Guard: skip if evicted between buffer and drain
          accessQueue.add(entry);         // Re-adding unlinks the entry and re-links it at the tail
  }

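Amortized cost: roughly 28ns per read by the author's estimate (about 20ns for the lock-free offer plus the drain cost amortized across 64 reads); treat these as order-of-magnitude figures rather than measurements.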

Approximate LRU --- why it barely matters:

Coarse-level correctness preserved: Keys not accessed in this batch remain at the head as eviction candidates. All recently-accessed keys are near the tail, protected from eviction.
Error is bounded: Ordering error is limited to entries within the same drain batch (normally about 64 reads, more if a drain is skipped because the lock is busy). Entries from different batches maintain correct relative ordering.
Eviction targets are far from the tail: Eviction removes from the head --- entries that haven't been accessed for many batches.
Temporal locality in real workloads: Hot keys get re-accessed across many batches, accumulating at the tail regardless of per-batch ordering noise.
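The buffer-then-drain scheme described above can be modeled with JDK collections. In this sketch `BufferedLru` is a hypothetical class: a `ConcurrentLinkedQueue` plays the recencyQueue, a `LinkedHashSet` plays the accessQueue (head = coldest), and a `ReentrantLock` plays the Segment lock. It tracks ordering only, not values, and is not Guava's code.

```java
import java.util.LinkedHashSet;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of approximate LRU: lock-free read buffer, ordered list updated under a lock. */
final class BufferedLru<K> {
    private final ConcurrentLinkedQueue<K> recencyBuffer = new ConcurrentLinkedQueue<>();
    private final LinkedHashSet<K> accessOrder = new LinkedHashSet<>();   // iteration order: coldest first
    private final ReentrantLock lock = new ReentrantLock();

    /** Read path: no lock, just remember that the key was touched. */
    void recordRead(K key) {
        recencyBuffer.offer(key);
    }

    /** Write path: takes the lock, replays buffered reads, then marks the key as most recent. */
    void write(K key) {
        lock.lock();
        try {
            drain();
            accessOrder.remove(key);
            accessOrder.add(key);
        } finally {
            lock.unlock();
        }
    }

    /** Removes and returns the least recently used key, or null if empty. */
    K evictColdest() {
        lock.lock();
        try {
            drain();
            if (accessOrder.isEmpty()) {
                return null;
            }
            K coldest = accessOrder.iterator().next();
            accessOrder.remove(coldest);
            return coldest;
        } finally {
            lock.unlock();
        }
    }

    /** Caller must hold the lock. Keys evicted since they were buffered are skipped. */
    private void drain() {
        K key;
        while ((key = recencyBuffer.poll()) != null) {
            if (accessOrder.remove(key)) {
                accessOrder.add(key);      // move to the tail (most recent)
            }
        }
    }
}
```

For example, after writing a, b, c and then reading a, the next eviction removes b: the buffered read of a is replayed during the drain and moves a to the tail before the head is chosen.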

13. Cleanup Timing

Guava Cache cleanup is lazy, occurring at these times:

Trigger	When	What
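`preWriteCleanup()` / `postWriteCleanup()`	Every write	Locked pass before the write (drain reference queues, expire entries); removal notifications delivered after unlock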
`postReadCleanup()`	Every 64th read	Full `cleanUp()`
`cache.cleanUp()`	Manual	Full cleanup

static final int DRAIN_THRESHOLD = 0x3F;  // 63

void postReadCleanup() {
    if ((readCount.incrementAndGet() & DRAIN_THRESHOLD) == 0) {
        cleanUp();
    }
}

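void cleanUp() {
    long now = map.ticker.read();
    // 1. Under tryLock(): drainReferenceQueues() (weak/soft refs reclaimed by GC),
    //    then expireEntries(now), which first drains the recencyQueue
    runLockedCleanup(now);
    // 2. Outside the lock: deliver queued removal notifications to the listener
    runUnlockedCleanup();
    // Size-based eviction is not part of cleanUp(); evictEntries() runs on the write path
}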

readCount.incrementAndGet() is an atomic CAS-based increment (~5-20ns)
& 0x3F is a fast modulo-64 check (bitwise AND vs division)
Amortizes cleanup cost: 63 reads are free, 1 read pays for cleanup
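The bitmask trick is easy to verify in isolation. A minimal sketch, assuming a hypothetical `ReadThrottle` class that counts how often the cleanup branch would fire instead of actually cleaning anything:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo of the "every 64th read" check using a 0x3F mask. */
final class ReadThrottle {
    static final int DRAIN_THRESHOLD = 0x3F;           // 63 = 0b111111
    private final AtomicInteger readCount = new AtomicInteger();
    private final AtomicInteger cleanups = new AtomicInteger();

    void onRead() {
        // The low six bits are all zero exactly when the count is a multiple of 64.
        if ((readCount.incrementAndGet() & DRAIN_THRESHOLD) == 0) {
            cleanups.incrementAndGet();                // stand-in for cleanUp()
        }
    }

    static int cleanupsAfter(int reads) {
        ReadThrottle throttle = new ReadThrottle();
        for (int i = 0; i < reads; i++) {
            throttle.onRead();
        }
        return throttle.cleanups.get();
    }

    public static void main(String[] args) {
        System.out.println(cleanupsAfter(640));        // 10
    }
}
```

The mask only works because 64 is a power of two; for an arbitrary interval a modulo would be needed.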

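postWriteCleanup vs postReadCleanup : Every write runs the locked cleanup pass (preWriteCleanup(), while the write already holds the Segment lock) and then delivers pending removal notifications in postWriteCleanup() after unlocking. Reads amortize to every 64th. This is acceptable because writes are comparatively rare and already hold the lock, while reads are the hot path.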

tryLock() in runLockedCleanup() : Segment extends ReentrantLock, so tryLock() IS the Segment lock. It is non-blocking: if another thread holds the lock, cleanup is simply skipped (opportunistic). Cleanup will happen on the next write or 64th read.

What cleanup physically does : expireEntries() calls removeEntry() which unlinks the entry from the hash table (copy-on-write chain rebuild), accessQueue, and writeQueue, decrements weight/count. After removal, the entry/key/value objects have zero references from the cache → GC reclaims them at next collection.

14. Refresh Concurrency Internals

// refreshAfterWrite trigger condition
V scheduleRefresh(ReferenceEntry<K, V> entry, K key, int hash,
                  V oldValue, long now, CacheLoader<K, V> loader) {
    if (refreshes()
        && (now - entry.getWriteTime() > refreshNanos)
        && !entry.getValueReference().isLoading()) {

        V newValue = refresh(key, hash, loader, true);
        if (newValue != null) return newValue;
    }
    return oldValue;  // Return old value during refresh, non-blocking
}

How Other Threads Know Data Is Loading

The LoadingValueReference serves as both a flag and synchronization primitive:

static class LoadingValueReference<K, V> implements ValueReference<K, V> {
    final ValueReference<K, V> oldValue;           // Old value (for refresh)
    final SettableFuture<V> futureValue;           // Completes when load finishes

    public boolean isLoading() { return true; }    // Always true = "loading" marker
    public V get() { return oldValue.get(); }      // Returns old value during refresh
    public V waitForValue() {                      // Blocks until future resolves
        return Uninterruptibles.getUninterruptibly(futureValue);
    }
}

Refresh scenario (old value exists):

Thread 1: Detects refresh needed → installs LoadingValueReference → calls reload() → returns the old value if reload() is asynchronous (with the default synchronous reload() it returns the newly loaded value)
Thread 2: getLiveValue() returns old value via LoadingValueReference.get() → scheduleRefresh() sees isLoading()==true → skips refresh → returns old value immediately (NO blocking)

Cold miss scenario (no old value):

Thread 1: Installs LoadingValueReference → executes loader.load() → calls loadingRef.set(value)
Thread 2: Sees isLoading()==true → calls waitForLoadingValue() → blocks on futureValue.get() → wakes when Thread 1 completes

Key insight : Both refresh() and load() use the same LoadingValueReference as the "loading" marker. The difference is the oldValue field inside it --- during refresh, LoadingValueReference.get() returns the previous valid value (so other threads get stale data without blocking). During a cold miss, get() returns null (so other threads must block via waitForValue() using LockSupport.park()).
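The dual role of the placeholder (stale value for refresh, blocking wait for a cold miss) can be sketched with a `CompletableFuture`. `LoadingPlaceholder` is a hypothetical class written for this illustration, and its `read()` method condenses what a reader does on finding a placeholder; Guava's LoadingValueReference uses a SettableFuture and more bookkeeping.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a loading placeholder that may carry an old value. */
final class LoadingPlaceholder<V> {
    private final V oldValue;                                   // null on a cold miss
    private final CompletableFuture<V> future = new CompletableFuture<>();

    LoadingPlaceholder(V oldValue) {
        this.oldValue = oldValue;
    }

    boolean isLoading() {
        return true;                                            // the marker other threads check
    }

    V get() {
        return oldValue;                                        // never blocks
    }

    V waitForValue() {
        return future.join();                                   // blocks until set() is called
    }

    void set(V newValue) {
        future.complete(newValue);                              // wakes all waiters
    }

    /** What a reader does on finding this placeholder. */
    V read() {
        V stale = get();
        return stale != null ? stale : waitForValue();          // refresh: stale value; cold miss: wait
    }

    public static void main(String[] args) {
        LoadingPlaceholder<String> refresh = new LoadingPlaceholder<>("old");
        System.out.println(refresh.read());                     // prints "old" without waiting

        LoadingPlaceholder<String> cold = new LoadingPlaceholder<>(null);
        new Thread(() -> cold.set("new")).start();              // the loader thread finishing
        System.out.println(cold.read());                        // prints "new" once the load completes
    }
}
```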

Lock usage in refresh() : The Segment lock is briefly acquired to install LoadingValueReference (prevents two threads from both starting a refresh). The actual loader.reload() executes outside the lock. Pattern: lock → install placeholder → unlock → load → lock → store result → unlock.

Part III: Deep Dives

15. ConcurrentLinkedQueue (recencyQueue Implementation)

Lock-Free CAS Algorithm

ConcurrentLinkedQueue uses no locks --- all operations are CAS-based:

public boolean offer(E e) {
    final Node<E> newNode = new Node<>(e);
    for (Node<E> t = tail, p = t;;) {
        Node<E> q = p.next;
        if (q == null) {
            if (p.casNext(null, newNode)) {  // CAS: only one thread succeeds
                if (p != t) casTail(t, newNode);  // Lazy tail update
                return true;
            }
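        } else if (p == q) {
            // Self-linked node: p was dequeued, so restart from the new tail or from head
            p = (t != (t = tail)) ? t : head;
        } else {
            p = (p != t && t != (t = tail)) ? t : q;  // Advance past stale tail
        }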
    }
}

Three Key Techniques

CAS as atomic primitive: No blocking --- failed threads retry immediately
Lazy pointer updates : tail is updated lazily (only when 2+ nodes behind), reducing CAS contention on the tail field
Self-linking for safe removal : Dequeued nodes' next points to themselves, signaling traversers to restart from head

Why Guava Chose It for recencyQueue

offer() is lock-free → doesn't kill read throughput
Ordering precision doesn't matter (approximate LRU is acceptable)
poll() during drain is single-threaded (Segment lock held) → no dequeue contention
Queue stays short (about 64 entries between drains under a read-only load; writes drain it sooner)
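The usage pattern in that list, many lock-free producers followed by a single consumer draining, can be exercised directly against the JDK class. A minimal sketch, with `RecencyBufferDemo` as a hypothetical name:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

/** Hypothetical demo: concurrent offers without locks, then a single-threaded drain. */
final class RecencyBufferDemo {
    static int drainAfterConcurrentOffers(int producers, int offersEach) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        CountDownLatch done = new CountDownLatch(producers);
        for (int p = 0; p < producers; p++) {
            final int id = p;
            new Thread(() -> {
                for (int i = 0; i < offersEach; i++) {
                    queue.offer(id);               // lock-free; CAS retries on contention
                }
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int drained = 0;
        while (queue.poll() != null) {             // single consumer, as under the Segment lock
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(drainAfterConcurrentOffers(4, 1000));
    }
}
```

No offer is lost even though the producers never synchronize with each other, which is the property the read path relies on.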

16. Eviction & Cleanup Internals

`drainReferenceQueues()` --- Detecting GC'd Weak/Soft Entries

Polls the JVM's ReferenceQueue to discover which weak/soft-referenced keys or values have been garbage collected:

void drainKeyReferenceQueue() {
    Reference<? extends K> ref;
    int i = 0;
    while ((ref = keyReferenceQueue.poll()) != null) {  // Non-blocking poll
        ReferenceEntry<K, V> entry = (ReferenceEntry<K, V>) ref;
        map.reclaimKey(entry);  // → acquires Segment lock → removeEntry(COLLECTED)
        if (++i == DRAIN_MAX) break;  // Bound work per call
    }
}

How the ReferenceQueue gets populated (JVM-managed):

During entry creation: WeakReference<K> keyRef = new WeakReference<>(key, keyReferenceQueue)
GC collects the key → clears the WeakReference → enqueues it into keyReferenceQueue
Next cleanUp() → drainReferenceQueues() → polls the dead reference → removes entry

The casting trick : Guava's entry classes extend WeakReference AND implement ReferenceEntry:

static class WeakEntry<K, V> extends WeakReference<K>
                              implements ReferenceEntry<K, V> { ... }

So the enqueued reference IS the entry --- cast it back to access hash, value, and queue pointers.
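The same trick works with any `WeakReference` subclass. In the sketch below, `KeyedRefDemo` and `WeakKeyNode` are hypothetical names, and `enqueue()` is called by hand to simulate the collector's enqueue step deterministically (a real cache waits for the GC to do this).

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

/** Hypothetical demo: a reference subclass carries extra fields that survive the referent. */
final class KeyedRefDemo {
    static final class WeakKeyNode<K> extends WeakReference<K> {
        final int hash;                                  // still readable after the key is gone

        WeakKeyNode(K key, ReferenceQueue<K> queue) {
            super(key, queue);
            this.hash = key.hashCode();
        }
    }

    static int hashRecoveredFromQueue(Object key) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakKeyNode<Object> node = new WeakKeyNode<>(key, queue);
        node.enqueue();                                  // simulate the GC enqueueing the reference
        Reference<?> polled = queue.poll();              // non-blocking, like the drain loop
        if (polled != node) {
            throw new IllegalStateException("expected the node itself to come back from the queue");
        }
        WeakKeyNode<?> recovered = (WeakKeyNode<?>) polled;   // the cast back to the entry type
        return recovered.hash;
    }

    public static void main(String[] args) {
        System.out.println(hashRecoveredFromQueue("templateKey") == "templateKey".hashCode());
    }
}
```

Because the hash is stored on the reference object, the entry can be located in its bucket and removed even though the key itself can no longer be read.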

`drainRecencyQueue()`

void drainRecencyQueue() {
    ReferenceEntry<K, V> e;
    while ((e = recencyQueue.poll()) != null) {
        if (accessQueue.contains(e)) {  // Skip entries removed since they were buffered
            accessQueue.add(e);         // Re-adding unlinks the entry and re-links it at the tail
        }
    }
}

`getNextEvictable()`

Walks accessQueue from the head (least recently accessed) and returns the first entry with weight > 0. Zero-weight entries (loading placeholders, GC'd values) are skipped.

`removeEntry()`

Physically removes a cache entry from all data structures:


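// Simplified: the real method first scans the bucket to confirm the entry is still present.
// Caller holds the Segment lock.
boolean removeEntry(ReferenceEntry<K, V> entry, int hash, RemovalCause cause) {
    int index = hash & (table.length() - 1);
    ReferenceEntry<K, V> first = table.get(index);
    int newCount = this.count - 1;

    // 1. Unlink from hash table (copy-on-write chain rebuild)
    ReferenceEntry<K, V> newFirst = removeEntryFromChain(first, entry);
    table.set(index, newFirst);

    // 2. Queue the RemovalListener notification (delivered later, once the lock is released)
    enqueueNotification(entry.getKey(), ..., cause);

    // 3. Remove from access and write queues (O(1) DLL unlink)
    accessQueue.remove(entry);
    writeQueue.remove(entry);

    // 4. Update weight and count
    totalWeight -= entry.getValueReference().getWeight();
    this.count = newCount;  // volatile write = publication point
    return true;
}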

`removeEntryFromChain()` --- Copy-on-Write Chain Rebuild

Rebuilds the hash bucket's singly-linked list excluding the target entry:


ReferenceEntry<K, V> removeEntryFromChain(ReferenceEntry<K, V> first,
                                           ReferenceEntry<K, V> entry) {
    ReferenceEntry<K, V> newFirst = entry.getNext();  // Everything after target

    // Copy nodes before target, linking to newFirst (reverses order)
    for (ReferenceEntry<K, V> e = first; e != entry; e = e.getNext()) {
        ReferenceEntry<K, V> next = copyEntry(e, newFirst);
        if (next != null) {
            newFirst = next;
        } else {
            // Weak/soft ref was GC'd --- skip and notify
            enqueueNotification(..., RemovalCause.COLLECTED);
            accessQueue.remove(e);
            writeQueue.remove(e);
        }
    }
    return newFirst;
}

Why copy instead of mutate? Lock-free readers may be traversing the old chain concurrently, and an entry's next pointer is never rewritten once published. Copying the prefix leaves the old chain valid for in-progress readers; the old nodes become garbage once no thread references them.

Visual example --- removing entry C:


Original bucket: table[i] → A → B → C(target) → D → E → null

Step 1: newFirst = C.getNext()                    → D → E
Step 2: copy A, A'.next = newFirst                → A' → D → E
Step 3: copy B, B'.next = newFirst                → B' → A' → D → E

Result: table[i] → B' → A' → D → E   (C removed, prefix reversed)
Old chain: A → B → C → D → E          (still valid for in-progress readers)
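The diagram can be reproduced with a small runnable sketch. `ChainDemo` and `Node` are hypothetical stand-ins for the bucket chain; the weak-reference handling of the real method is left out.

```java
final class ChainDemo {
    static final class Node {
        final String key;
        final Node next;   // immutable link: a published chain is never edited in place

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Rebuilds the chain without target: the suffix is shared, the prefix is copied (reversed).
    static Node removeFromChain(Node first, Node target) {
        Node newFirst = target.next;
        for (Node e = first; e != target; e = e.next) {
            newFirst = new Node(e.key, newFirst);
        }
        return newFirst;
    }

    static String render(Node first) {
        StringBuilder sb = new StringBuilder();
        for (Node e = first; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.key);
        }
        return sb.toString();
    }

    // Returns { new chain, old chain } after removing C from A -> B -> C -> D -> E.
    static String[] demo() {
        Node e = new Node("E", null);
        Node d = new Node("D", e);
        Node c = new Node("C", d);
        Node b = new Node("B", c);
        Node a = new Node("A", b);
        Node newFirst = removeFromChain(a, c);
        return new String[] { render(newFirst), render(a) };
    }
}
```

The suffix D → E is shared between both chains, so only the nodes in front of the removed entry are reallocated.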

17. JVM Reference Collection

Weak References --- Collected Every GC Cycle


Condition: Object has NO strong or soft references
Timing:    Next GC cycle that processes the object's generation (Minor or Full), regardless of memory pressure

Soft References --- Collected Before OOM


Condition: Object has only soft references
Timing:    When JVM is under memory pressure

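HotSpot heuristic: a softly reachable object becomes eligible for clearing when clock - lastAccessTime > freeHeapMB * SoftRefLRUPolicyMSPerMB, where freeHeapMB is the free heap in megabytes and SoftRefLRUPolicyMSPerMB defaults to 1000 (milliseconds of idle time tolerated per free MB).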
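To make the arithmetic concrete, here is a small sketch of the inequality. `SoftRefPolicyDemo` is a hypothetical helper, not a JVM API, and it ignores how HotSpot actually measures the clock and the free heap.

```java
final class SoftRefPolicyDemo {
    // True when the idle time exceeds the budget granted by the free heap.
    static boolean wouldClear(long idleMillis, long freeHeapMb, long msPerMb) {
        return idleMillis > freeHeapMb * msPerMb;
    }

    public static void main(String[] args) {
        long msPerMb = 1000;   // default SoftRefLRUPolicyMSPerMB
        long freeHeapMb = 200; // budget = 200 * 1000 ms = 200 s of idle time
        System.out.println(wouldClear(300_000, freeHeapMb, msPerMb)); // idle 5 min: over budget
        System.out.println(wouldClear(60_000, freeHeapMb, msPerMb));  // idle 1 min: within budget
    }
}
```

With 200 MB free, a soft reference untouched for five minutes exceeds the 200-second budget, while one read a minute ago does not. As free heap shrinks the budget shrinks with it, which is why softValues() entries disappear faster under memory pressure.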

Collection Behavior by GC Event

GC Event	Weak Refs	Soft Refs
Minor GC (Young Gen)	✅ Collected if unreachable	❌ Usually retained
Major/Full GC	✅ Collected if unreachable	⚠️ Collected if memory pressure
Near OOM	✅ Collected	✅ Collected (guaranteed before OOM)

How Guava Detects Collection

Full lifecycle:


1. Cache stores key as WeakReference registered with keyReferenceQueue
2. External code drops all strong references to the key
3. GC runs → collects key → enqueues WeakReference into keyReferenceQueue
4. Next cache operation triggers postReadCleanup() or postWriteCleanup()
5. cleanUp() → drainReferenceQueues() → polls keyReferenceQueue
6. Finds dead reference → casts to ReferenceEntry → removeEntry(COLLECTED)
7. Entry removed from hash table, accessQueue, writeQueue
8. RemovalListener notified with cause = COLLECTED

18. Thread Blocking --- `LockSupport.park()` vs `Object.wait()`

Comparison

Mechanism	Requires Lock?	Level
`Object.wait()`	✅ Must hold `synchronized` monitor	High-level
`Condition.await()`	✅ Must hold `ReentrantLock`	High-level
`LockSupport.park()`	❌ No lock needed	Low-level (OS primitive)

park/unpark is the blocking primitive underneath AQS (and therefore ReentrantLock, CountDownLatch, and Semaphore) as well as FutureTask.

The Permit Mechanism (No Lost-Wakeup)


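// This works even if unpark is called BEFORE park:
LockSupport.unpark(threadA);  // Any thread: grants threadA its permit (a single flag, not a counter)
// ... later, on threadA ...
LockSupport.park();           // Consumes permit, returns immediately

No lost-wakeup problem because the permit persists until consumed. Permits do not accumulate (repeated unpark calls still leave only one), and park() may also return spuriously, so callers re-check their condition in a loop.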
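The permit behavior is easy to observe on a single thread. In this sketch (`PermitDemo` is a hypothetical name) the current thread grants itself a permit and then parks; if permits were not remembered, the call would block forever.

```java
import java.util.concurrent.locks.LockSupport;

final class PermitDemo {
    // Returns true once park() has returned; would hang if the permit were lost.
    static boolean unparkThenPark() {
        LockSupport.unpark(Thread.currentThread()); // permit granted before anyone parks
        LockSupport.park();                          // permit available: returns without blocking
        return true;
    }

    public static void main(String[] args) {
        System.out.println(unparkThenPark());
    }
}
```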

`Object.wait()` Lost-Wakeup Problem


// WITHOUT synchronized --- BROKEN:
// Thread A:                    Thread B:
if (!condition) {              condition = true;
                               obj.notify();  // LOST! Thread A not in wait() yet
    obj.wait();                               // Thread A waits FOREVER
}

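synchronized makes check-then-wait atomic, preventing the race. Java enforces this: calling wait() or notify() without holding the object's monitor throws IllegalMonitorStateException, so the interleaving above is a thought experiment showing why the rule exists.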
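For contrast, the correct monitor-based pattern looks like the sketch below. `Gate` is a hypothetical class; the condition is checked in a loop while holding the lock, which also covers spurious wakeups.

```java
final class Gate {
    private final Object lock = new Object();
    private boolean isOpen;   // guarded by lock

    void await() throws InterruptedException {
        synchronized (lock) {
            // Check and wait happen under the same lock, so open() cannot slip in between.
            while (!isOpen) {
                lock.wait();   // releases the lock while waiting, reacquires before returning
            }
        }
    }

    void open() {
        synchronized (lock) {
            isOpen = true;
            lock.notifyAll();
        }
    }

    // Opens the gate from another thread and waits for it on the caller's thread.
    static boolean demo() {
        Gate gate = new Gate();
        new Thread(gate::open).start();
        try {
            gate.await();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Whichever thread runs first, `demo()` terminates: if `open()` wins, the waiter sees `isOpen` already true and never calls `wait()`.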

`AbstractFuture.get()` --- How Waiters Work


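// Simplified sketch: names such as state/COMPLETING are illustrative, and the real
// method also handles cancellation and unlinks the waiter node on interruption.
public V get() throws InterruptedException, ExecutionException {
    if (state > COMPLETING) return getValue();  // Fast path: already done

    // Slow path: create waiter, CAS into lock-free stack
    Waiter node = new Waiter(Thread.currentThread());
    for (;;) {
        Waiter oldHead = this.waiters;       // volatile read
        node.next = oldHead;
        if (WAITERS.compareAndSet(this, oldHead, node)) break;
    }

    // Park until the future completes. Checking state BEFORE each park closes the
    // window where set() finished between the fast path and the push, and the
    // loop absorbs spurious wakeups.
    while (this.state <= COMPLETING) {
        LockSupport.park(this);              // Thread sleeps (zero CPU)
        if (Thread.interrupted()) {
            node.thread = null;              // Mark this waiter as abandoned
            throw new InterruptedException();
        }
    }
    node.thread = null;
    return getValue();
}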

When set(value) is called:

Stores value, transitions state to COMPLETED
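Atomically grabs the entire waiter stack with a single get-and-set on waiters (Guava swaps in a TOMBSTONE sentinel rather than null, so threads that arrive later can tell the future is already complete)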
Walks the stack calling LockSupport.unpark(waiter.thread) for each
All parked threads wake up and return the value
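The waiter stack and the wake-up walk can be put together in a compact runnable sketch. `MiniFuture` is a hypothetical class written for illustration, not Guava's implementation: it treats a null value as "not done", ignores cancellation and failure, and waits uninterruptibly to keep the code short.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

final class MiniFuture<V> {
    private static final class Waiter {
        volatile Thread thread;
        Waiter next;

        Waiter(Thread thread) {
            this.thread = thread;
        }
    }

    // Sentinel installed once the future completes, so late arrivals never park.
    private static final Waiter TOMBSTONE = new Waiter(null);

    private final AtomicReference<V> value = new AtomicReference<>();
    private final AtomicReference<Waiter> waiters = new AtomicReference<>();

    boolean set(V newValue) {
        if (newValue == null) throw new NullPointerException("null means 'not done' in this sketch");
        if (!value.compareAndSet(null, newValue)) return false;   // already completed
        // Grab the whole stack in one atomic step, then wake every waiter.
        for (Waiter w = waiters.getAndSet(TOMBSTONE); w != null; w = w.next) {
            Thread t = w.thread;
            if (t != null) LockSupport.unpark(t);
        }
        return true;
    }

    V getUninterruptibly() {
        V v = value.get();
        if (v != null) return v;                      // fast path
        Waiter node = new Waiter(Thread.currentThread());
        for (;;) {                                    // push onto the lock-free stack
            Waiter head = waiters.get();
            if (head == TOMBSTONE) return value.get(); // completed while we were enqueuing
            node.next = head;
            if (waiters.compareAndSet(head, node)) break;
        }
        boolean interrupted = false;
        while ((v = value.get()) == null) {           // re-check before every park
            LockSupport.park(this);
            if (Thread.interrupted()) interrupted = true;
        }
        node.thread = null;
        if (interrupted) Thread.currentThread().interrupt();  // restore the flag
        return v;
    }
}
```

A typical use is one thread calling `set("done")` while others block in `getUninterruptibly()`; a second `set` call returns false because the value is already published.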

Waiter Storage Comparison

	`Object.wait()`	`AbstractFuture.get()`
Storage	JVM native `ObjectMonitor._WaitSet` (C++)	Java field `AbstractFuture.waiters` (heap)
Structure	Doubly-linked list (JVM-managed)	Singly-linked stack (CAS-managed)
Visible to Java?	❌	✅
Requires lock?	✅ `synchronized`	❌ Lock-free
Wake mechanism	`notify()` → `pthread_cond_signal`	`unpark()` → OS wake

Why Guava Uses `park/unpark`

No lock required --- reduces contention
Targeted wakeup --- unpark(specificThread) vs notifyAll() thundering herd
No lost-wakeup risk --- permit mechanism
No monitor inflation overhead
Composable --- foundation for building complex synchronizers

19. Weak Key Lookup and Identity Semantics

Weak Key Lookup Behavior

When weakKeys() is enabled and get(key) is called:


for (ReferenceEntry<K, V> e = first; e != null; e = e.getNext()) {
    K entryKey = e.getKey();  // WeakReference.get() --- null if GC'd

    if (entryKey == null) {
        continue;  // Dead entry, skip (will be cleaned by drainReferenceQueues)
    }
    if (e.getHash() == hash && entryKey == key) {  // Identity check (==)
        // Cache hit
    }
}

Two miss scenarios:

e.getKey() returns null → key was GC'd → entry is dead → skip
e.getKey() returns a live object but entryKey == key is false → different object instance → no match

Ensuring Cache Hits with `weakKeys()`

Pattern	Works?	Why
String literals (`"config"`)	✅	JVM interns literals --- same object
Enum values (`MyEnum.FOO`)	✅	Singletons --- same object
Domain object held by caller	✅	Same reference reused
`new String("key")`	❌	Different object each time
Canonical map (interning)	✅	Returns same instance for same logical key

The intended use case: weakKeys() means "cache this data only as long as the key is alive elsewhere in my application." When external code drops all strong references to the key, the cache entry auto-evicts.
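The "canonical map" row in the table above can be sketched in a few lines. `KeyCanonicalizer` is a hypothetical helper. Note the assumption baked into it: this map holds keys strongly, so canonical keys never become collectable; Guava's `Interners.newWeakInterner()` is the usual choice when the keys themselves should remain eligible for GC.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class KeyCanonicalizer {
    private final ConcurrentMap<String, String> canonical = new ConcurrentHashMap<>();

    // Returns one shared instance per logical key, so == comparisons succeed.
    String canonicalize(String key) {
        String existing = canonical.putIfAbsent(key, key);
        return existing != null ? existing : key;
    }

    public static void main(String[] args) {
        KeyCanonicalizer keys = new KeyCanonicalizer();
        String a = new String("key");
        String b = new String("key");
        System.out.println(a == b);                                       // distinct instances
        System.out.println(keys.canonicalize(a) == keys.canonicalize(b)); // same canonical instance
    }
}
```

Passing every lookup key through the canonicalizer before calling the cache turns logically equal keys into the same instance, which is what an identity-based lookup requires.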

Part IV: Reference

20. Common Pitfalls

refreshAfterWrite alone → stale values persist forever if loader fails. Always pair with expireAfterWrite.
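Synchronous RemovalListener → runs on the calling thread during cache maintenance (notifications are queued under the Segment lock and delivered after it is released), so a slow listener stalls whichever cache operation triggered the cleanup. Use RemovalListeners.asynchronous().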
weakKeys() → changes key comparison from equals() to ==. new String("key") won't match a literal.
Loader returning null → throws InvalidCacheLoadException. Use Optional or getIfPresent() for nullable lookups.
Forgot recordStats() → stats() returns all zeros (looks like 0% hit rate).
maximumSize + maximumWeight → mutually exclusive, throws IllegalStateException.

Expiration precision in tests --- use a custom Ticker instead of waiting:


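// FakeTicker ships in guava-testlib (com.google.common.testing.FakeTicker)
FakeTicker ticker = new FakeTicker();
Cache<String, String> cache = CacheBuilder.newBuilder()
    .expireAfterWrite(1, TimeUnit.MINUTES)
    .ticker(ticker)  // Ticker is an abstract class, so pass the instance rather than a method reference
    .build();
cache.put("k", "v");
ticker.advance(2, TimeUnit.MINUTES);
assertNull(cache.getIfPresent("k"));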

21. Limitations & Why Caffeine Replaced It

Guava	Caffeine
Segment locks	ConcurrentHashMap plus lock-free read/write buffers (MPSC + CAS)
LRU only	W-TinyLFU (better hit rate)
No async API	`AsyncLoadingCache`
Lazy cleanup only	`Scheduler` for proactive cleanup
Fixed TTL per cache	`Expiry` for per-entry TTL
No longer evolving	Actively maintained (same author)

22. Complete Usage Example


LoadingCache<String, Widget> cache = CacheBuilder.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .expireAfterAccess(5, TimeUnit.MINUTES)
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    .concurrencyLevel(16)
    .recordStats()
    .removalListener(RemovalListeners.asynchronous(
        notification -> log.info("Evicted: {} cause={}",
            notification.getKey(), notification.getCause()),
        executor))
    .build(new CacheLoader<String, Widget>() {
        @Override
        public Widget load(String key) throws Exception {
            return loadFromDatabase(key);
        }

        @Override
        public ListenableFuture<Widget> reload(String key, Widget oldValue) {
            // Assumes executor is a ListeningExecutorService, whose submit() returns a ListenableFuture
            return executor.submit(() -> loadFromDatabase(key));
        }

        @Override
        public Map<String, Widget> loadAll(Iterable<? extends String> keys) {
            return database.findAllByIds(keys);
        }
    });

// Various usage patterns (get, get-with-loader, and getAll throw ExecutionException)
Widget loaded = cache.get("key");                         // Auto-load
Widget cached = cache.getIfPresent("key");                // No loading triggered; null if absent
Widget computed = cache.get("key", () -> compute());      // Similar to computeIfAbsent
ImmutableMap<String, Widget> batch = cache.getAll(keys);  // Batch
cache.put("key", widget);
cache.invalidate("key");
cache.invalidateAll();
long size = cache.size();
cache.cleanUp();  // Manually trigger cleanup

23. Interview Key Points

Segment-lock design: Similar to JDK7 CHM, segment count controlled by concurrencyLevel (rounded up to power of 2).
Lazy expiration: No background thread; cleanup triggered on read/write; reads trigger cleanUp every 64 operations.
Cache stampede prevention: LoadingValueReference + waitForValue() ensures same key is loaded only once; loading executes outside the lock.
refresh vs expire: refresh returns old value without blocking (requires overriding reload() for true async); expire blocks waiting for new value.
Approximate LRU: recencyQueue batch-drains to accessQueue; not strictly LRU under high concurrency but error is bounded to one drain batch.
RemovalListener pitfall: Runs synchronously on the thread performing cache maintenance, so a slow listener delays that caller's cache operation; time-consuming operations must use RemovalListeners.asynchronous().
Weak reference pitfall: weakKeys() changes key comparison from equals() to ==; weakValues() and softValues() do the same for value comparison.
Why replaced by Caffeine: LRU < W-TinyLFU hit rate, segment-lock < lock-free performance (MPSC + CAS), lacks async API, Scheduler, per-entry TTL.
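The stampede-prevention point is often easier to explain with code. The sketch below shows the general single-flight idea using JDK classes only; it is not how Guava implements it (Guava uses LoadingValueReference inside the Segment), and `SingleFlightCache` with its `loads` counter is a hypothetical class.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SingleFlightCache<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    final AtomicInteger loads = new AtomicInteger();   // counts real loader invocations

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> existing = entries.get(key);
        if (existing == null) {
            CompletableFuture<V> mine = new CompletableFuture<>();
            existing = entries.putIfAbsent(key, mine);   // publish a placeholder
            if (existing == null) {
                try {
                    loads.incrementAndGet();
                    V loaded = loader.apply(key);        // runs outside any map lock
                    mine.complete(loaded);               // wakes all waiting callers
                    return loaded;
                } catch (RuntimeException e) {
                    entries.remove(key, mine);           // let a later call retry
                    mine.completeExceptionally(e);
                    throw e;
                }
            }
        }
        return existing.join();                          // wait for the winning loader
    }

    // Starts several threads that request the same key and reports how many loads ran.
    static int loadsUnderContention(int threadCount) {
        SingleFlightCache<String, Integer> cache = new SingleFlightCache<>(key -> {
            try {
                Thread.sleep(50);                        // simulate a slow backend
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return key.length();
        });
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> cache.get("widget"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cache.loads.get();
    }
}
```

Only the thread whose `putIfAbsent` succeeds runs the loader; every other caller blocks on the shared future, which is the same division of labor as the loading thread versus `waitForValue()` callers.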

See also: guava-cache-owms-patterns.md for OWMS service-specific cache patterns and Guava library reference.
