Redis 7.x Source Code Analysis: (9) Memory Eviction

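Background

When the memory available to Redis runs low, some mechanism has to free memory so the service keeps running. Redis already removes expired keys, but what happens when every expired key has been cleaned up and memory is still insufficient?

This is where Redis's memory eviction mechanism comes in: when memory runs short, it automatically removes some data to free space so the service can continue running. This article looks at memory eviction in Redis 7.x, covering its policies, trigger conditions, and the relevant source code.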

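I. Memory Eviction Policy Configuration

Redis provides several memory eviction policies, set through the maxmemory-policy parameter in the configuration file. The main policies Redis supports are: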

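Policy	Description	Typical use case
`noeviction`	When memory reaches the limit, new writes fail with an error and no data is evicted.	Cases where data loss is unacceptable, at the cost of failed writes.
`allkeys-lru`	Evicts the least recently used keys among all keys.	Typical caches where some keys are accessed much more often than others; rarely used keys are evicted to free memory.
`volatile-lru`	Evicts the least recently used keys among keys that have an expiration set.	Only some keys have an expiration set; rarely used keys with a TTL are evicted first.
`allkeys-random`	Evicts random keys among all keys.	Access patterns with no clear regularity, where no key deserves to be kept over another.
`volatile-random`	Evicts random keys among keys that have an expiration set.	Only some keys have an expiration set; keys with a TTL are evicted at random to free memory.
`volatile-ttl`	Evicts keys that have an expiration set, those closest to expiring first.	Only some keys have an expiration set; keys with the shortest remaining TTL are evicted first.
`allkeys-lfu`	Evicts the least frequently used keys among all keys.	Workloads where access frequency varies widely; rarely accessed keys are evicted to free memory.
`volatile-lfu`	Evicts the least frequently used keys among keys that have an expiration set.	Only some keys have an expiration set; rarely accessed keys with a TTL are evicted first.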

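Which policy to choose depends on the application and its data access pattern.

Eviction is triggered by memory usage: once Redis's memory usage exceeds the configured maxmemory limit, the eviction mechanism kicks in.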
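
As a rough sketch of that trigger check, the function below computes how many bytes would have to be freed. The names bytes_to_free, used, not_counted, and limit are hypothetical. The not_counted term models the fact that Redis leaves some buffers (for example replica output buffers and the AOF buffer) out of the comparison; this is a simplification of what getMaxmemoryState does, not a copy of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes that must be freed, or 0 when usage is
 * within the limit or no limit is configured. */
size_t bytes_to_free(size_t used, size_t not_counted, size_t limit) {
    if (limit == 0) return 0;        /* maxmemory 0: no limit */
    if (used <= limit) return 0;     /* clearly within the limit */
    /* Exclude memory that should not count against the limit. */
    size_t counted = (used > not_counted) ? used - not_counted : 0;
    if (counted <= limit) return 0;
    return counted - limit;
}

/* Usage example with made-up numbers. */
void bytes_to_free_example(void) {
    assert(bytes_to_free(5000, 500, 4096) == 404);  /* 4500 counted, 404 over */
    assert(bytes_to_free(4200, 500, 4096) == 0);    /* 3700 counted, within limit */
}
```

A non-zero result corresponds to the case where the eviction loop has work to do.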

Configuration example:

# 4 GB limit
maxmemory 4gb
# General-purpose cache eviction
maxmemory-policy allkeys-lru
# Number of keys sampled per eviction round (default 5)
maxmemory-samples 5

II. Source Code Analysis of Memory Eviction

How the policies map to constants in the source:

Config value	Source constant
noeviction	MAXMEMORY_NO_EVICTION
allkeys-lru	MAXMEMORY_ALLKEYS_LRU
volatile-lru	MAXMEMORY_VOLATILE_LRU
allkeys-random	MAXMEMORY_ALLKEYS_RANDOM
volatile-random	MAXMEMORY_VOLATILE_RANDOM
volatile-ttl	MAXMEMORY_VOLATILE_TTL
allkeys-lfu	MAXMEMORY_ALLKEYS_LFU
volatile-lfu	MAXMEMORY_VOLATILE_LFU
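
The eviction code tests the policy with bit masks such as MAXMEMORY_FLAG_LRU, which works because each policy constant carries flag bits in its low byte. The sketch below mirrors that encoding as we understand it from server.h. The SK_-prefixed names and the two helper functions are hypothetical stand-ins, and the exact values are an assumption that should be checked against your Redis version.

```c
#include <assert.h>

/* Flag bits stored in the low byte of a policy value (assumed layout). */
#define SK_FLAG_LRU     (1<<0)
#define SK_FLAG_LFU     (1<<1)
#define SK_FLAG_ALLKEYS (1<<2)

/* Policy id in the upper bits, flags OR-ed into the low bits. */
#define SK_VOLATILE_LRU    ((0<<8)|SK_FLAG_LRU)
#define SK_VOLATILE_LFU    ((1<<8)|SK_FLAG_LFU)
#define SK_VOLATILE_TTL    (2<<8)
#define SK_VOLATILE_RANDOM (3<<8)
#define SK_ALLKEYS_LRU     ((4<<8)|SK_FLAG_LRU|SK_FLAG_ALLKEYS)
#define SK_ALLKEYS_LFU     ((5<<8)|SK_FLAG_LFU|SK_FLAG_ALLKEYS)
#define SK_ALLKEYS_RANDOM  ((6<<8)|SK_FLAG_ALLKEYS)
#define SK_NO_EVICTION     (7<<8)

/* 1 when the policy samples from all keys, 0 when it only considers
 * keys that have an expiration set. */
int policy_uses_all_keys(int policy) {
    return (policy & SK_FLAG_ALLKEYS) != 0;
}

/* 1 when the policy ranks candidates (LRU, LFU, or TTL based),
 * 0 for the random policies and noeviction. */
int policy_uses_pool(int policy) {
    return (policy & (SK_FLAG_LRU|SK_FLAG_LFU)) != 0 ||
           policy == SK_VOLATILE_TTL;
}

/* Usage example. */
void policy_flags_example(void) {
    assert(policy_uses_pool(SK_ALLKEYS_LRU) && policy_uses_all_keys(SK_ALLKEYS_LRU));
    assert(policy_uses_pool(SK_VOLATILE_TTL) && !policy_uses_all_keys(SK_VOLATILE_TTL));
    assert(!policy_uses_pool(SK_ALLKEYS_RANDOM));
}
```

Encoding the flags this way lets one mask test cover both the allkeys and volatile variants of a policy family.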

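The core logic of memory eviction lives in evict.c. Since the other modes are fairly simple, we focus on the LRU and LFU implementations. The key parts of the source are analyzed below:

1. LRU Eviction Mode

First, the definition of redisObject, the structure that stores values in the database: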

typedef struct redisObject {
    // Object type
    unsigned type:4;
    // Encoding (internal data structure, see the OBJ_ENCODING definitions)
    unsigned encoding:4;
    // Shared by LRU and LFU, 24 bits wide. In LRU mode it holds the access time; in LFU mode the high 16 bits hold the last access time and the low 8 bits hold the access frequency
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    // Reference count
    int refcount;
    // Pointer to the actual data
    void *ptr;
} robj;

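The lru member is shared by LRU and LFU. In LRU mode it records the access time: the 24 bits store the timestamp returned by LRU_CLOCK(), in seconds, so the value wraps around roughly every 194 days (2^24 seconds).

lookupKey updates it whenever a key is read or written, and it is also set when a new key is created. Note that the assignment depends on whether the current mode is LRU or LFU.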
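
To make the wraparound concrete, here is a minimal sketch of how an idle time can be derived from a 24-bit clock. The names SK_LRU_BITS, SK_LRU_CLOCK_MAX, lru_idle_seconds, and lru_wrap_days are hypothetical, and the clock is passed in as a parameter instead of being read from the server. The real logic lives in estimateObjectIdleTime, which works in milliseconds; this sketch stays in seconds.

```c
#include <assert.h>

#define SK_LRU_BITS 24
#define SK_LRU_CLOCK_MAX ((1UL << SK_LRU_BITS) - 1)  /* largest 24-bit value */

/* Idle time in seconds between the stored access time `obj_lru` and the
 * current clock `now`, both 24-bit values with one-second resolution.
 * If `now` is smaller than `obj_lru`, the clock is assumed to have
 * wrapped exactly once since the access. */
unsigned long lru_idle_seconds(unsigned long now, unsigned long obj_lru) {
    assert(now <= SK_LRU_CLOCK_MAX && obj_lru <= SK_LRU_CLOCK_MAX);
    if (now >= obj_lru) return now - obj_lru;
    return now + (SK_LRU_CLOCK_MAX - obj_lru);
}

/* Number of whole days before the 24-bit second counter wraps. */
unsigned long lru_wrap_days(void) {
    return (SK_LRU_CLOCK_MAX + 1) / 86400UL;
}
```

A key that has been idle for longer than one full wrap period cannot be told apart from a more recently used one, which is the price of storing the clock in 24 bits.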

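2. LFU Eviction Mode

As the redisObject definition above shows, in LFU mode lru is split into a 16-bit access time (in minutes) and an 8-bit access frequency.

The access time is the low 16 bits of a minute-resolution timestamp. Its purpose is to let the access frequency decay as time passes. If the frequency only ever grew, keys that were hot long ago would keep a high count indefinitely and newly inserted data would always be evicted first. That is clearly not the desired behavior, because newer data is often more important.

The access frequency is a probabilistic estimate of the access count, not an exact count. With only 8 bits the maximum value is 255, which is far too small for a plain hit counter, so Redis uses a logarithmic counter that is incremented with a probability that shrinks as the counter grows.

(1) Incrementing the access frequency counter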

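// Probabilistic logarithmic LFU counter increment; counter ranges over 0-255
uint8_t LFULogIncr(uint8_t counter) {
    if (counter == 255) return 255;
    // rand() returns a pseudo-random integer in [0, RAND_MAX], so r is a floating-point value in [0.0, 1.0]
    double r = (double)rand()/RAND_MAX;
    // LFU_INIT_VAL is 5, so while counter <= 5 baseval is 0 and p is 1: the counter is incremented almost always (only r == 1.0 fails)
    double baseval = counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    // The larger the counter, the smaller the chance of an increment. server.lfu_log_factor defaults to 10; reaching 255 takes hundreds of thousands of accesses (the table in redis.conf shows 255 reached by 1M hits)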
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    if (r < p) counter++;
    return counter;
}
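
To get a feel for how slowly the counter grows, the expected number of accesses can be computed directly: a step that succeeds with probability p takes 1/p tries on average, and here 1/p is baseval*factor+1. The sketch below sums that over the steps. The function name lfu_expected_hits and its parameters are hypothetical, and decay is ignored.

```c
#include <assert.h>

/* Expected number of accesses needed to move the counter from `from` to
 * `to` when each access increments it with probability
 * p = 1 / (baseval * factor + 1), baseval = max(counter - init_val, 0).
 * The result is the sum of (baseval * factor + 1) over the steps. */
unsigned long long lfu_expected_hits(unsigned from, unsigned to,
                                     unsigned factor, unsigned init_val) {
    assert(from <= to && to <= 255);
    unsigned long long total = 0;
    for (unsigned c = from; c < to; c++) {
        unsigned long long baseval = (c > init_val) ? c - init_val : 0;
        total += baseval * factor + 1;
    }
    return total;
}
```

With a factor of 10 and an initial value of 5, lfu_expected_hits(5, 255, 10, 5) evaluates to 311500, so saturating the counter takes a few hundred thousand accesses on average, while lfu_expected_hits(5, 18, 10, 5) is only 793.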

(2) Decaying the access frequency counter

// Convert the current time to minutes and keep the low 16 bits
unsigned long LFUGetTimeInMinutes(void) {
    return (server.unixtime/60) & 65535;
}

// Minutes elapsed since the last access, allowing for one wraparound of the 16-bit value
unsigned long LFUTimeElapsed(unsigned long ldt) {
    unsigned long now = LFUGetTimeInMinutes();
    if (now >= ldt) return now-ldt;
    return 65535-ldt+now;
}

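// Return the decayed counter (ldt itself is refreshed later, when updateLFU stores the new value)
unsigned long LFUDecrAndReturn(robj *o) {
    // Last access time (high 16 of the 24 bits)
    unsigned long ldt = o->lru >> 8;
    // Access counter (low 8 of the 24 bits)
    unsigned long counter = o->lru & 255;
    // server.lfu_decay_time defaults to 1 minute (the counter drops by 1 per elapsed minute); 0 disables decay
    unsigned long num_periods = server.lfu_decay_time ? LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
    // Decay the counter by the number of elapsed decay periods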
    if (num_periods)
        counter = (num_periods > counter) ? 0 : counter - num_periods;
    return counter;
}
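
The decay arithmetic is easier to check in isolation. The sketch below restates it with the current time passed in as a parameter; lfu_elapsed_minutes and lfu_decay are hypothetical names, not functions from the Redis source.

```c
#include <assert.h>

/* Minutes elapsed between `ldt` and `now`, both 16-bit minute stamps.
 * Uses the same wraparound handling as LFUTimeElapsed, with `now`
 * passed in explicitly. */
unsigned long lfu_elapsed_minutes(unsigned long now, unsigned long ldt) {
    assert(now <= 65535 && ldt <= 65535);
    if (now >= ldt) return now - ldt;
    return 65535 - ldt + now;
}

/* Counter after decay: one point is removed per elapsed decay period,
 * never going below 0. A decay_time of 0 disables decay. */
unsigned long lfu_decay(unsigned long counter, unsigned long elapsed,
                        unsigned long decay_time) {
    unsigned long periods = decay_time ? elapsed / decay_time : 0;
    return (periods > counter) ? 0 : counter - periods;
}
```

For example, a counter of 10 that has not been touched for 3 minutes decays to 7 with lfu-decay-time 1, and stays at 10 with lfu-decay-time 0.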

(3) Updating the access frequency counter

As mentioned above, every read or write of a key goes through lookupKey; in LFU mode this ends up calling updateLFU:

void updateLFU(robj *val) {
    // Decay the counter first
    unsigned long counter = LFUDecrAndReturn(val);
    // Then apply the probabilistic logarithmic increment
    counter = LFULogIncr(counter);
    // Store the new last access time together with the counter
    val->lru = (LFUGetTimeInMinutes()<<8) | counter;
}
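
The packing used in the last line of updateLFU can be sketched as three small helpers. The names lfu_pack, lfu_minutes, and lfu_counter are hypothetical; they only illustrate the 16-bit/8-bit split of the 24-bit field.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 16-bit minute stamp and an 8-bit counter into a 24-bit value:
 * minutes in the high 16 bits, counter in the low 8 bits. */
uint32_t lfu_pack(uint32_t minutes, uint8_t counter) {
    return ((minutes & 0xFFFF) << 8) | counter;
}

/* Extract the minute stamp (high 16 bits of the 24-bit field). */
uint32_t lfu_minutes(uint32_t field) {
    assert(field <= 0xFFFFFF);
    return field >> 8;
}

/* Extract the counter (low 8 bits). */
uint8_t lfu_counter(uint32_t field) {
    return (uint8_t)(field & 0xFF);
}
```

Packing 0x1234 minutes with a counter of 0x56 gives 0x123456, and the two accessors recover the original parts.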

The log factor and decay time used in the code can be changed through configuration:

lfu-log-factor 10
lfu-decay-time 1

3. The Eviction Process

Eviction currently runs in two situations: in processCommand before a command is executed, and when CONFIG SET maxmemory <bytes> is handled.

processCommand calls the eviction function performEvictions directly:

int processCommand(client *c) {
    ...
    if (server.maxmemory && !isInsideYieldingLongCommand()) {
        int out_of_memory = (performEvictions() == EVICT_FAIL);
        ...
    }
    ...
}

CONFIG SET maxmemory calls startEvictionTimeProc, which registers a timer event so that performEvictions runs on the next event loop iteration:

static int evictionTimeProc(
        struct aeEventLoop *eventLoop, long long id, void *clientData) {
    UNUSED(eventLoop);
    UNUSED(id);
    UNUSED(clientData);

    // Keep evicting: returning 0 reschedules the timer to fire again right away
    if (performEvictions() == EVICT_RUNNING) return 0;  /* keep evicting */

    /* For EVICT_OK - things are good, no need to keep evicting.
     * For EVICT_FAIL - there is nothing left to evict.  */
    isEvictionProcRunning = 0;
    return AE_NOMORE;
}

void startEvictionTimeProc(void) {
    if (!isEvictionProcRunning) {
        isEvictionProcRunning = 1;
        // Register a time event that fires after 0 ms
        aeCreateTimeEvent(server.el, 0,
                evictionTimeProc, NULL, NULL);
    }
}

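The functions that actually perform eviction are shown next.

performEvictions calls evictionPoolPopulate, which samples keys at random and inserts them into the pool sorted by ascending idle score, as defined by the eviction policy.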

// Samples eviction candidates and inserts them into the pool
void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    // Randomly sample server.maxmemory_samples (default 5) keys from the database
    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    for (j = 0; j < count; j++) {
        unsigned long long idle;
        sds key;
        robj *o;
        dictEntry *de;

        de = samples[j];
        key = dictGetKey(de);

        if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
            // If sampledict is the expires dict, look the key up in the main dict to get the value object
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
        }

        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
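            // Idle time since the last access
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            // Invert the decayed frequency so that rarely used keys get a higher idle score
            idle = 255-LFUDecrAndReturn(o);
        } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
            // The sooner the key expires, the higher the idle score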
            idle = ULLONG_MAX - (long)dictGetVal(de);
        } else {
            serverPanic("Unknown eviction policy in evictionPoolPopulate()");
        }

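        /*
         * Find the insertion position k: scan from the left and stop at the first empty slot
         * or the first entry whose idle value is not smaller than the current one.
         * pool[] is sorted by ascending idle (small on the left, large on the right), so k is where the new entry goes.
         */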
        k = 0;
        while (k < EVPOOL_SIZE &&
               pool[k].key &&
               pool[k].idle < idle) k++;

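        /*
         * cached is the SDS buffer that evictionPoolAlloc preallocates for each of the 16 pool entries; it holds the key string and is reused to avoid frequent allocation and freeing.
         * Three main cases:
         * 1) k == 0 and the rightmost slot is occupied: the new element is no larger than anything in the pool and there is no empty slot, so it cannot be inserted -> skip.
         * 2) k points at an empty slot (k < EVPOOL_SIZE and pool[k].key == NULL): insert directly, nothing to move.
         * 3) Otherwise entries must be shifted to make room:
         *    - If there is free space on the right (pool[EVPOOL_SIZE-1].key == NULL), shift [k .. end-1] right by one,
         *      then restore the saved SDS pointer into the cached field at the insertion position so it is not lost.
         *    - If there is no free space on the right (the pool is full), drop the smallest element on the left (pool[0]),
         *      shift [1 .. k] left by one, then store the saved cached pointer in pool[k] for reuse.
         */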
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            continue;
        } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            /* Inserting into empty position. No setup needed before insert. */
        } else {
            if (pool[EVPOOL_SIZE-1].key == NULL) {
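                /*
                 * Free space on the right: shift [k .. EVPOOL_SIZE-2] right by one to open up pool[k].
                 * The cached SDS pointer of the rightmost entry must be saved before the move,
                 * otherwise memmove overwrites it and the buffer leaks.
                 */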

                /* Save SDS before overwriting. */
                sds cached = pool[EVPOOL_SIZE-1].cached;
                memmove(pool+k+1,pool+k,
                    sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                /* Restore the cached pointer at the insertion position; it is reused below to store the key string */
                pool[k].cached = cached;
            } else {
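                /*
                 * The pool is full and the new entry is not the smallest: drop the leftmost (smallest) element.
                 * Concretely: k-- (move the insertion point one slot left), then shift [1 .. k] left by one.
                 * This overwrites pool[0], so its key is freed first (if it was allocated separately)
                 * and pool[0].cached is saved for reuse.
                 */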
                k--;
                sds cached = pool[0].cached; /* Save SDS before overwriting. */
                if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                memmove(pool,pool+1,sizeof(pool[0])*k);
                pool[k].cached = cached;
            }
        }

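        /*
         * Reuse the preallocated cached SDS buffer in pool[k] to store the key and avoid frequent allocation and freeing.
         * - If the key is longer than the cache size, duplicate it into a separate SDS assigned to pool[k].key (then key != cached and it must be released with sdsfree).
         * - Otherwise copy the key into pool[k].cached and point pool[k].key at that buffer.
         */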
        int klen = sdslen(key);
        if (klen > EVPOOL_CACHED_SDS_SIZE) {
            pool[k].key = sdsdup(key);
        } else {
            memcpy(pool[k].cached,key,klen+1);
            sdssetlen(pool[k].cached,klen);
            pool[k].key = pool[k].cached;
        }
        /* Set the score and the owning DB id */
        pool[k].idle = idle;
        pool[k].dbid = dbid;
    }
}
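
Stripped of the SDS caching, the insertion logic above is a bounded sorted insert. The following sketch shows just that part under simplifying assumptions: keys are replaced by an integer id, a used flag stands in for key != NULL, and the pool has 4 slots where Redis uses EVPOOL_SIZE (16). The names pool_entry, POOL_SIZE, and pool_insert are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define POOL_SIZE 4  /* illustrative; Redis uses EVPOOL_SIZE (16) */

struct pool_entry {
    int used;                 /* 0 = empty slot */
    unsigned long long idle;  /* eviction score, higher = better candidate */
    int id;                   /* stand-in for the key */
};

/* Insert (idle, id) keeping pool[] sorted by ascending idle.
 * Returns the slot index used, or -1 if the candidate was rejected. */
int pool_insert(struct pool_entry *pool, unsigned long long idle, int id) {
    assert(pool != NULL);
    int k = 0;
    while (k < POOL_SIZE && pool[k].used && pool[k].idle < idle) k++;

    if (k == 0 && pool[POOL_SIZE-1].used) {
        return -1;  /* pool full and candidate no better than the worst entry */
    } else if (k < POOL_SIZE && !pool[k].used) {
        /* empty slot: nothing to move */
    } else if (!pool[POOL_SIZE-1].used) {
        /* free space on the right: shift [k .. POOL_SIZE-2] right by one */
        memmove(pool+k+1, pool+k, sizeof(pool[0])*(POOL_SIZE-k-1));
    } else {
        /* pool full: drop pool[0] and shift the entries before k left */
        k--;
        memmove(pool, pool+1, sizeof(pool[0])*k);
    }
    pool[k].used = 1;
    pool[k].idle = idle;
    pool[k].id = id;
    return k;
}
```

Starting from an empty pool, inserting idle values 5, 1, 9, 7 leaves the pool as [1, 5, 7, 9]. A further insert of 0 is rejected, and inserting 6 drops the 1 and yields [5, 6, 7, 9], so the pool always keeps the best candidates seen so far with the best one on the right.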

performEvictions implements the complete eviction logic.

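// Whether it is safe to perform eviction right now
static int isSafeToPerformEvictions(void) {
    // No long-running command (such as a timed-out Lua script) is yielding, and no data is being loaded
    if (isInsideYieldingLongCommand() || server.loading) return 0;

    // A replica configured to ignore maxmemory (replica-ignore-maxmemory) does not evict and leaves eviction to its master
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return 0;

    // While clients are paused the dataset must stay static, so no eviction is performed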
    if (checkClientPauseTimeoutAndReturnIfPaused()) return 0;

    return 1;
}

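/*
 * Perform memory eviction.
 * Return values:
 *   EVICT_OK       - memory usage is within the limit, or eviction is not appropriate right now
 *   EVICT_RUNNING  - memory usage is over the limit, but eviction is still in progress
 *   EVICT_FAIL     - memory usage is over the limit, but there is nothing to evict
 */
int performEvictions(void) {
    // Check whether eviction is safe right now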
    if (!isSafeToPerformEvictions()) return EVICT_OK;

    int keys_freed = 0;
    size_t mem_reported, mem_tofree;
    long long mem_freed; /* May be negative */
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);
    int result = EVICT_FAIL;

    // Get the memory state; mem_tofree receives the number of bytes that need to be freed
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK) {
        result = EVICT_OK;
        goto update_metrics;
    }

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION) {
        result = EVICT_FAIL;  /* We need to free memory, but policy forbids. */
        goto update_metrics;
    }

    // Compute the time limit for this eviction cycle
    unsigned long eviction_time_limit_us = evictionTimeLimitUs();

    mem_freed = 0;

    latencyStartMonitor(latency);

    monotime evictionTimer;
    elapsedStart(&evictionTimer);

    int prev_core_propagates = server.core_propagates;
    serverAssert(server.also_propagate.numops == 0);
    server.core_propagates = 1;
    server.propagate_no_multi = 1;

    // Keep freeing until the target is met
    while (mem_freed < (long long)mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while (bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        // Sample some keys at random and insert them into the eviction pool in ascending idle order
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                // Walk the pool from the right to find the entry with the highest idle score whose key still exists
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[bestdbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[bestdbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    // Clear pool[k], freeing its key string if it was allocated separately
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            // Walk the databases and pick a random key to evict; next_db remembers where the next scan starts
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        // Delete the selected key from the database
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));

            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
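            // With lazyfree-lazy-eviction enabled the key is unlinked here and its value may be freed asynchronously by a bio background thread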
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            signalModifiedKey(NULL,db,keyobj);
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            propagateDeletion(db,keyobj,server.lazyfree_lazy_eviction);
            decrRefCount(keyobj);
            keys_freed++;

            if (keys_freed % 16 == 0) {
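                /* When the amount of memory to free is large, we may spend so much time here
                 * that data cannot be delivered to the replicas quickly enough, so we force a flush inside the loop. */
                if (slaves) flushSlavesOutputBuffers();

                /* With lazy eviction in progress, check from time to time whether the memory target has already been reached,
                 * because memory is being released in the bio thread.
                 */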
                if (server.lazyfree_lazy_eviction) {
                    if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                        break;
                    }
                }

                // Stop once the time limit is exceeded so that not too much time is spent here; a timer then continues the eviction
                if (elapsedUs(evictionTimer) > eviction_time_limit_us) {
                    // We still need to free memory - start eviction timer proc
                    startEvictionTimeProc();
                    break;
                }
            }
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    /* at this point, the memory is OK, or we have reached the time limit */
    result = (isEvictionProcRunning) ? EVICT_RUNNING : EVICT_OK;

cant_free:
    if (result == EVICT_FAIL) {
        mstime_t lazyfree_latency;
        latencyStartMonitor(lazyfree_latency);
        // If lazy-free jobs are pending, wait a little for memory to be released and check whether usage is back within the limit
        while (bioPendingJobsOfType(BIO_LAZY_FREE) &&
              elapsedUs(evictionTimer) < eviction_time_limit_us) {
            if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                result = EVICT_OK;
                break;
            }
            usleep(eviction_time_limit_us < 1000 ? eviction_time_limit_us : 1000);
        }
        latencyEndMonitor(lazyfree_latency);
        latencyAddSampleIfNeeded("eviction-lazyfree",lazyfree_latency);
    }

    serverAssert(server.core_propagates); /* This function should not be re-entrant */

    /* Propagate all DELs */
    propagatePendingCommands();

    server.core_propagates = prev_core_propagates;
    server.propagate_no_multi = 0;

    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);

update_metrics:
    if (result == EVICT_RUNNING || result == EVICT_FAIL) {
        if (server.stat_last_eviction_exceeded_time == 0)
            elapsedStart(&server.stat_last_eviction_exceeded_time);
    } else if (result == EVICT_OK) {
        if (server.stat_last_eviction_exceeded_time != 0) {
            server.stat_total_eviction_exceeded_time += elapsedUs(server.stat_last_eviction_exceeded_time);
            server.stat_last_eviction_exceeded_time = 0;
        }
    }
    return result;
}
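
The random-policy branch of performEvictions rotates across databases with the static next_db counter. A minimal sketch of that rotation is shown below; pick_next_db and the sizes array are hypothetical, with each array element standing in for dictSize() of the dictionary being sampled.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the next non-empty database, scanning at most
 * `dbnum` databases starting after *next_db, or -1 if all are empty.
 * *next_db persists between calls, so successive evictions rotate
 * across databases instead of always draining the first one. */
int pick_next_db(const size_t *sizes, int dbnum, unsigned int *next_db) {
    assert(sizes != NULL && next_db != NULL && dbnum > 0);
    for (int i = 0; i < dbnum; i++) {
        int j = (int)((++(*next_db)) % (unsigned int)dbnum);
        if (sizes[j] != 0) return j;
    }
    return -1;
}
```

With four databases of sizes {3, 0, 7, 0} and the counter starting at 0, successive calls return 2, 0, 2, which shows how empty databases are skipped while the non-empty ones take turns.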

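Summary

Memory eviction is an important feature that keeps Redis running when memory runs short. With several eviction policies and an efficient sampling-based implementation, Redis automatically removes some data and frees memory once usage reaches the limit. Choosing the eviction policy and its parameters carefully helps keep Redis performant and reliable.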