cachetools: A Caching Toolkit for Python

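cachetools provides a variety of in-memory cache implementations. It can be used to cache function results, objects, and similar data, which improves performance by avoiding repeated computation.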

Installation

pip install cachetools

Cache

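Cache is the simplest cache in the library: a mutable mapping with a size limit. An empty dict {} can also be passed to the decorator as an unbounded cache. Taking a recursive Fibonacci function as an example, we can compare the running time with and without caching.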

import time
from cachetools import cached, Cache


# without cached
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)


# Now using cached
# Use this decorator to enable caching
@cached(cache=Cache(maxsize=50))
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)

Output:

102334155
Time Taken: 15.232901096343994
102334155
Time Taken: 0.0010046958923339844

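The results show a dramatic difference: about 15 seconds without caching versus about a millisecond with it. The gap widens as the input grows.

This is because the cache stores the intermediate results of the recursive calls. When a later call repeats an input that has already been seen, the cached result is returned directly instead of being recomputed.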
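
To make the saving concrete, the sketch below counts how often the function body actually runs. It uses a hand-written, unbounded dict memoizer as a conceptual stand-in for a caching decorator; `memoize` and `calls` are hypothetical names chosen for illustration and are not part of cachetools.

```python
import functools

calls = 0  # counts how many times the body of fib actually runs


def memoize(func):
    """Hypothetical stand-in for a caching decorator: an unbounded dict keyed by the argument."""
    store = {}

    @functools.wraps(func)
    def wrapper(n):
        if n not in store:
            store[n] = func(n)  # compute once, then reuse
        return store[n]

    return wrapper


@memoize
def fib(n):
    global calls
    calls += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


result = fib(30)
print(result, calls)  # 832040 31
```

Each input from 0 to 30 is computed exactly once, so the body runs 31 times. The same call without the decorator runs the body more than two million times.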

LRUCache (Least Recently Used)

LRUCache is a least-recently-used cache: when it is full, it evicts the item that has gone longest without being accessed.

from cachetools import LRUCache

cache = LRUCache(maxsize=3)

cache['a'] = 1
cache['b'] = 2
cache['c'] = 3
print(cache)
# LRUCache({'a': 1, 'b': 2, 'c': 3}, maxsize=3, currsize=3)

cache['d'] = 4
print(cache)
# LRUCache({'b': 2, 'c': 3, 'd': 4}, maxsize=3, currsize=3)
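
The example above only writes to the cache, but under an LRU policy reads count as use too. The following is a minimal sketch of that idea built on `collections.OrderedDict`; the `MiniLRU` class is hypothetical and is not how cachetools implements LRUCache.

```python
from collections import OrderedDict


class MiniLRU:
    """Illustrative LRU store (hypothetical; not the cachetools implementation)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()  # oldest use first, most recent use last

    def get(self, key):
        self.data.move_to_end(key)  # a read marks the key as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.maxsize:
            self.data.popitem(last=False)  # drop the least recently used key
        self.data[key] = value


lru = MiniLRU(maxsize=3)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    lru.put(key, value)

lru.get("a")     # "a" becomes the most recently used entry
lru.put("d", 4)  # so "b", now the oldest, is evicted instead of "a"
remaining = list(lru.data)
print(remaining)  # ['c', 'a', 'd']
```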

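It can also be used as a function decorator. With the maxsize parameter, the cache keeps the results of the n most recently used calls; a call whose result has been evicted runs the function again.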

import time
from cachetools import cached, LRUCache


# cache using LRUCache
@cached(cache=LRUCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))

# Takes no time
print(my_fun(3))

# Takes 2 seconds
print(my_fun(2))

# Takes 1 second
print(my_fun(1))

# Takes 4 seconds
print(my_fun(4))

# Takes no time
print(my_fun(1))

# Takes 3 seconds because maxsize = 3
# and the 3 recent used functions had 1, 2 and 4.
print(my_fun(3))

Output:

Time Taken:  3
I am executed: 3
I am executed: 3

Time Taken:  2
I am executed: 2

Time Taken:  1
I am executed: 1

Time Taken:  4
I am executed: 4
I am executed: 1

Time Taken:  3
I am executed: 3

The Python standard library offers a similar LRU decorator, functools.lru_cache, which needs no third-party package.

import time
from functools import lru_cache


# cache using functools.lru_cache
@lru_cache(3)
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"
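
A function wrapped by `functools.lru_cache` also exposes `cache_info()`, which makes hits and misses visible without timing anything. The short sketch below uses a hypothetical `square` function with no delay so that it runs instantly.

```python
from functools import lru_cache


@lru_cache(maxsize=3)
def square(n):
    return n * n


# 1 and 2 miss, 1 hits, 3 misses, 4 misses (evicting 2), 2 misses again
for n in (1, 2, 1, 3, 4, 2):
    square(n)

info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=5, maxsize=3, currsize=3)
```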

TTLCache (Time-To-Live)

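TTLCache gives each cache entry a time-to-live; entries are removed automatically once they expire. The TTLCache class takes two parameters, maxsize and ttl, which set the maximum size of the cache and the expiry time.

maxsize: the maximum number of entries in the cache. This limits the cache size and keeps it from using too much memory.
ttl: the expiry time in seconds, that is, how long a cached result is kept.

Once a cached result has been held longer than its ttl, it is removed automatically.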

import time
from cachetools import cached, TTLCache


# cache using TTLCache
@cached(cache=TTLCache(maxsize=3, ttl=10))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds
print(my_fun(4))
time.sleep(2)

# Takes no time
print(my_fun(3))
time.sleep(11)

# Takes 4 seconds because 4 is expired
print(my_fun(4))

Output:

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 3

Time Taken:  4
I am executed: 4
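
Waiting on `time.sleep` makes expiry slow to experiment with. The sketch below reproduces the idea with a hand-rolled store and an injectable fake clock, so it runs instantly. `MiniTTL` and its `timer` argument are hypothetical names for this illustration and do not show how cachetools implements TTLCache.

```python
class MiniTTL:
    """Illustrative TTL store (hypothetical; not the cachetools implementation)."""

    def __init__(self, ttl, timer):
        self.ttl = ttl
        self.timer = timer  # callable returning the current time in seconds
        self.data = {}      # key -> (value, expiry time)

    def put(self, key, value):
        self.data[key] = (value, self.timer() + self.ttl)

    def get(self, key, default=None):
        item = self.data.get(key)
        if item is None:
            return default
        value, expires = item
        if self.timer() >= expires:
            del self.data[key]  # expired entries are dropped lazily, on access
            return default
        return value


now = [0.0]  # fake clock, so the example needs no sleeping
cache = MiniTTL(ttl=10, timer=lambda: now[0])

cache.put("a", 1)
now[0] = 9.0
before = cache.get("a")  # still alive after 9 seconds
now[0] = 10.0
after = cache.get("a")   # expired once 10 seconds have passed
print(before, after)  # 1 None
```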

When the number of cached results exceeds maxsize, the least recently used entry is evicted (LRU) to make room for the new one.

import time
from cachetools import cached, TTLCache


# cache using TTLCache
@cached(cache=TTLCache(maxsize=3, ttl=20))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 1 second
print(my_fun(1))
# Takes 2 seconds
print(my_fun(2))
# Takes no time
print(my_fun(2))
# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds
print(my_fun(4))
# Takes 1 second: 1 has not expired (ttl=20), but it was the least
# recently used entry and was evicted when 4 was added
print(my_fun(1))

Output:

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4

Time Taken:  1
I am executed: 1

LFUCache (Least Frequently Used)

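LFUCache is a least-frequently-used cache: it evicts the items that are accessed least often. When the number of cached results exceeds maxsize, it discards the least frequently called entry to make room.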

import time
from cachetools import cached, LFUCache


# cache using LFUCache
@cached(cache=LFUCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate repeated calls
print(my_fun(1))
print(my_fun(1))

print(my_fun(2))
print(my_fun(2))

print(my_fun(3))

# New calls
print(my_fun(4))
print(my_fun(4))

# 3 is no longer cached
print(my_fun(3))
# 2 is still cached
print(my_fun(2))

Output:

Time Taken:  1
I am executed: 1
I am executed: 1

Time Taken:  2
I am executed: 2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 4

Time Taken:  3
I am executed: 3
I am executed: 2
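
The bookkeeping behind this policy can be sketched with a use counter per key. The `MiniLFU` class below is hypothetical and only illustrates the idea; its tie-breaking rule (the earliest-inserted key loses) is an assumption of this sketch, not a statement about cachetools.

```python
from collections import Counter


class MiniLFU:
    """Illustrative LFU store (hypothetical; not the cachetools implementation)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}
        self.uses = Counter()  # key -> number of reads and writes

    def get(self, key):
        value = self.data[key]
        self.uses[key] += 1
        return value

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.maxsize:
            # Fewest uses loses; on a tie, min() returns the earliest-inserted key.
            victim = min(self.data, key=lambda k: self.uses[k])
            del self.data[victim]
            del self.uses[victim]
        self.data[key] = value
        self.uses[key] += 1


lfu = MiniLFU(maxsize=3)
lfu.put(1, "one")
lfu.get(1)
lfu.put(2, "two")
lfu.get(2)
lfu.put(3, "three")  # used only once so far
lfu.put(4, "four")   # cache is full: 3 has the fewest uses and is evicted
remaining = list(lfu.data)
print(remaining)  # [1, 2, 4]
```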

RRCache (Random Replacement)

When the number of cached results exceeds maxsize, RRCache discards a randomly chosen entry to make room.

import time
from cachetools import cached, RRCache


# cache using RRCache
@cached(cache=RRCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate several calls
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))
print(my_fun(4))

# Sometimes a hit, sometimes a miss
print(my_fun(1))

If the entry for 1 is still cached, the output is:

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 1

If the entry for 1 has been evicted, the output is:

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4

Time Taken:  1
I am executed: 1
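
Because the victim is random, this behaviour is awkward to test. One way around that, sketched below with a hypothetical `MiniRR` class, is to make the chooser injectable and swap in a deterministic function. cachetools' RRCache accepts a `choice` argument for a similar purpose; the class here is only an illustration, not the library's implementation.

```python
import random


class MiniRR:
    """Illustrative random-replacement store (hypothetical; not the cachetools implementation)."""

    def __init__(self, maxsize, choice=random.choice):
        self.maxsize = maxsize
        self.choice = choice  # picks the victim from a list of keys
        self.data = {}

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.maxsize:
            victim = self.choice(list(self.data))
            del self.data[victim]
        self.data[key] = value


rr = MiniRR(maxsize=3)
for n in (1, 2, 3, 4):
    rr.put(n, n * n)
survivors = sorted(rr.data)
print(survivors)  # 4 plus two of 1, 2, 3; which two varies between runs

# Swapping in a deterministic chooser makes the behaviour repeatable.
fixed = MiniRR(maxsize=3, choice=min)
for n in (1, 2, 3, 4):
    fixed.put(n, n * n)
fixed_survivors = sorted(fixed.data)
print(fixed_survivors)  # [2, 3, 4]
```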

FIFOCache (First In First Out)

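FIFOCache uses a first-in, first-out eviction policy, much like an ordinary queue. When the number of cached results exceeds maxsize, the entry that was cached earliest is discarded first.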

import time
from cachetools import cached, FIFOCache


# cache using FIFOCache
@cached(cache=FIFOCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate several calls
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))

print(my_fun(4))  # 1 is evicted first
print(my_fun(2))  # 2 is still cached
print(my_fun(1))  # 1 is no longer cached, so the function runs again

Output:

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 2

Time Taken:  1
I am executed: 1
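
The key difference from LRU is that reading an entry does not protect it from eviction. A minimal sketch of that property follows, relying on the insertion order of a plain dict; the `MiniFIFO` class is hypothetical and is not the cachetools implementation.

```python
class MiniFIFO:
    """Illustrative FIFO store (hypothetical; not the cachetools implementation)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}  # dicts preserve insertion order (Python 3.7+)

    def get(self, key):
        return self.data[key]  # reads do not change the eviction order

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.maxsize:
            oldest = next(iter(self.data))  # first key ever inserted
            del self.data[oldest]
        self.data[key] = value


fifo = MiniFIFO(maxsize=3)
for n in (1, 2, 3):
    fifo.put(n, n * n)

fifo.get(1)      # reading 1 does not protect it
fifo.put(4, 16)  # 1 is still the oldest insertion, so it is evicted
remaining = list(fifo.data)
print(remaining)  # [2, 3, 4]
```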
