cachetools: In-Memory Caching Tools for Python

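cachetools provides a collection of in-memory cache implementations. It can be used to cache function results, objects, and similar data, which improves performance by avoiding repeated computation.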

Installation

pip install cachetools

Cache

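Cache is the simplest cache, implemented as a mutable mapping with a maximum size. Because the cached decorator accepts any mutable mapping, you can also pass it a plain empty dict {} to get an unbounded cache. Taking the recursive computation of Fibonacci numbers as an example, we can compare the running time with and without a cache.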

import time
from cachetools import cached, Cache


# without cached
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)


# Now using cached
# Use this decorator to enable caching
@cached(cache=Cache(maxsize=50))
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)
Output:
102334155
Time Taken: 15.232901096343994
102334155
Time Taken: 0.0010046958923339844

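The results show that the uncached version takes about 15 seconds, while the cached version takes about 1 millisecond. The gap widens as the input grows, because the number of calls made by the uncached recursion grows exponentially with n.

The speedup comes from the cache storing the intermediate results of the recursive calls. When a later call has the same input, the cached result is returned directly instead of being recomputed, so each value is computed only once.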
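
As a small sketch of the plain-dict variant mentioned above, the example below passes an empty dict to cached and counts how often the function body actually runs. The call_count counter is a name introduced here purely for illustration.

```python
from cachetools import cached

call_count = 0  # illustrative counter, not part of cachetools


# A plain dict works as an unbounded cache
@cached(cache={})
def fib(n):
    global call_count
    call_count += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


result = fib(40)
print(result)      # 102334155
print(call_count)  # 41: the body runs once for each n from 0 to 40
```

Since a dict never evicts anything, this variant is only reasonable when the set of distinct arguments is known to be small.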

LRUCache(Least Recently Used)

LRUCache is a least-recently-used cache: when it is full, it evicts the item that has gone the longest without being accessed.

from cachetools import LRUCache

cache = LRUCache(maxsize=3)

cache['a'] = 1
cache['b'] = 2
cache['c'] = 3
print(cache)
# LRUCache({'a': 1, 'b': 2, 'c': 3}, maxsize=3, currsize=3)

cache['d'] = 4
print(cache)
# LRUCache({'b': 2, 'c': 3, 'd': 4}, maxsize=3, currsize=3)
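
Reading an item also counts as using it. The following sketch, with made-up keys, shows that touching 'a' before inserting a fourth item changes which key gets evicted.

```python
from cachetools import LRUCache

cache = LRUCache(maxsize=3)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3

# Reading 'a' marks it as recently used, so 'b' becomes the oldest entry
_ = cache['a']

# The cache is full, so inserting 'd' evicts the least recently used key
cache['d'] = 4

remaining = sorted(cache.keys())
print(remaining)  # ['a', 'c', 'd']
```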

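It can also be applied to a function as a decorator. With maxsize=n, the cache keeps the results for the n most recently used arguments. Once an argument's entry has been evicted, calling the function with that argument again re-executes the function.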

import time
from cachetools import cached, LRUCache


# cache using LRUCache
@cached(cache=LRUCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))

# Takes no time
print(my_fun(3))

# Takes 2 seconds
print(my_fun(2))

# Takes 1 second
print(my_fun(1))

# Takes 4 seconds
print(my_fun(4))

# Takes no time
print(my_fun(1))

# Takes 3 seconds because maxsize = 3
# and the 3 most recently used arguments were 1, 2 and 4.
print(my_fun(3))
Output:
Time Taken:  3
I am executed: 3
I am executed: 3

Time Taken:  2
I am executed: 2

Time Taken:  1
I am executed: 1

Time Taken:  4
I am executed: 4
I am executed: 1

Time Taken:  3
I am executed: 3
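
The same behaviour can be observed without sleeping. The sketch below keeps a reference to the cache object and records which calls really execute; the names double and calls are hypothetical and only serve the illustration.

```python
from cachetools import cached, LRUCache

cache = LRUCache(maxsize=2)
calls = []  # records every argument for which the body actually runs


@cached(cache=cache)
def double(n):
    calls.append(n)
    return n * 2


double(1)  # miss
double(2)  # miss
double(1)  # hit, and 1 becomes the most recently used entry
double(3)  # miss, evicts 2 (the least recently used)
double(2)  # miss again, because 2 was evicted

print(calls)       # [1, 2, 3, 2]
print(len(cache))  # 2
```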

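The Python standard library offers a similar LRU cache as a decorator, functools.lru_cache. Note that it is a separate implementation, not the cachetools LRUCache class.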

import time
from functools import lru_cache


# cache using functools.lru_cache
@lru_cache(maxsize=3)
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"
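
A function wrapped with functools.lru_cache also exposes hit and miss statistics through cache_info(). The sketch below uses a hypothetical square function to show this.

```python
from functools import lru_cache


@lru_cache(maxsize=3)
def square(n):
    return n * n


square(2)  # miss
square(2)  # hit
square(3)  # miss

info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=2, maxsize=3, currsize=2)
```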

TTLCache(Time-To-Live)

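TTLCache gives each cached item a time-to-live and treats it as gone once that time has passed. maxsize and ttl are the two TTLCache parameters used here; they set the maximum size of the cache and the expiration time.

  • maxsize: the maximum number of items in the cache. This limits the cache size and keeps it from holding too many entries in memory.
  • ttl: the time-to-live, that is, how long a result may stay in the cache. With the default timer it is measured in seconds.

Once a cached result is older than its time-to-live, it is no longer returned and is removed from the cache.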

import time
from cachetools import cached, TTLCache


# cache using TTLCache
@cached(cache=TTLCache(maxsize=3, ttl=10))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds
print(my_fun(4))
time.sleep(2)

# Takes no time
print(my_fun(3))
time.sleep(11)

# Takes 4 seconds because the entry for 4 has expired
print(my_fun(4))
Output:
Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 3

Time Taken:  4
I am executed: 4
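
Expiry can also be checked without waiting, since TTLCache accepts a timer argument. The sketch below substitutes a fake clock for the default timer; FakeClock is a helper made up for this illustration and is not part of cachetools.

```python
from cachetools import TTLCache


class FakeClock:
    """Hypothetical manual clock; call it to read the current time."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
cache = TTLCache(maxsize=3, ttl=10, timer=clock)

cache['a'] = 1  # stored at time 0, so it expires at time 10

clock.now = 5
alive_at_5 = 'a' in cache
print(alive_at_5)   # True

clock.now = 11
alive_at_11 = 'a' in cache
print(alive_at_11)  # False
print(cache.get('a'))  # None
```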

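When the number of cached results exceeds maxsize and no expired entries are left to remove, TTLCache evicts the least recently used (LRU) entries to make room for new ones.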

import time
from cachetools import cached, TTLCache


# cache using TTLCache
@cached(cache=TTLCache(maxsize=3, ttl=20))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 1 second
print(my_fun(1))
# Takes 2 seconds
print(my_fun(2))
# Takes no time
print(my_fun(2))
# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds
print(my_fun(4))
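# Takes 1 second because 1 was evicted as the least recently used entry
# (it has not expired: only about 10 of the 20 seconds have passed)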
print(my_fun(1))
Output:
Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4

Time Taken:  1
I am executed: 1

LFUCache(Least Frequently Used)

LFUCache is a least-frequently-used cache. When the number of cached results exceeds maxsize, it evicts the item that has been accessed the fewest times to make room.

import time
from cachetools import cached, LFUCache


# cache using LFUCache
@cached(cache=LFUCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate repeated calls
print(my_fun(1))
print(my_fun(1))

print(my_fun(2))
print(my_fun(2))

print(my_fun(3))

# Calls with a new argument
print(my_fun(4))
print(my_fun(4))

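# The entry for 3 was evicted when 4 was cached, so this call runs again
print(my_fun(3))
# 1, 2 and 4 were tied on use count, so which one was evicted is up to the
# implementation; in the run shown, the entry for 2 is still cached
print(my_fun(2))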
Output:
Time Taken:  1
I am executed: 1
I am executed: 1

Time Taken:  2
I am executed: 2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 4

Time Taken:  3
I am executed: 3
I am executed: 2
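
To see the frequency rule in isolation, the sketch below uses LFUCache directly as a mapping with made-up keys, choosing the access counts so that there is no tie.

```python
from cachetools import LFUCache

cache = LFUCache(maxsize=2)
cache['a'] = 1
cache['b'] = 2

# Read 'a' twice so that it is clearly used more often than 'b'
_ = cache['a']
_ = cache['a']

# The cache is full, so inserting 'c' evicts the least frequently used key
cache['c'] = 3

remaining = sorted(cache.keys())
print(remaining)  # ['a', 'c']
```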

RRCache(Random Replacement)

RRCache is a random-replacement cache: when the number of cached results exceeds maxsize, it evicts a randomly chosen entry to make room.

import time
from cachetools import cached, RRCache


# cache using RRCache
@cached(cache=RRCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate repeated calls
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))
print(my_fun(4))

# Sometimes a hit, sometimes a miss
print(my_fun(1))

If the entry for 1 has not been evicted, the output is as follows.

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 1

If the entry for 1 has been evicted, the output is as follows.

Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4

Time Taken:  1
I am executed: 1
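
RRCache takes a choice argument that selects the key to evict. The sketch below passes in the choice method of a seeded random.Random instance so that runs are repeatable; the seed and the keys are assumptions made for the illustration. It deliberately checks only what holds for any random choice: the new key is present and exactly one old key is gone.

```python
import random

from cachetools import RRCache

rng = random.Random(0)  # seeded generator, chosen here only for repeatability
cache = RRCache(maxsize=3, choice=rng.choice)

cache['a'] = 1
cache['b'] = 2
cache['c'] = 3

# The cache is full, so one of 'a', 'b', 'c' is evicted at random
cache['d'] = 4

survivors = [k for k in 'abc' if k in cache]
print(len(cache))      # 3
print('d' in cache)    # True
print(len(survivors))  # 2
```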

FIFOCache(First In First Out)

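FIFOCache uses a first-in, first-out eviction policy, much like an ordinary queue. When the number of cached results exceeds maxsize, the entry that was cached first is evicted first.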

import time
from cachetools import cached, FIFOCache


# cache using FIFOCache
@cached(cache=FIFOCache(maxsize=3))
def my_fun(n):
    # This delay resembles some task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate repeated calls
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))

print(my_fun(4))  # 1 is evicted first
print(my_fun(2))  # the entry for 2 is still valid
print(my_fun(1))  # the entry for 1 was evicted, so the function runs again
Output:
Time Taken:  1
I am executed: 1

Time Taken:  2
I am executed: 2

Time Taken:  3
I am executed: 3

Time Taken:  4
I am executed: 4
I am executed: 2

Time Taken:  1
I am executed: 1
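
Unlike LRUCache, FIFOCache ignores reads when deciding what to evict. The sketch below, with made-up keys, reads the oldest key just before the cache overflows and shows that it is evicted anyway.

```python
from cachetools import FIFOCache

cache = FIFOCache(maxsize=3)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3

# Reading 'a' does not change its position in the queue
_ = cache['a']

# The cache is full, so the first key inserted ('a') is evicted
cache['d'] = 4

remaining = sorted(cache.keys())
print(remaining)  # ['b', 'c', 'd']
```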