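cachetools provides a variety of in-memory cache implementations. It suits scenarios such as caching function results or caching objects, where it can improve performance by avoiding repeated computation.
Installation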
pip install cachetools
Cache
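Cache is a simple mutable mapping that acts as a cache; a plain empty dict {} can also be passed to the decorator instead. Taking a recursive Fibonacci function as an example, we can compare the running time with and without a cache.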
```python
import time
from cachetools import cached, Cache


# Without caching
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)


# With caching: the decorator stores each result in the given cache
@cached(cache=Cache(maxsize=50))
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


start_time = time.time()
print(fib(40))
end_time = time.time()
print("Time Taken:", end_time - start_time)
```
```text
102334155
Time Taken: 15.232901096343994
102334155
Time Taken: 0.0010046958923339844
```
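The results show a large gap, and the gap widens as the input grows, because the uncached recursion recomputes the same subproblems over and over.
The cache stores the intermediate results of the recursive calls. When a later call repeats an input that has already been computed, the cached result is returned directly, which is where the speedup comes from.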
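To make the saving concrete, the sketch below counts how often the function body runs with and without memoization. The `memoize` decorator and the `calls` counter are hypothetical names introduced for illustration; `memoize` is a simplified stand-in backed by an unbounded dict, not the implementation of `cachetools.cached`.

```python
from functools import wraps


def memoize(func):
    """Simplified stand-in for a caching decorator, backed by a plain dict."""
    store = {}

    @wraps(func)
    def wrapper(n):
        if n not in store:      # miss: compute once and remember the result
            store[n] = func(n)
        return store[n]         # hit: return the stored result

    return wrapper


calls = {"plain": 0, "memo": 0}


def fib_plain(n):
    calls["plain"] += 1
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@memoize
def fib_memo(n):
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


plain_result = fib_plain(20)
memo_result = fib_memo(20)
print(plain_result, calls["plain"])  # 6765 21891
print(memo_result, calls["memo"])    # 6765 21
```

With memoization the body runs once per distinct input (21 times for inputs 0 to 20), while the plain recursion runs it 21891 times for the same answer.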
LRUCache (Least Recently Used)
LRUCache is a least-recently-used cache: when it is full, it evicts the item that has gone longest without being accessed.
```python
from cachetools import LRUCache

cache = LRUCache(maxsize=3)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3
print(cache)
# LRUCache({'a': 1, 'b': 2, 'c': 3}, maxsize=3, currsize=3)

# The cache is full, so adding 'd' evicts 'a', the least recently used key
cache['d'] = 4
print(cache)
# LRUCache({'b': 2, 'c': 3, 'd': 4}, maxsize=3, currsize=3)
```
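In an LRU policy, reads count as use, so the eviction order depends on access and not only on insertion order. The following is a minimal sketch of that rule built on `collections.OrderedDict`; the class `LRUSketch` and its methods are hypothetical and are not how cachetools implements `LRUCache`.

```python
from collections import OrderedDict


class LRUSketch:
    """Illustrative LRU mapping (hypothetical, not cachetools' code)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        self._data.move_to_end(key)  # a read makes the key most recent
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # drop the least recently used key
        self._data[key] = value

    def keys(self):
        return list(self._data)


cache = LRUSketch(maxsize=3)
for key, value in (("a", 1), ("b", 2), ("c", 3)):
    cache.put(key, value)

cache.get("a")       # reading 'a' refreshes it
cache.put("d", 4)    # the cache is full, so 'b' is evicted instead of 'a'
print(cache.keys())  # ['c', 'a', 'd']
```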
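It can also back a function decorator. With the maxsize parameter, the cache keeps the results for the maxsize most recently used arguments; once an argument's entry has been evicted, calling the function with that argument runs the function again.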
```python
import time
from cachetools import cached, LRUCache


# Cache results in an LRUCache that holds at most 3 entries
@cached(cache=LRUCache(maxsize=3))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))
# Takes no time (cache hit)
print(my_fun(3))
# Takes 2 seconds
print(my_fun(2))
# Takes 1 second
print(my_fun(1))
# Takes 4 seconds; the cache is full, so the entry for 3 is evicted
print(my_fun(4))
# Takes no time (cache hit)
print(my_fun(1))
# Takes 3 seconds: with maxsize=3 the cache holds only 1, 2 and 4
print(my_fun(3))
```
```text
Time Taken: 3
I am executed: 3
I am executed: 3
Time Taken: 2
I am executed: 2
Time Taken: 1
I am executed: 1
Time Taken: 4
I am executed: 4
I am executed: 1
Time Taken: 3
I am executed: 3
```
The Python standard library offers a comparable LRU decorator, functools.lru_cache, which can be used without installing cachetools.
```python
import time
from functools import lru_cache


# Cache results with the standard library's lru_cache
@lru_cache(maxsize=3)
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"
```
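Functions wrapped by `functools.lru_cache` expose a `cache_info()` method, which makes hits and misses visible without timing anything. The `square` function below is a hypothetical example used only to exercise the cache.

```python
from functools import lru_cache


@lru_cache(maxsize=3)
def square(n):
    return n * n


# 1 and 2 miss, 1 hits, 3 misses, 4 misses (evicting 2), 1 hits again
for n in (1, 2, 1, 3, 4, 1):
    square(n)

info = square.cache_info()
print(info)  # CacheInfo(hits=2, misses=4, maxsize=3, currsize=3)
```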
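TTLCache (Time-To-Live)
TTLCache gives each cached item a time-to-live; once an item has expired it is no longer returned and is removed from the cache. Its two main parameters, maxsize and ttl, set the maximum size and the expiry time.
- maxsize: the maximum number of items in the cache. This bounds the cache so that it cannot hold too many entries and use too much memory.
- ttl: the time-to-live of each item (in seconds with the default timer), i.e. how long a cached result remains valid.
When a cached result has been stored for longer than ttl, it expires and is dropped.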
```python
import time
from cachetools import cached, TTLCache


# Cache results in a TTLCache: at most 3 entries, each valid for 10 seconds
@cached(cache=TTLCache(maxsize=3, ttl=10))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds
print(my_fun(4))
time.sleep(2)
# Takes no time: the entry for 3 is still within its ttl
print(my_fun(3))
time.sleep(11)
# Takes 4 seconds because the entry for 4 has expired
print(my_fun(4))
```
```text
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
I am executed: 3
Time Taken: 4
I am executed: 4
```
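The expiry rule can be illustrated without waiting in real time. Below is a minimal sketch with an injectable clock; `TTLSketch`, its `clock` argument, and the `now` list are hypothetical names for illustration, and the class is not cachetools' `TTLCache` implementation (it ignores maxsize entirely).

```python
class TTLSketch:
    """Illustrative TTL store (hypothetical, not cachetools' code)."""

    def __init__(self, ttl, clock):
        self.ttl = ttl
        self.clock = clock  # callable returning the current time
        self._data = {}     # key -> (value, expires_at)

    def put(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._data[key]  # lazily drop the expired entry
            return default
        return value


now = [0.0]  # fake clock that the example advances by hand
cache = TTLSketch(ttl=10, clock=lambda: now[0])
cache.put("a", 1)

now[0] = 9.0
fresh = cache.get("a")    # still within the 10-second ttl
now[0] = 10.0
expired = cache.get("a")  # ttl reached, so the entry is gone
print(fresh, expired)     # 1 None
```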
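When the number of cached results exceeds maxsize and there are no expired items to remove, the least recently used (LRU) entries are discarded to make room for new ones.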
```python
import time
from cachetools import cached, TTLCache


# Cache results in a TTLCache: at most 3 entries, each valid for 20 seconds
@cached(cache=TTLCache(maxsize=3, ttl=20))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Takes 1 second
print(my_fun(1))
# Takes 2 seconds
print(my_fun(2))
# Takes no time (cache hit)
print(my_fun(2))
# Takes 3 seconds
print(my_fun(3))
# Takes 4 seconds; the cache is full, so the LRU entry (1) is evicted
print(my_fun(4))
# Takes 1 second: 1 was evicted for space, although its ttl had not run out
print(my_fun(1))
```
```text
Time Taken: 1
I am executed: 1
Time Taken: 2
I am executed: 2
I am executed: 2
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
Time Taken: 1
I am executed: 1
```
LFUCache (Least Frequently Used)
LFUCache is a least-frequently-used cache: when the number of cached results exceeds maxsize, it discards the entry that has been used least often to make room.
```python
import time
from cachetools import cached, LFUCache


# Cache results in an LFUCache that holds at most 3 entries
@cached(cache=LFUCache(maxsize=3))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate repeated calls: 1 and 2 are used twice, 3 only once
print(my_fun(1))
print(my_fun(1))
print(my_fun(2))
print(my_fun(2))
print(my_fun(3))
# New argument: the cache is full, so the least used entry (3) is evicted
print(my_fun(4))
print(my_fun(4))
# The entry for 3 is gone, so the function runs again
print(my_fun(3))
# The entry for 2 is still cached in this run
print(my_fun(2))
```
```text
Time Taken: 1
I am executed: 1
I am executed: 1
Time Taken: 2
I am executed: 2
I am executed: 2
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
I am executed: 4
Time Taken: 3
I am executed: 3
I am executed: 2
```
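The frequency-based choice of victim can be sketched with a use counter per key. `LFUSketch` is a hypothetical illustration and does not mirror cachetools' internals; in particular, how ties between equally used keys are broken here (the oldest key loses) is an assumption of this sketch.

```python
from collections import Counter


class LFUSketch:
    """Illustrative LFU mapping (hypothetical, not cachetools' code)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = {}
        self._uses = Counter()  # key -> number of reads and writes

    def get(self, key):
        value = self._data[key]
        self._uses[key] += 1
        return value

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            # Evict the key with the fewest uses
            victim = min(self._data, key=lambda k: self._uses[k])
            del self._data[victim]
            del self._uses[victim]
        self._data[key] = value
        self._uses[key] += 1

    def keys(self):
        return list(self._data)


cache = LFUSketch(maxsize=3)
cache.put(1, "one")
cache.get(1)          # 1 has been used twice
cache.put(2, "two")
cache.get(2)          # 2 has been used twice
cache.put(3, "three") # 3 has been used once
cache.put(4, "four")  # the cache is full, so 3 is evicted
print(cache.keys())   # [1, 2, 4]
```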
RRCache (Random Replacement)
When the number of cached results exceeds maxsize, RRCache discards a randomly chosen entry to make room.
```python
import time
from cachetools import cached, RRCache


# Cache results in an RRCache that holds at most 3 entries
@cached(cache=RRCache(maxsize=3))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate several calls; adding 4 evicts one of 1, 2, 3 at random
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))
print(my_fun(4))
# Sometimes a hit, sometimes a miss
print(my_fun(1))
```
If the entry for 1 is still cached, the output is as follows.
```text
Time Taken: 1
I am executed: 1
Time Taken: 2
I am executed: 2
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
I am executed: 1
```
If the entry for 1 has been evicted, the output is as follows.
```text
Time Taken: 1
I am executed: 1
Time Taken: 2
I am executed: 2
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
Time Taken: 1
I am executed: 1
```
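Random replacement is the simplest policy to sketch: when the cache is full, any existing key may be the victim. `RRSketch` below is a hypothetical illustration, not cachetools' code; the injected, seeded `random.Random` instance is an assumption made so that a run is repeatable.

```python
import random


class RRSketch:
    """Illustrative random-replacement mapping (hypothetical)."""

    def __init__(self, maxsize, rng):
        self.maxsize = maxsize
        self._rng = rng  # source of randomness, injected for repeatability
        self._data = {}

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            victim = self._rng.choice(list(self._data))  # any key may go
            del self._data[victim]
        self._data[key] = value

    def keys(self):
        return list(self._data)


cache = RRSketch(maxsize=3, rng=random.Random(0))
for n in (1, 2, 3, 4):
    cache.put(n, f"value {n}")

# 4 is always present; exactly one of 1, 2, 3 has been evicted
print(cache.keys())
```

Which of the first three keys survives depends on the random choice, which is why the decorated function above sometimes hits and sometimes misses.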
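FIFOCache (First In First Out)
FIFOCache uses a first-in, first-out eviction policy, much like an ordinary queue: when the number of cached results exceeds maxsize, the entry that was cached earliest is discarded first.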
```python
import time
from cachetools import cached, FIFOCache


# Cache results in a FIFOCache that holds at most 3 entries
@cached(cache=FIFOCache(maxsize=3))
def my_fun(n):
    # The sleep simulates an expensive task
    start_time = time.time()
    time.sleep(n)
    end_time = time.time()
    print("\nTime Taken: ", int(end_time - start_time))
    return f"I am executed: {n}"


# Simulate several calls
print(my_fun(1))
print(my_fun(2))
print(my_fun(3))
print(my_fun(4))  # 1 was cached first, so it is evicted here
print(my_fun(2))  # the entry for 2 is still cached
print(my_fun(1))  # the entry for 1 is gone, so the function runs again
```
```text
Time Taken: 1
I am executed: 1
Time Taken: 2
I am executed: 2
Time Taken: 3
I am executed: 3
Time Taken: 4
I am executed: 4
I am executed: 2
Time Taken: 1
I am executed: 1
```
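The difference from LRU is that reads do not change the eviction order; only insertion order matters. A minimal sketch of that rule follows; `FIFOSketch` is a hypothetical class for illustration and is not cachetools' implementation.

```python
from collections import OrderedDict


class FIFOSketch:
    """Illustrative FIFO mapping (hypothetical, not cachetools' code)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()  # insertion order, oldest first

    def get(self, key):
        return self._data[key]  # reads do not reorder anything

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # drop the oldest insertion
        self._data[key] = value

    def keys(self):
        return list(self._data)


cache = FIFOSketch(maxsize=3)
for key, value in (("a", 1), ("b", 2), ("c", 3)):
    cache.put(key, value)

cache.get("a")       # reading 'a' does not protect it
cache.put("d", 4)    # 'a' was inserted first, so it is evicted
print(cache.keys())  # ['b', 'c', 'd']
```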