Functools -- lru_cache

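🟦 1. What is lru_cache?

lru_cache is a function-caching decorator from Python's standard library:

It remembers the mapping from arguments to return value
A later call with the same arguments returns the cached result without running the function body
The speedup depends on how expensive the function is and how often calls repeat; for slow, frequently repeated calls it can reach several orders of magnitude
It implements LRU (Least Recently Used) eviction: when the cache is full, the entry that has gone unused the longest is dropped

Import it from functools: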

from functools import lru_cache
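
To make the "arguments -> return value" memory concrete, here is a minimal sketch using a recursive Fibonacci function. The names fib and calls are made up for this illustration; the counter exists only to show how many times the function body actually runs.

```python
from functools import lru_cache

calls = 0  # counts how often the function body really executes

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
print(result)            # 832040
print(calls)             # 31: each n from 0 to 30 is computed exactly once
print(fib.cache_info())  # hits and misses recorded by the cache
```

Without the decorator the same call runs the body more than a million times, which is why recursive functions with overlapping subproblems benefit so much.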

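🟩 2. Basic usage

from functools import lru_cache

@lru_cache()
def add(a, b):
    print("Running function...")
    return a + b

print(add(1, 2))
print(add(1, 2))  # second call is served from the cache

Output:

Running function...
3
3

The second call does not print "Running function...", which shows the cache was hit.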

🟦 3. Setting the cache size

@lru_cache(maxsize=256)
def expensive():
    ...

maxsize=None means the cache is unbounded and nothing is ever evicted
The default is maxsize=128

🟩 4. Inspecting cache statistics

print(add.cache_info())

The output looks like:

CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)

🟦 5. Clearing the cache

add.cache_clear()

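🟩 6. Argument type requirements (very important!)

Every argument must be hashable (usable as a dict key), because lru_cache builds its cache key from the arguments.
Common hashable types:
int
float
str
tuple (its elements must also be hashable)
bool
None

Not hashable, so they cannot be passed to a cached function:

list
dict
set
Instances of custom classes that define __eq__ without __hash__ (plain class instances are hashable by identity by default)

For example, calling this function with a list raises TypeError (the definition itself is fine; the error occurs at call time):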

@lru_cache()
def f(arr: list):
    return sum(arr)

Fix:

✅ Use a tuple instead:

@lru_cache()
def f(arr: tuple):
    return sum(arr)
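
A small sketch of both cases side by side: passing a list fails at call time, while converting to a tuple at the call site works. The function name total is used only for this illustration.

```python
from functools import lru_cache

@lru_cache()
def total(values: tuple) -> int:
    return sum(values)

# Passing a list fails when lru_cache tries to hash the argument.
try:
    total([1, 2, 3])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # mentions that list is an unhashable type

# Converting to a tuple at the call site makes the call cacheable.
data = [1, 2, 3]
result = total(tuple(data))
print(result)  # 6
```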

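🟦 7. Common use cases (engineering practice)

① Loading configuration files (the most common)

import json

@lru_cache()
def load_config(path: str) -> dict:
    with open(path) as f:
        return json.load(f)

No matter how many times it is called, each path is read from disk only once. Every caller receives the same dict object, so mutating it affects all later callers.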
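
Because the cached dict is shared by every caller, one possible safeguard is to return a read-only view. The sketch below wraps the result in types.MappingProxyType; this is one option, not the only one, and it protects only the top level (nested dicts and lists stay mutable). The temporary file exists only to make the example runnable.

```python
import json
import os
import tempfile
from functools import lru_cache
from types import MappingProxyType

@lru_cache(maxsize=None)
def load_config(path: str) -> MappingProxyType:
    with open(path, encoding="utf-8") as f:
        # Read-only view over the parsed dict (top level only).
        return MappingProxyType(json.load(f))

# Demo setup: write a throwaway config file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"debug": True}, tmp)
    config_path = tmp.name

first = load_config(config_path)
second = load_config(config_path)  # served from the cache, file not reopened
os.remove(config_path)

print(first is second)  # True
print(first["debug"])   # True
```

Assigning to first["debug"] raises TypeError, so an accidental write cannot silently change what other callers see.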

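② Caching network data (your crawler will use this)

import requests

@lru_cache(maxsize=256)
def fetch_stock(code: str):
    response = requests.get(f"https://api.example.com/{code}", timeout=10)
    response.raise_for_status()  # raise on HTTP errors so an error body is not cached
    return response.json()

Reduces the number of API calls and speeds things up. lru_cache has no expiry, so a cached response is returned until it is evicted or cache_clear() is called; this fits data that rarely changes.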

③ Conversion functions (a good fit for your own stock.coderename)

@lru_cache()
def coderename(code: str, restore=False):
    ...

Stock code format conversion is called very frequently, so caching pays off well.

④ Reading lookup tables (e.g. industry classification, sector mappings)

@lru_cache()
def get_industry_map():
    return db.query("select * from industry")

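⑤ Caching compiled regular expressions (recommended)

import re

@lru_cache()
def get_pattern(p):
    return re.compile(p)

The re module also caches recently compiled patterns internally, so the gain here is mainly for code that uses a large number of distinct patterns.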

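🟦 8. lru_cache cannot be applied directly to async functions

This is a common pitfall.

Wrong:

@lru_cache()
async def get_data():
    ...

The decorator itself does not raise. Calling an async function returns a coroutine object, and that object is what gets cached. A coroutine can be awaited only once, so the second call returns the same, already awaited coroutine, and awaiting it raises RuntimeError.

✅ Workaround: cache a synchronous function and run it in a worker thread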

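import anyio.to_thread
from functools import lru_cache

@lru_cache()
def cached_fetch(key):
    # fetch_sync is your own blocking function, defined elsewhere
    return fetch_sync(key)

async def fetch_async(key):
    # run the blocking, cached call in a worker thread so the event loop is not blocked
    return await anyio.to_thread.run_sync(cached_fetch, key)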
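
The pitfall can be reproduced with the standard library alone, and a plain dict is enough for a simple alternative that caches results instead of coroutines. This is a rough sketch, not a full solution: broken, fetch, and cached_fetch_async are illustrative names, the dict never evicts anything, and two concurrent callers asking for the same key may both run the fetch.

```python
import asyncio
from functools import lru_cache

@lru_cache()
async def broken(key: str) -> str:
    return key.upper()

async def demo_broken() -> str:
    await broken("a")          # first await consumes the cached coroutine
    try:
        await broken("a")      # the cache returns the same coroutine object
    except RuntimeError as exc:
        return str(exc)
    return "no error"

_results: dict = {}   # key -> finished result
fetch_calls = 0       # counts real fetches, for demonstration only

async def fetch(key: str) -> str:
    """Stand-in for a real async request."""
    global fetch_calls
    fetch_calls += 1
    await asyncio.sleep(0)
    return key.upper()

async def cached_fetch_async(key: str) -> str:
    if key not in _results:
        _results[key] = await fetch(key)   # store the result, not the coroutine
    return _results[key]

async def demo_cached() -> tuple:
    return await cached_fetch_async("a"), await cached_fetch_async("a")

broken_message = asyncio.run(demo_broken())
cached_values = asyncio.run(demo_cached())
print(broken_message)
print(cached_values, fetch_calls)
```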

If your project needs it, I can provide a fully packaged, async-compatible lru_cache solution.

🟩 9. LRU eviction (brief)

When the cache is full
The least recently used entry is evicted
Every access (hit) marks the entry as most recently used
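
The eviction order can be observed with a very small cache. In this sketch maxsize=2 is chosen only to force evictions quickly, and the computed list records which arguments actually ran the function body.

```python
from functools import lru_cache

computed = []  # arguments for which the body actually ran

@lru_cache(maxsize=2)
def square(n: int) -> int:
    computed.append(n)
    return n * n

square(1)  # miss: cache holds 1
square(2)  # miss: cache holds 1, 2
square(1)  # hit: 1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(1)  # hit: 1 survived because it was used recently
square(2)  # miss: 2 was evicted earlier and is computed again

print(computed)             # [1, 2, 3, 2]
print(square.cache_info())  # hits=2, misses=4, maxsize=2, currsize=2
```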

🟦 10. Rules of thumb for choosing maxsize

Scenario	Recommended
Small caches (config, regex, etc.)	maxsize=None or 256
External API caching	1024 or 4096
CPU-heavy computation	128 or 256
Strict memory limits	64 or 128

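🟩 11. Advanced: lru_cache + argument conversion (practical technique)

When an argument is not hashable (such as a list or dict), convert it to a hashable key before it reaches the cache: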

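import functools
from functools import lru_cache

def normalize_args(func):
    @functools.wraps(func)
    def wrapper(data):
        # turn the dict into a sorted tuple of (key, value) pairs; values must be hashable
        key = tuple(sorted(data.items()))
        return func(key)
    return wrapper

@normalize_args   # outermost: converts the dict before the cache sees it
@lru_cache()      # innermost: caches on the hashable tuple
def process(data_key):
    ...

This makes calls that take a dict cacheable. The decorator order matters: with lru_cache outermost, the dict itself would reach the cache and raise TypeError.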
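
A variant of the same idea that hands the wrapped function a real dict instead of a tuple is sketched below. The names cache_by_dict and total_weight are made up for this example, and it assumes the dict's keys are sortable and its values hashable.

```python
import functools

def cache_by_dict(func):
    """Cache a one-argument function whose argument is a flat dict."""

    @functools.lru_cache(maxsize=128)
    def cached(items: tuple):
        # Rebuild a fresh dict from the hashable key on every cache miss.
        return func(dict(items))

    @functools.wraps(func)
    def wrapper(data: dict):
        # Sorting makes the key independent of insertion order.
        return cached(tuple(sorted(data.items())))

    # Expose the usual cache helpers on the wrapper.
    wrapper.cache_info = cached.cache_info
    wrapper.cache_clear = cached.cache_clear
    return wrapper

@cache_by_dict
def total_weight(weights: dict) -> int:
    return sum(weights.values())

a = total_weight({"x": 1, "y": 2})
b = total_weight({"y": 2, "x": 1})  # same content, different order: cache hit
print(a, b)                       # 3 3
print(total_weight.cache_info())  # hits=1, misses=1
```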

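🟦 12. Best practices summary (worth bookmarking)

✅ Use it for:

Configuration
Conversion functions
Regex compilation
Frequently repeated computation
Network requests
Queries on small database tables
Stock data processing

✅ Avoid it when:

Arguments are lists or dicts
Return values are large (memory cost)
Arguments vary so widely that the hit rate is low

✅ Engineering advice:

Monitor cache_info()
Call cache_clear() periodically (depending on business needs)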
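
One way to act on the monitoring advice is to turn cache_info() into a hit rate. In the sketch below, hit_rate and normalize are hypothetical helper names; where the number gets reported (logs, metrics) is left open.

```python
from functools import lru_cache

def hit_rate(cached_func) -> float:
    """Return hits / (hits + misses) for an lru_cache-decorated function."""
    info = cached_func.cache_info()
    total = info.hits + info.misses
    return info.hits / total if total else 0.0

@lru_cache(maxsize=128)
def normalize(code: str) -> str:
    return code.strip().upper()

for code in ["sh600000", "sh600000", "sz000001", "sh600000"]:
    normalize(code)

rate = hit_rate(normalize)
print(rate)  # 0.5: two hits out of four calls

normalize.cache_clear()     # drops all entries and resets the counters
print(hit_rate(normalize))  # 0.0
```

A persistently low hit rate suggests the arguments rarely repeat, in which case the cache costs memory without saving much work.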

🟩 13. Finally, a practical template (general purpose)

from functools import lru_cache

@lru_cache(maxsize=1024)
def get_stock_name(code: str) -> str:
    """
    Cache stock names to avoid repeated database/API lookups.
    """
    print(f"Looking up stock name: {code}")
    return db.query_stock_name(code)

Usage:

print(get_stock_name("600000"))
print(get_stock_name("600000"))   # served from the cache
print(get_stock_name.cache_info())