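Prerequisites for AI Large-Model Application Development: A Complete Python Type Annotation Tutorial

Foreword

In FastAPI, LangChain, LlamaIndex, RAG, and LLM server-side development, type annotations (type hints) are in practice no longer optional syntax. They are:

the foundation of code readability
a convention for team collaboration
the basis for static checking (mypy)
the core of automatic API documentation generation
a way to catch many type-related errors before the code runs

This article covers Python type annotations systematically, from basics to advanced features and from syntax to engineering practice.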

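1. Type Annotation Basics

1.1 What are type annotations?

Type annotations are static type markers. The function annotation syntax dates back to Python 3.0 (PEP 3107), the typing module and type-hint semantics arrived in Python 3.5 (PEP 484), and variable annotations followed in Python 3.6 (PEP 526). They are used to:

mark the type of a variable
mark the types of function parameters
mark the type of a function's return value

Characteristics:

The interpreter does not enforce them, so they do not change how ordinary code runs
They serve as hints, input for checkers, and documentation
They are read by IDEs, mypy, and frameworks (some of which, such as FastAPI and Pydantic, read them at runtime to validate data)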
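
To make the "hints only" point concrete, here is a minimal sketch showing that the interpreter stores annotations but does not check them. The function name `repeat` is made up for illustration.

```python
def repeat(text: str, times: int) -> str:
    """Repeat `text` `times` times."""
    return text * times

# Annotations are stored as metadata on the function object.
print(repeat.__annotations__)

# The interpreter does not check them: passing a list where `str` is
# annotated still runs, because list * int is a valid operation.
# A static checker such as mypy would flag this call.
result = repeat([1, 2], 2)
print(result)
```

The mismatch only surfaces when a checker or a framework reads the annotations, which is why the rest of this article pairs them with mypy.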

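1.2 Why does LLM development need them?

Functions take many parameters (prompt, temperature, stream, tools, ...)
Data structures are complex (Document, Embedding, Message, ...)
Deeply nested, logic-heavy code is hard to maintain without types
FastAPI builds request validation and API docs from type annotations, and LangChain relies heavily on Pydantic models that are declared with them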

2. Variable Annotations (Complete)

2.1 Basic types

# Built-in scalar types
name: str = "LLM developer"
age: int = 25
score: float = 98.5
is_active: bool = True
nothing: None = None

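2.2 Declaring without assigning

An annotation without a value records the type but does not create the variable; it still has to be assigned before it is read.

prompt: str
temperature: float
max_tokens: int
stream: bool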
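
A bare annotation is easiest to observe inside a class body. The sketch below uses a hypothetical `RequestParams` class to show that the type is recorded while no attribute is created.

```python
class RequestParams:
    # Annotation-only declarations: types are recorded, no values are set.
    prompt: str
    temperature: float

print(RequestParams.__annotations__)

# No class attribute exists until something is assigned.
print(hasattr(RequestParams, "prompt"))

params = RequestParams()
params.prompt = "hello"  # now the attribute exists on this instance
print(params.prompt)
```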

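2.3 Constants with Final

Final tells the type checker that a name must not be reassigned; the interpreter itself does not enforce it.

from typing import Final

API_KEY: Final[str] = "sk-xxxxxx"
MODEL_NAME: Final[str] = "gpt-3.5-turbo"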

3. Container Annotations (list / dict / tuple / set)

Since Python 3.9 the built-in containers can be subscripted directly, so there is no need to import List/Dict from typing.

3.1 list

# List of integers
ids: list[int] = [1001, 1002, 1003]

# List of strings
prompts: list[str] = ["Introduce AI", "Write Python code"]

# Float vector (the usual shape of an LLM embedding)
embedding: list[float] = [0.12, 0.35, 0.66, 0.92]

# Nested list
matrix: list[list[float]] = [[1.0, 2.0], [3.0, 4.0]]

3.2 dict

Format: dict[key type, value type]

# Simple dictionary
config: dict[str, str] = {"model": "gpt-3.5", "version": "v1"}

# Mixed value types (common); the X | Y syntax needs Python 3.10+
model_kwargs: dict[str, int | float | bool] = {
    "temperature": 0.7,
    "max_tokens": 1024,
    "stream": True
}

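3.3 tuple

A tuple annotation normally describes a fixed structure, with one type per position. The tuple[T, ...] form covers any length of a single type.

# Fixed structure: int + str
user: tuple[int, str] = (1001, "AI user")

# Any length, one element type
numbers: tuple[int, ...] = (1, 2, 3, 4, 5)

# Mixed structure
message: tuple[str, str, bool] = ("user", "Hello", True)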

3.4 set

unique_ids: set[int] = {101, 102, 103}
vocab: set[str] = {"AI", "LLM", "RAG"}

4. Function Annotations (the most frequent and most important)

4.1 Parameter annotations

def generate_response(
    prompt: str,
    temperature: float,
    max_tokens: int,
    stream: bool
) -> str:
    return f"Generated result: {prompt}"

4.2 Return annotations: -> type

def get_embedding(text: str) -> list[float]:
    return [0.1, 0.2, 0.3]

4.3 No return value: -> None

def log_info(message: str) -> None:
    print(f"[LOG] {message}")

4.4 Parameters with defaults

def chat(
    prompt: str,
    model: str = "gpt-3.5-turbo",
    temperature: float = 0.7
) -> str:
    ...

4.5 Variadic parameters: *args and **kwargs

def call_llm(*args: str, **kwargs: int | float) -> str:
    ...

5. Union Types with | and Union (several possible types)

5.1 Concise Python 3.10+ syntax (recommended)

# May be int or str
user_id: int | str = 1001

# May be float or None
score: float | None = None

5.2 Older syntax (for earlier Python versions)

from typing import Union

user_id: Union[int, str] = 1001

6. Optional (equivalent to X | None)

Most optional parameters in LLM projects, the ones that default to None, are annotated this way.

from typing import Optional

# Equivalent to system_prompt: str | None
system_prompt: Optional[str] = None

api_key: Optional[str] = None
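
An Optional value must be checked for None before it is used as the underlying type. A minimal sketch, assuming a hypothetical `build_messages` helper that assembles a chat message list:

```python
from typing import Optional

def build_messages(
    prompt: str,
    system_prompt: Optional[str] = None,
) -> list[dict[str, str]]:
    messages: list[dict[str, str]] = []
    # After the None check, the checker narrows system_prompt to str.
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages

print(build_messages("Hi"))
print(build_messages("Hi", system_prompt="Be brief."))
```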

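7. Any (any type)

from typing import Any

# Any accepts a value of any type; annotate once, then reassign freely
data: Any = "a string"
data = 123
data = [1, 2, 3]

⚠️ Engineering guideline: avoid Any where possible. The checker skips every value typed as Any, so type safety is lost for it.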
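
When a value really can be anything, `object` is often a safer annotation than `Any`, because the checker then insists on narrowing before type-specific use. The two function names below are illustrative only.

```python
from typing import Any

def describe_any(value: Any) -> str:
    # With Any the checker accepts every operation on `value`,
    # including ones that could fail at runtime.
    return str(value)

def describe(value: object) -> str:
    # With object, the value must be narrowed before type-specific use.
    if isinstance(value, str):
        return value.upper()
    if isinstance(value, (int, float)):
        return f"{value:.1f}"
    return repr(value)

print(describe("ai"))
print(describe(3))
print(describe([1]))
```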

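8. Iterator and Generator Types (essential for LLM streaming output)

8.1 Iterator

# Since Python 3.9, collections.abc.Iterator is the preferred import;
# typing.Iterator still works as an alias.
from typing import Iterator

def count_to_5() -> Iterator[int]:
    # Yields 1, 2, 3, 4, 5
    for i in range(1, 6):
        yield i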

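8.2 Generator (a common annotation for streaming output)

Format:

Generator[yield type, send type, return type]

from typing import Generator

# A typical way to annotate LLM streaming output
def llm_stream() -> Generator[str, None, None]:
    yield "I"
    yield "am"
    yield "an AI"
    yield "large model"

Streaming responses from ChatGPT, ERNIE (文心), and Tongyi Qianwen (通义千问) can all be modeled this way on the client side: a sequence of text chunks consumed one at a time.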
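
The three type parameters are easier to see in use. The sketch below is a variant of `llm_stream` that takes its chunks as an argument (an assumption made so the example is testable) and uses the third parameter to return a chunk count.

```python
from typing import Generator

def llm_stream(chunks: list[str]) -> Generator[str, None, int]:
    """Yield chunks one by one; return how many were sent."""
    count = 0
    for chunk in chunks:
        yield chunk
        count += 1
    return count  # surfaces as StopIteration.value

# Typical consumer: iterate and join the pieces.
text = "".join(llm_stream(["Hel", "lo", "!"]))
print(text)

# The return value is only visible through StopIteration (or `yield from`).
stream = llm_stream(["a", "b"])
try:
    while True:
        next(stream)
except StopIteration as stop:
    total = stop.value
print(total)
```

When nothing is sent in and nothing is returned, `Iterator[str]` is an equivalent and shorter annotation.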

9. Callable (callback functions)

Used when a function is passed as an argument, for example:

callback functions
event handlers
LLM plugin systems

Format:

Callable[[parameter types...], return type]

from typing import Callable

# Callback: takes a str, returns a bool
CallbackFunc = Callable[[str], bool]

def process_result(
    result: str,
    callback: CallbackFunc
) -> None:
    callback(result)
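
Any function or lambda with a matching signature satisfies the alias. The sketch below is a variant of `process_result` that returns the callback's verdict (a change made here so the result can be inspected); `is_non_empty` is a hypothetical callback.

```python
from typing import Callable

CallbackFunc = Callable[[str], bool]

def process_result(result: str, callback: CallbackFunc) -> bool:
    # Return the callback's verdict so the caller can act on it.
    return callback(result)

def is_non_empty(text: str) -> bool:
    return bool(text.strip())

print(process_result("hello", is_non_empty))
print(process_result("   ", is_non_empty))
# A lambda works too, as long as it takes a str and returns a bool.
print(process_result("hello", lambda s: s.startswith("x")))
```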

10. Type Aliases (essential for larger codebases)

Used to give complex types a short name and keep a project consistent.

# Vector alias
Embedding = list[float]

# Document alias
Document = dict[str, str | Embedding]

# Message list
MessageList = list[dict[str, str]]

# Usage
emb: Embedding = [0.1, 0.2, 0.3]
doc: Document = {"content": "Hello", "embedding": emb}
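
Aliases pay off most in function signatures. As an illustration, here is a cosine similarity helper typed with the `Embedding` alias; the function is a sketch for this article and assumes equal-length, non-zero vectors.

```python
import math

Embedding = list[float]

def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query: Embedding = [1.0, 0.0]
print(cosine_similarity(query, [1.0, 0.0]))  # same direction
print(cosine_similarity(query, [0.0, 1.0]))  # orthogonal
```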

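11. Literal (handy for LLM configuration)

Restricts a value to a fixed set of literals. A type checker reports any other value; nothing is enforced at runtime.

from typing import Literal

# The model name must be one of these four
ModelType = Literal["gpt-3.5", "gpt-4", "qwen", "ernie"]

# Device type
Device = Literal["cpu", "cuda", "mps"]

def run_llm(
    model: ModelType,
    device: Device = "cpu"
) -> None:
    ...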
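
Because Literal is only checked statically, values that arrive at runtime (from a config file or an HTTP request, say) still need validation. One possible approach, sketched with a hypothetical `parse_device` helper, reuses the allowed values via `typing.get_args`:

```python
from typing import Literal, cast, get_args

Device = Literal["cpu", "cuda", "mps"]

def parse_device(raw: str) -> Device:
    # get_args(Device) returns the allowed values as a tuple.
    allowed = get_args(Device)
    if raw not in allowed:
        raise ValueError(f"unsupported device {raw!r}; expected one of {allowed}")
    # The value was validated above, so the cast is safe.
    return cast(Device, raw)

print(parse_device("cuda"))
```

This keeps the list of valid devices in one place, so the runtime check and the static type cannot drift apart.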

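12. TypedDict (structured dictionaries, central to AI projects)

Used to annotate dictionaries with a fixed structure, such as:

LLM request/response bodies
RAG document structures
chat message formats

from typing import NotRequired, TypedDict  # NotRequired: Python 3.11+

class ChatMessage(TypedDict):
    role: str          # user / assistant / system
    content: str
    name: NotRequired[str]  # this key may be omitted

# The structure must match: a type checker reports missing required keys and wrong value types
msg: ChatMessage = {
    "role": "user",
    "content": "Hello"
}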

Optional fields in TypedDict

With total=False, every key becomes optional.

class LLMConfig(TypedDict, total=False):
    model: str
    temperature: float
    max_tokens: int
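
Since any key of a `total=False` TypedDict may be absent, reading it with `.get()` and a default is the safe pattern. A minimal sketch; `effective_temperature` is a hypothetical helper and the 0.7 fallback is just the default used elsewhere in this article.

```python
from typing import TypedDict

class LLMConfig(TypedDict, total=False):
    model: str
    temperature: float
    max_tokens: int

def effective_temperature(config: LLMConfig) -> float:
    # Keys may be absent, so read them with .get() and a default.
    return config.get("temperature", 0.7)

partial: LLMConfig = {"model": "gpt-4"}
print(effective_temperature(partial))
print(effective_temperature({"temperature": 0.1}))

# At runtime a TypedDict value is a plain dict.
print(type(partial) is dict)
```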

13. Annotating Classes and Objects

13.1 Instance attribute annotations

class LLMModel:
    # Declared in the class body, assigned per instance in __init__
    model_name: str
    temperature: float

    def __init__(self, model_name: str, temperature: float = 0.7) -> None:
        self.model_name = model_name
        self.temperature = temperature

13.2 Self (method chaining)

Python 3.11+

from typing import Self

class LLMBuilder:
    def set_prompt(self, prompt: str) -> Self:
        self.prompt = prompt
        return self

    def set_temperature(self, t: float) -> Self:
        self.temp = t
        return self

13.3 The class itself: type[...]

class BaseLLM:
    pass

def create_llm(llm_class: type[BaseLLM]) -> BaseLLM:
    return llm_class()
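
`type[BaseLLM]` accepts `BaseLLM` and any of its subclasses, passed as class objects rather than instances. A short sketch with a hypothetical `EchoLLM` subclass:

```python
class BaseLLM:
    def name(self) -> str:
        return "base"

class EchoLLM(BaseLLM):
    def name(self) -> str:
        return "echo"

def create_llm(llm_class: type[BaseLLM]) -> BaseLLM:
    # llm_class is the class object itself, not an instance.
    return llm_class()

llm = create_llm(EchoLLM)
print(llm.name())
```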

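14. Annotating Async Functions

For an async function, annotate the value it finally returns (what the caller gets from await), not the coroutine object.

import asyncio

async def async_llm_call(prompt: str) -> str:
    await asyncio.sleep(1)
    return f"Async reply: {prompt}"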
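
Two related cases come up often in LLM services: running several calls concurrently, and streaming from an async generator. The sketch below shows one way to annotate both; the sleeps stand in for network calls and `async_llm_stream` is a hypothetical name.

```python
import asyncio
from collections.abc import AsyncIterator

async def async_llm_call(prompt: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a network call
    return f"reply: {prompt}"

async def async_llm_stream(chunks: list[str]) -> AsyncIterator[str]:
    # An async generator is annotated with the type it yields.
    for chunk in chunks:
        await asyncio.sleep(0)
        yield chunk

async def main() -> tuple[list[str], str]:
    # gather() runs both calls concurrently and keeps the argument order.
    replies = await asyncio.gather(async_llm_call("a"), async_llm_call("b"))
    streamed = "".join([chunk async for chunk in async_llm_stream(["x", "y"])])
    return list(replies), streamed

replies, streamed = asyncio.run(main())
print(replies, streamed)
```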

15. NoReturn (never returns)

Used for functions that never return normally: they always raise an exception or never terminate.

from typing import NoReturn

def raise_error() -> NoReturn:
    raise RuntimeError("LLM call failed")
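
NoReturn also helps the checker reason about control flow in the caller. In the sketch below (function names are illustrative), the checker knows `fail()` never comes back, so `value` is always bound when the return statement is reached.

```python
from typing import NoReturn

def fail(message: str) -> NoReturn:
    raise RuntimeError(message)

def parse_temperature(raw: str) -> float:
    try:
        value = float(raw)
    except ValueError:
        fail(f"invalid temperature: {raw!r}")
    # fail() never returns, so this line is only reached when float() succeeded.
    return value

print(parse_temperature("0.5"))
```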

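16. Protocol (interface types, advanced)

Defines an interface specification, similar to interfaces in Java or TypeScript. Matching is structural, as in TypeScript: no inheritance is required.

from typing import Protocol

class EmbeddingModel(Protocol):
    def encode(self, text: str) -> list[float]:
        ...

Any class with a compatible encode method is treated as an EmbeddingModel by the type checker.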
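
To illustrate the structural matching, here is a sketch in which a class satisfies `EmbeddingModel` without inheriting from it. `LengthEmbedder` and `embed_all` are hypothetical, and the "embedding" is a toy computation.

```python
from typing import Protocol

class EmbeddingModel(Protocol):
    def encode(self, text: str) -> list[float]:
        ...

class LengthEmbedder:
    # No inheritance from EmbeddingModel: the matching method is enough.
    def encode(self, text: str) -> list[float]:
        return [float(len(text)), float(text.count(" "))]

def embed_all(model: EmbeddingModel, texts: list[str]) -> list[list[float]]:
    return [model.encode(t) for t in texts]

vectors = embed_all(LengthEmbedder(), ["hi", "a b"])
print(vectors)
```

Swapping in a real model then only requires that it expose the same `encode` signature.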

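17. A Combined Practical Example from an LLM Project

from typing import (
    Generator,
    Literal,
    NotRequired,  # Python 3.11+
    TypedDict,
)

ModelName = Literal["gpt-3.5", "gpt-4", "qwen"]

class ChatMessage(TypedDict):
    role: str
    content: str
    name: NotRequired[str]  # optional key

def _stream_reply() -> Generator[str, None, None]:
    # Streaming branch: yields the reply chunk by chunk
    yield "reply"

def chat_completion(
    messages: list[ChatMessage],
    model: ModelName = "gpt-3.5",
    temperature: float = 0.7,
    stream: bool = False
) -> Generator[str, None, None] | str:
    # A body containing `yield` always makes a generator function, so the
    # generator lives in a helper and this function only returns it.
    if stream:
        return _stream_reply()
    return "full reply"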

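18. Static Checking with mypy (standard in production teams)

Install:

pip install mypy

Run the check:

mypy your_code.py --strict

What it does:

finds type errors before the code runs
helps keep large projects consistent
is a common part of production AI engineering setups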

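19. Summary of Type Annotations

Basics

str int float bool None
Variables: name: str
Functions: def fn(a: int) -> str

Containers

list[T]
dict[K, V]
tuple[T1, T2] (fixed structure) / tuple[T, ...] (any length)
set[T]

Composite types

Optional: T | None / Optional[T]
Union: T1 | T2
Any: Any

Advanced (essential for AI work)

Generator: Generator[Yield, Send, Return]
Callback: Callable[[P], R]
Structure: TypedDict
Fixed values: Literal
Interface: Protocol
Async: annotate the returned value directly

Engineering guidelines

Avoid Any
Annotate every function
Use TypedDict for complex structures
Use Literal for configuration options
Check with mypy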