From Input to Decision: The Role and Application of Intent Recognition in AI Architecture

Chapter 3: Intent Classification

Where this chapter fits

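Raw user input
    │
    ▼
① Input preprocessing      ← Chapter 1
    │
    ▼
② Safety guardrails        ← Chapter 2
    │
    ▼
③ Intent classification (this chapter)   ← decide what the user wants + attach a confidence score
    │
    ▼
④~⑥ Entity extraction → routing   ← later chapters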

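Intent classification is the central decision point of the whole pipeline. Its result determines which business path a request takes, so a misclassification carries through every downstream step.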

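1. Designing the intent taxonomy

1.1 Why design the taxonomy first

Before writing any code, sit down with the business team and define which intents users actually have. The taxonomy determines:

Which labels the classifier must output
How many downstream business pipelines are needed
How the few-shot examples are written

Common mistake: the engineering team picks five intents on its own and starts coding, then discovers after launch that real business scenarios are far more varied and ends up patching the taxonomy repeatedly.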

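1.2 Multi-level classification tree (the usual enterprise approach)

In enterprise projects, intents are rarely a flat handful of labels; they form a multi-level tree. L1 decides which business module a request is routed to, and L2 decides which pipeline inside that module handles it: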

L1（一级意图 / 领域）
├── 售前咨询
│   ├── L2: 产品功能咨询
│   ├── L2: 价格咨询
│   ├── L2: 库存查询
│   └── L2: 促销活动咨询
├── 售后服务
│   ├── L2: 退款
│   ├── L2: 换货
│   ├── L2: 维修
│   ├── L2: 物流查询
│   └── L2: 发票问题
├── 投诉建议
│   ├── L2: 产品质量投诉
│   ├── L2: 服务态度投诉
│   └── L2: 建议反馈
├── 账号问题
│   ├── L2: 登录异常
│   ├── L2: 密码重置
│   └── L2: 账号注销
└── 闲聊 / 其他
    ├── L2: 问候
    ├── L2: 感谢
    └── L2: 无法识别

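1.3 Practical principles for taxonomy design

Principle	Description	Counter-example
Mutually exclusive	Intents at the same level should not overlap	"退款" (refund) and "订单问题" (order issues) overlap as siblings; "退款" should sit under "订单问题"
Complete	There must be a fallback intent ("无法识别" / "其他")	Without a fallback, unknown inputs are forced into an unrelated intent or fail outright
Right granularity	Keep L1 to 5~8 intents and each L2 group to 3~6	20 L1 intents → classification accuracy drops sharply
Business-driven	Intent names match business terminology, not technical jargon	Using "intent_type_3" instead of "退款"
Evolvable	Leave room to grow: adding an L2 for a new business line should not require restructuring L1	Flattening all intents into one level, so every new intent changes the global classification logic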
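
Several of these principles can be checked automatically before a taxonomy change ships. The following is a minimal sketch, assuming the schema is a dict with an `intents` list of `{"l1": ..., "l2": [{"name": ...}]}` groups; `lint_intent_schema`, `FALLBACK_L2`, and the default limits are illustrative names and values, not part of the original design. Exact-duplicate labels are only a rough proxy for overlap, so semantic overlap still needs human review.

```python
from collections import Counter

FALLBACK_L2 = "无法识别"  # assumed name of the catch-all intent


def lint_intent_schema(schema: dict, max_l1: int = 8, max_l2: int = 6) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    groups = schema.get("intents", [])

    # Granularity: too many L1 domains makes classification harder.
    if len(groups) > max_l1:
        problems.append(f"{len(groups)} L1 intents (limit {max_l1})")

    # Mutual exclusivity (approximation): no label may appear twice at its level.
    l1_counts = Counter(g["l1"] for g in groups)
    l2_counts = Counter(item["name"] for g in groups for item in g.get("l2", []))
    for level, counts in (("L1", l1_counts), ("L2", l2_counts)):
        for name, n in counts.items():
            if n > 1:
                problems.append(f"{level} label '{name}' appears {n} times")

    # Granularity inside each L1 group.
    for g in groups:
        n_l2 = len(g.get("l2", []))
        if n_l2 > max_l2:
            problems.append(f"L1 '{g['l1']}' has {n_l2} L2 intents (limit {max_l2})")

    # Completeness: a fallback intent must exist somewhere in the tree.
    if l2_counts[FALLBACK_L2] == 0:
        problems.append(f"no fallback intent '{FALLBACK_L2}'")
    return problems
```

Running a check like this in CI on every schema change is one way to catch a missing fallback or a duplicated label before it reaches the classifier.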

1.4 Manage the taxonomy in a config file

The taxonomy should be configurable rather than hard-coded:

# intent_schema.yaml
intents:
  - l1: "售前咨询"
    l2:
      - name: "产品功能咨询"
        examples: ["这个产品有什么功能", "支持蓝牙吗", "防水吗"]
        required_slots: []
      - name: "价格咨询"
        examples: ["多少钱", "有优惠吗", "最低什么价"]
        required_slots: ["product"]
      - name: "库存查询"
        examples: ["还有货吗", "什么时候补货", "有现货吗"]
        required_slots: ["product"]

  - l1: "售后服务"
    l2:
      - name: "退款"
        examples: ["我要退款", "申请退货退款", "钱什么时候退"]
        required_slots: ["order_id"]
        risk_level: "high"
      - name: "换货"
        examples: ["换一个新的", "我要换货", "能换个颜色吗"]
        required_slots: ["order_id", "product"]
      - name: "物流查询"
        examples: ["快递到哪了", "什么时候发货", "物流单号多少"]
        required_slots: ["order_id"]

  - l1: "投诉建议"
    l2:
      - name: "产品质量投诉"
        examples: ["质量太差了", "用了两天就坏了", "跟描述不符"]
        required_slots: ["product"]
        risk_level: "high"
      - name: "服务态度投诉"
        examples: ["客服态度很差", "没人理我", "被挂电话了"]
        required_slots: []
        risk_level: "high"

  - l1: "账号问题"
    l2:
      - name: "登录异常"
        examples: ["登不上去", "登录失败", "一直提示密码错误"]
        required_slots: []
      - name: "密码重置"
        examples: ["忘记密码了", "怎么改密码", "重置密码"]
        required_slots: []

  - l1: "闲聊"
    l2:
      - name: "问候"
        examples: ["你好", "在吗", "hi"]
        required_slots: []
      - name: "无法识别"
        examples: []
        required_slots: []

import yaml

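def load_intent_schema(path: str = "intent_schema.yaml") -> dict:
    """Load the intent taxonomy config and build lookup structures."""
    with open(path, 'r', encoding='utf-8') as f:
        schema = yaml.safe_load(f)

    # Build fast lookup structures: L1 → L2 → metadata, plus a flat example list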
    intent_map = {}
    all_examples = []
    for l1_group in schema["intents"]:
        l1 = l1_group["l1"]
        intent_map[l1] = {}
        for l2_item in l1_group["l2"]:
            l2 = l2_item["name"]
            intent_map[l1][l2] = {
                "examples": l2_item.get("examples", []),
                "required_slots": l2_item.get("required_slots", []),
                "risk_level": l2_item.get("risk_level", "normal"),
            }
            for ex in l2_item.get("examples", []):
                all_examples.append({"text": ex, "l1": l1, "l2": l2})

    return {
        "intent_map": intent_map,
        "all_examples": all_examples,
        "raw_schema": schema,
    }

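2. Three technical approaches

2.1 Comparison at a glance

Approach	How it works	Latency	Cost	Accuracy	Cold start	Best suited to
LLM structured output	Prompt + JSON Mode	200~800ms	High	85~95%	Can go live with no training	Early-stage projects, unstable taxonomy
Embedding + classifier	Text embedding → classical ML classifier	10~50ms	Very low	80~92%	Needs labeled data	Stable intents, high concurrency
Fine-tuned small model	BERT / small LLM fine-tune	10~30ms	Very low	90~97%	Needs a large labeled dataset	Mature products chasing the best accuracy

A typical progression in enterprise projects: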

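Phase 1 (0~3 months): LLM structured output
    → Ship quickly and validate whether the taxonomy holds up
    → Collect real traffic and build up labeled samples

Phase 2 (3~6 months): Embedding + classifier
    → With a few thousand labeled samples, train a classifier
    → Keep the LLM as fallback: low-confidence classifier results go back to the LLM

Phase 3 (6+ months): Fine-tuned small model
    → With more than ten thousand labeled samples, fine-tune BERT or a small LLM
    → Lowest latency and cost, highest accuracy of the three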

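2.2 Approach 1: LLM structured output (detailed implementation)

This is where most enterprise projects start. The core idea is to have the LLM return a fixed JSON structure.

Prompt design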

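def build_classification_prompt(intent_schema: dict) -> str:
    """
    Build the classification prompt dynamically from the intent config.
    When the taxonomy changes, only the config file needs editing, not the code.
    """
    # Collect the intent list and examples from the config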
    intent_descriptions = []
    few_shot_examples = []

    for l1_group in intent_schema["raw_schema"]["intents"]:
        l1 = l1_group["l1"]
        l2_names = [item["name"] for item in l1_group["l2"]]
        intent_descriptions.append(f"- {l1}：{' / '.join(l2_names)}")

        # Use the first 2 examples of each L2 as few-shot samples
        for l2_item in l1_group["l2"]:
            for ex in l2_item.get("examples", [])[:2]:
                few_shot_examples.append(
                    f'  输入："{ex}" → {{"l1": "{l1}", "l2": "{l2_item["name"]}"}}'
                )

    prompt = f"""你是一个意图分类系统。根据用户输入，判断其意图并返回结构化结果。

## 意图分类体系
{chr(10).join(intent_descriptions)}

## 示例
{chr(10).join(few_shot_examples)}

## 输出格式（严格 JSON）
{{
  "l1": "一级意图",
  "l2": "二级意图",
  "confidence": 0.0 到 1.0 之间的数值,
  "reasoning": "一句话判断依据"
}}

## 规则
1. confidence 必须诚实反映你的确定程度
2. 如果用户输入模糊、信息不足，confidence 应该低（< 0.5）
3. 如果无法归入任何业务意图，l1 为"闲聊"，l2 为"无法识别"
4. 只输出 JSON，不要输出其他内容"""

    return prompt


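import json

CLASSIFICATION_PROMPT = None  # built once on first call; assumes one schema per process

def classify_with_llm(
    text: str,
    llm,
    intent_schema: dict,
) -> dict:
    """
    Intent classification via LLM structured output.
    Malformed output degrades to "闲聊/无法识别" with zero confidence.
    """
    global CLASSIFICATION_PROMPT
    if CLASSIFICATION_PROMPT is None:
        CLASSIFICATION_PROMPT = build_classification_prompt(intent_schema)

    response = llm.invoke([
        {"role": "system", "content": CLASSIFICATION_PROMPT},
        {"role": "user", "content": f"用户输入：{text}"},
    ])

    try:
        result = json.loads(response.content)
        # Clamp to [0, 1]; a non-numeric value raises and lands in the except branch
        confidence = max(0.0, min(1.0, float(result.get("confidence", 0.5))))
        # Check that the returned labels exist in the configured taxonomy
        l1 = result.get("l1", "闲聊")
        l2 = result.get("l2", "无法识别")
        if l1 not in intent_schema["intent_map"]:
            l1, l2, confidence = "闲聊", "无法识别", 0.0
        elif l2 not in intent_schema["intent_map"][l1]:
            # Unknown L2: fall back to the first L2 under this L1 and discount
            # the confidence, because the fallback label is only a guess
            l2 = next(iter(intent_schema["intent_map"][l1]), "无法识别")
            confidence *= 0.7

        return {
            "l1": l1,
            "l2": l2,
            "confidence": confidence,
            "reasoning": result.get("reasoning", ""),
            "method": "llm",
        }
    except (json.JSONDecodeError, KeyError, ValueError, TypeError, AttributeError):
        # Invalid JSON, JSON that is not an object, or a non-numeric confidence
        return {
            "l1": "闲聊",
            "l2": "无法识别",
            "confidence": 0.0,
            "reasoning": "LLM 输出解析失败",
            "method": "llm_fallback",
        }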
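
Even with "output JSON only" in the prompt, models sometimes wrap the object in a Markdown code fence or add a sentence around it, and a plain `json.loads` then fails. The helper below is a minimal sketch of a more tolerant parser using only the standard library; `extract_json_object` is a hypothetical name, and whether to tolerate such output or treat it as a failure is a design choice the original text does not settle.

```python
import json


def extract_json_object(raw: str) -> dict:
    """Return the first JSON object embedded in raw LLM output.

    Raises ValueError when no object can be decoded.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            obj, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = raw.find("{", start + 1)
    raise ValueError("no JSON object found in LLM output")
```

Because it raises `ValueError` on failure, it can stand in for `json.loads` inside a `try` block that already catches `ValueError`.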

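Key practices for the LLM approach

Point	Description
Few-shot beats long descriptions	2~3 real samples per intent work better than a paragraph describing the intent
Validate the output	The LLM may return labels that are not in the config; always validate and fall back
Use JSON Mode	With OpenAI, pass `response_format={"type": "json_object"}`; with Claude, use Tool Use to force JSON output
Confidence is not fully reliable	LLM self-reported confidence tends to run high; calibrate thresholds against real data
Log every classification	Record input, output, and confidence; these logs later become classifier training data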
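
Calibrating a threshold against real data can be as simple as measuring, on human-reviewed logs, how accurate the predictions above each candidate threshold actually are. The following is a sketch under that assumption; `accuracy_by_threshold`, `pick_threshold`, the candidate thresholds, and the 95% target are all illustrative.

```python
def accuracy_by_threshold(records, thresholds=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """records: iterable of (confidence, is_correct) pairs from reviewed logs.

    Returns {threshold: (coverage, accuracy)}, computed over the predictions
    whose confidence is >= threshold. Accuracy is None when nothing qualifies.
    """
    records = list(records)
    report = {}
    for t in thresholds:
        kept = [ok for conf, ok in records if conf >= t]
        coverage = len(kept) / len(records) if records else 0.0
        accuracy = sum(kept) / len(kept) if kept else None
        report[t] = (coverage, accuracy)
    return report


def pick_threshold(report, target_accuracy=0.95):
    """Lowest threshold whose measured accuracy meets the target, else None."""
    for t in sorted(report):
        _, accuracy = report[t]
        if accuracy is not None and accuracy >= target_accuracy:
            return t
    return None
```

Coverage matters as much as accuracy here: a threshold that is accurate but keeps only a small share of traffic sends most requests to the fallback path.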

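2.3 Approach 2: Embedding + classifier (detailed implementation)

Once a few thousand labeled samples have accumulated, a lightweight classifier can replace the LLM on most requests and cut latency and cost substantially.

Training pipeline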

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import cross_val_score
import pickle

# --- Step 1: prepare labeled data ---
# Format: [{"text": "user input", "label": "L1/L2 combined label"}, ...]
# Source: logs from the earlier LLM classifier + human correction
training_data = [
    {"text": "这个手机多少钱", "label": "售前咨询/价格咨询"},
    {"text": "有什么优惠活动", "label": "售前咨询/促销活动咨询"},
    {"text": "我要退款", "label": "售后服务/退款"},
    {"text": "快递到哪了", "label": "售后服务/物流查询"},
    {"text": "你们服务太差了", "label": "投诉建议/服务态度投诉"},
    # ... in practice each intent needs 50~200 samples
]

# --- Step 2: vectorize the text ---
# Turn text into vectors with an embedding model
# Options: OpenAI text-embedding-3-small / BGE / M3E (open-source Chinese models)

from langchain_openai import OpenAIEmbeddings
# Or use a local model (its API is .encode(texts), so the
# embed_documents / embed_query calls below would need adapting):
# from sentence_transformers import SentenceTransformer
# embed_model = SentenceTransformer("BAAI/bge-base-zh-v1.5")

embed_model = OpenAIEmbeddings(model="text-embedding-3-small")

texts = [d["text"] for d in training_data]
labels = [d["label"] for d in training_data]

# Fetch all embeddings in one batch
embeddings = embed_model.embed_documents(texts)
X = np.array(embeddings)

# Encode labels as integers
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(labels)

# --- Step 3: train the classifier ---
# SVMs tend to do well on small-sample text classification
classifier = SVC(
    kernel="rbf",
    probability=True,  # needed for confidence output
    C=10,
    gamma="scale",
)

# Cross-validate first (cv=5 needs at least 5 samples per intent),
# then fit the final model on all the data
scores = cross_val_score(classifier, X, y, cv=5, scoring="accuracy")
print(f"Cross-validation accuracy: {scores.mean():.2%} ± {scores.std():.2%}")
classifier.fit(X, y)

# --- Step 4: save the model ---
# pickle can execute code on load: only load files you produced yourself
with open("intent_classifier.pkl", "wb") as f:
    pickle.dump({
        "classifier": classifier,
        "label_encoder": label_encoder,
    }, f)

Inference

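def classify_with_embedding(
    text: str,
    embed_model,
    classifier,
    label_encoder,
    confidence_threshold: float = 0.5,
) -> dict:
    """
    Intent classification with Embedding + classifier.
    Latency is roughly 10~50ms (depends on whether the embedding model is local or an API).
    confidence_threshold is not applied here; thresholds are enforced by the caller.
    """
    # Text → vector
    embedding = embed_model.embed_query(text)
    X = np.array([embedding])

    # Take label and confidence from the same probability vector: with
    # SVC(probability=True), predict() can disagree with argmax(predict_proba())
    probabilities = classifier.predict_proba(X)[0]
    best = int(np.argmax(probabilities))
    confidence = float(probabilities[best])

    # Decode the label (columns of predict_proba follow classifier.classes_)
    label = label_encoder.inverse_transform([classifier.classes_[best]])[0]
    l1, l2 = label.split("/", 1) if "/" in label else (label, "无法识别")

    return {
        "l1": l1,
        "l2": l2,
        "confidence": confidence,
        "reasoning": f"分类器 top-1 概率 {confidence:.2%}",
        "method": "embedding_classifier",
        # Include the top-3 results to make debugging easier
        "top3": [
            {
                "label": label_encoder.inverse_transform([classifier.classes_[i]])[0],
                "probability": float(probabilities[i]),
            }
            for i in np.argsort(probabilities)[-3:][::-1]
        ],
    }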

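2.4 Approach 3: Fine-tuned small model (overview)

Once labeled data exceeds roughly 10,000 samples, fine-tuning a small model (such as BERT-base-chinese) typically gives the best accuracy of the three approaches:

Item	Description
Base model	`bert-base-chinese` (Chinese) / `roberta-base` (English) / or a small LLM (e.g. Qwen-1.8B)
Training framework	Hugging Face Transformers + Trainer API
Training data	200~1000 labeled samples per intent
Training time	About 10~30 minutes on a single GPU (BERT-scale)
Inference latency	5~15ms (GPU) / 20~50ms (CPU)
Deployment	TorchServe / Triton / ONNX Runtime / load directly in-process

Fine-tuning involves a fair amount of code plus model-training infrastructure, so the full code is not shown here. The core steps: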

1. Prepare data → Hugging Face Dataset format
2. Load the pretrained model → AutoModelForSequenceClassification
3. Configure TrainingArguments → learning rate, batch_size, epochs
4. Trainer.train() → train
5. Evaluate → accuracy / F1 / confusion matrix
6. Export → ONNX / TorchScript → deploy
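
For the evaluation step, overall accuracy can hide weak performance on rare intents, which is why per-class F1 and the confusion matrix are worth looking at. The sketch below computes both in plain Python to make the definitions concrete; the function names are illustrative, and in a real project an established metrics library would normally be used instead.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true_label, predicted_label) pairs: a sparse confusion matrix."""
    return Counter(zip(y_true, y_pred))


def per_class_f1(y_true, y_pred):
    """F1 per label, using F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = confusion_counts(y_true, y_pred)
    scores = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare intents count as much as common ones."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

Off-diagonal entries of `confusion_counts` with high counts point at intent pairs whose boundary needs more training samples or a taxonomy fix.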

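3. Production hybrid: classifier + LLM fallback

3.1 Architecture

Production systems rarely rely on a single approach; the classifier handles most traffic and the LLM acts as fallback: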

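User input
    │
    ▼
Embedding + classifier (10~50ms, near-zero marginal cost)
    │
    ├── High confidence (>= 0.8) → use the classifier result directly
    │
    ├── Medium confidence (0.4 ~ 0.8) → ask the LLM for a second opinion
    │                       │
    │                       ├── LLM agrees with the classifier → use it and raise the confidence
    │                       └── LLM disagrees → use the LLM result (treated as more reliable)
    │
    └── Low confidence (< 0.4) → classify with the LLM directly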

3.2 Full implementation

def classify_intent(
    text: str,
    embed_model,
    classifier,
    label_encoder,
    llm,
    intent_schema: dict,
    high_threshold: float = 0.8,
    low_threshold: float = 0.4,
) -> dict:
    """
    Production intent classification: classifier first, LLM as fallback.

    Strategy:
    - Classifier high confidence → use it directly
    - Classifier medium confidence → LLM second opinion
    - Classifier low confidence → LLM decides
    - Classifier unavailable → degrade to LLM only
    """

    # --- Step 1: try the classifier ---
    classifier_result = None
    try:
        classifier_result = classify_with_embedding(
            text, embed_model, classifier, label_encoder,
        )
    except Exception as e:
        # Classifier failed: degrade to LLM only
        print(f"[意图分类] 分类器异常：{e}，降级为 LLM")

    # --- Step 2: pick a strategy by confidence ---

    # Case 1: classifier unavailable → LLM only
    if classifier_result is None:
        return classify_with_llm(text, llm, intent_schema)

    confidence = classifier_result["confidence"]

    # Case 2: high confidence → use the classifier result directly
    if confidence >= high_threshold:
        return classifier_result

    # Case 3: low confidence → use the LLM directly
    if confidence < low_threshold:
        llm_result = classify_with_llm(text, llm, intent_schema)
        llm_result["reasoning"] = f"分类器低置信({confidence:.2%})，使用 LLM 结果。{llm_result['reasoning']}"
        return llm_result

    # Case 4: medium confidence → LLM second opinion
    llm_result = classify_with_llm(text, llm, intent_schema)

    if llm_result["l1"] == classifier_result["l1"] and llm_result["l2"] == classifier_result["l2"]:
        # Both agree → raise the confidence (heuristic bump, capped at 0.95)
        merged_confidence = min(0.95, confidence + 0.15)
        return {
            **classifier_result,
            "confidence": merged_confidence,
            "reasoning": f"分类器({confidence:.2%})+LLM 一致确认",
            "method": "ensemble_agree",
        }
    else:
        # They disagree → trust the LLM (stronger semantic understanding)
        return {
            **llm_result,
            "reasoning": f"分类器({classifier_result['l1']}/{classifier_result['l2']}, {confidence:.2%})与 LLM 不一致，使用 LLM 结果",
            "method": "ensemble_llm_override",
        }

3.3 LangGraph node integration

from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class IntentState(TypedDict):
    processed_input: str     # preprocessed text from Chapter 1
    pii_masked_input: str    # PII-masked text from Chapter 2
    # --- classification result ---
    l1_intent: str
    l2_intent: str
    confidence: float
    reasoning: str
    classification_method: str


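def intent_classification_node(state: IntentState) -> dict:
    """
    LangGraph intent classification node.
    embed_model, classifier, label_encoder, classification_llm and intent_schema
    are expected to be module-level objects loaded once at startup.
    """
    # Prefer the PII-masked input when present, otherwise the preprocessed input
    text = state.get("pii_masked_input") or state["processed_input"]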

    result = classify_intent(
        text=text,
        embed_model=embed_model,
        classifier=classifier,
        label_encoder=label_encoder,
        llm=classification_llm,
        intent_schema=intent_schema,
    )

    return {
        "l1_intent": result["l1"],
        "l2_intent": result["l2"],
        "confidence": result["confidence"],
        "reasoning": result["reasoning"],
        "classification_method": result["method"],
    }

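4. Few-shot example design

4.1 Why few-shot examples matter more than descriptions

In the LLM approach, few-shot examples usually affect accuracy more than the textual description of each intent, because:

LLMs pick up a task largely by pattern matching, and concrete samples are easier to match than abstract descriptions
Boundary cases can only be taught through examples

4.2 Few-shot selection strategies

Strategy	How	Best suited to
Static few-shot	2~3 fixed examples per intent, hard-coded in the prompt	Simple businesses, few intents
Dynamic few-shot	Retrieve the N samples most similar to the user input from an example library	Many intents, fuzzy boundaries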

Dynamic few-shot implementation

import numpy as np

class DynamicFewShotSelector:
    """
    Pick the most relevant few-shot examples for each user input.
    How: compute embedding similarity and take the top-K most similar samples.
    """

    def __init__(self, embed_model, examples: list[dict]):
        """
        examples: [{"text": "输入", "l1": "售前咨询", "l2": "价格咨询"}, ...]
        """
        self.embed_model = embed_model
        self.examples = examples
        # Precompute embeddings for all samples (once, at startup)
        texts = [ex["text"] for ex in examples]
        self.embeddings = np.array(embed_model.embed_documents(texts))

    def select(self, query: str, k: int = 6, max_per_intent: int = 2) -> list[dict]:
        """
        Select the k most relevant examples, at most max_per_intent per intent (for diversity).
        """
        query_embedding = np.array(self.embed_model.embed_query(query))

        # Cosine similarity
        similarities = np.dot(self.embeddings, query_embedding) / (
            np.linalg.norm(self.embeddings, axis=1) * np.linalg.norm(query_embedding)
        )

        # Sort by similarity, highest first
        sorted_indices = np.argsort(similarities)[::-1]

        # Pick examples while capping the count per intent
        selected = []
        intent_counts = {}
        for idx in sorted_indices:
            if len(selected) >= k:
                break
            ex = self.examples[idx]
            intent_key = f"{ex['l1']}/{ex['l2']}"
            if intent_counts.get(intent_key, 0) >= max_per_intent:
                continue
            selected.append({**ex, "similarity": float(similarities[idx])})
            intent_counts[intent_key] = intent_counts.get(intent_key, 0) + 1

        return selected

    def build_few_shot_text(self, query: str, k: int = 6) -> str:
        """Build the few-shot text block, ready to embed in the prompt."""
        examples = self.select(query, k)
        lines = []
        for ex in examples:
            lines.append(f'  输入："{ex["text"]}" → {{"l1": "{ex["l1"]}", "l2": "{ex["l2"]}"}}')
        return "\n".join(lines)

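4.3 Quality requirements for few-shot examples

Requirement	Description
Drawn from real data	Do not invent examples; pick them from real user inputs
Include boundary cases	"退款怎么操作" (售后/退款) vs "退款政策是什么" (售前/产品功能咨询)
Mix lengths	Include both short inputs like "我要退款" and long ones like "我上周买的蓝牙耳机有问题想退款"
Colloquial	Include spoken expressions such as "咋回事" and "搞啥", not only formal written language
Refresh regularly	Rebuild the example library from new data every month to keep it current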
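
A monthly refresh along these lines can be scripted. The following is one possible sketch, assuming the input is a list of human-verified samples ordered newest first; `refresh_examples` and its `per_intent` parameter are hypothetical, and the rule of keeping the shortest and longest sample per intent is just one way to get a length mix.

```python
from collections import defaultdict


def refresh_examples(reviewed, per_intent=3):
    """Rebuild the example library from reviewed samples.

    reviewed: [{"text": ..., "l1": ..., "l2": ...}, ...], assumed newest first.
    Keeps up to per_intent samples per intent, favoring a mix of lengths.
    """
    grouped = defaultdict(list)
    seen = set()
    for sample in reviewed:
        text = sample["text"].strip()
        if text and text not in seen:  # drop empty and duplicate inputs
            seen.add(text)
            grouped[(sample["l1"], sample["l2"])].append({**sample, "text": text})

    library = []
    for samples in grouped.values():
        by_length = sorted(samples, key=lambda s: len(s["text"]))
        # Shortest and longest first, so terse and verbose phrasings both survive.
        picked = [by_length[0]]
        if len(by_length) > 1:
            picked.append(by_length[-1])
        # Fill the remaining slots in input order (newest first).
        for s in samples:
            if len(picked) >= per_intent:
                break
            if s not in picked:
                picked.append(s)
        library.extend(picked[:per_intent])
    return library
```

Boundary cases are not selected automatically here; they still need to be curated by someone who knows where the intents are commonly confused.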

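5. Validating and logging classification results

5.1 Validate the result

Results returned by the LLM must be validated to guard against hallucinated labels: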

def validate_classification(result: dict, intent_schema: dict) -> dict:
    """
    Validate a classification result.
    - Check that l1/l2 exist in the configured taxonomy
    - Clamp confidence to [0, 1]
    - Unknown L1 falls back to "无法识别"; unknown L2 falls back to the
      first L2 under that L1 with a discounted confidence
    """
    l1 = result.get("l1", "")
    l2 = result.get("l2", "")
    # Normalize confidence up front so every return path stays in range
    try:
        confidence = max(0.0, min(1.0, float(result.get("confidence", 0))))
    except (TypeError, ValueError):
        confidence = 0.0

    intent_map = intent_schema["intent_map"]

    # Validate L1
    if l1 not in intent_map:
        return {**result, "l1": "闲聊", "l2": "无法识别", "confidence": 0.0,
                "reasoning": f"L1 '{l1}' 不在意图体系中，回退"}

    # Validate L2
    if l2 not in intent_map[l1]:
        fallback_l2 = next(iter(intent_map[l1]), "无法识别")
        return {**result, "l2": fallback_l2,
                "confidence": confidence * 0.7,
                "reasoning": f"L2 '{l2}' 不在 '{l1}' 下，回退到 '{fallback_l2}'"}

    return {**result, "confidence": confidence}

5.2 Classification logs

Log every classification; these logs are the main data source for later optimization:

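import json
import logging
from datetime import datetime, timezone

classification_logger = logging.getLogger("intent.classification")  # logger name is illustrative

def log_classification(
    input_text: str,
    result: dict,
    latency_ms: float,
):
    """
    Write one classification log entry.
    The logs serve two purposes:
    1. Monitor classification quality (distribution by confidence and by intent)
    2. Build up labeled data (low-confidence cases are labeled and used to train the classifier)
    """
    log_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input": input_text[:200],  # truncate long inputs
        "l1": result["l1"],
        "l2": result["l2"],
        "confidence": result["confidence"],
        "method": result["method"],
        "reasoning": result["reasoning"],
        "latency_ms": latency_ms,
    }
    classification_logger.info(json.dumps(log_entry, ensure_ascii=False))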
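
Both uses of the logs (quality monitoring and sample collection) can start from a simple aggregation. The sketch below assumes each log line is a JSON object with the `input`, `l1`, `l2`, `confidence` and `method` fields; `summarize_logs` and the 0.6 review cutoff are illustrative choices.

```python
import json
from collections import defaultdict


def summarize_logs(lines, review_below=0.6):
    """Aggregate JSON log lines into monitoring stats and a labeling queue."""
    by_intent = defaultdict(list)
    by_method = defaultdict(int)
    review_queue = []
    for line in lines:
        entry = json.loads(line)
        by_intent[f"{entry['l1']}/{entry['l2']}"].append(entry["confidence"])
        by_method[entry["method"]] += 1
        # Low-confidence inputs are the most valuable ones to label by hand.
        if entry["confidence"] < review_below:
            review_queue.append(entry["input"])
    return {
        "mean_confidence": {k: sum(v) / len(v) for k, v in by_intent.items()},
        "method_counts": dict(by_method),
        "review_queue": review_queue,
    }
```

A falling mean confidence for one intent, or a growing share of LLM fallbacks in `method_counts`, is an early sign that the taxonomy or the training data has drifted from real traffic.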

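6. Error handling and degradation

The degradation rule for the classification layer: when classification fails, return "无法识别" with zero confidence so that downstream confidence-based routing naturally falls into the fallback flow.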

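import logging
import time

logger = logging.getLogger(__name__)

def classify_node(state: dict) -> dict:
    """Intent classification LangGraph node with degradation."""
    start = time.monotonic()
    text = state.get("pii_masked_input") or state.get("processed_input", "")

    try:
        # Normal path. Assumption: this is the hybrid classifier from section 3.2,
        # with embed_model, classifier, label_encoder, classification_llm and
        # intent_schema available as module-level objects.
        r = classify_intent(text, embed_model, classifier, label_encoder,
                            classification_llm, intent_schema)
        result = {
            "l1_intent": r["l1"],
            "l2_intent": r["l2"],
            "confidence": r["confidence"],
            "classification_method": r["method"],
            "classification_reasoning": r["reasoning"],
            "top3_intents": r.get("top3", []),
        }
    except Exception as e:
        logger.exception("classify failed: %s", e)
        # Degrade: return "无法识别" + zero confidence → downstream takes the fallback path
        result = {
            "l1_intent": "闲聊",
            "l2_intent": "无法识别",
            "confidence": 0.0,
            "classification_method": "rules_fallback",
            "classification_reasoning": f"分类异常: {e}",
            "top3_intents": [],
            "error_log": [{"layer": "classify", "error": str(e)}],
        }

    elapsed = round((time.monotonic() - start) * 1000, 2)
    result.setdefault("processing_times", {})
    result["processing_times"]["classify"] = elapsed
    return result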

The degradation chain for the hybrid approach: classifier fails → degrade to the LLM; LLM also fails → degrade to "无法识别":

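import logging

logger = logging.getLogger(__name__)

# Classifier + LLM, two layers of protection.
# The function name and the default threshold are illustrative.
def classify_with_fallback(text, embed_model, classifier, label_encoder,
                           llm, intent_schema, threshold=0.4):
    classifier_result = None
    try:
        classifier_result = classify_with_embedding(
            text, embed_model, classifier, label_encoder)
    except Exception:
        logger.warning("分类器异常，降级为 LLM")

    # Classifier succeeded and is confident enough → use its result
    if classifier_result is not None and classifier_result["confidence"] >= threshold:
        return classifier_result

    try:
        return classify_with_llm(text, llm, intent_schema)
    except Exception:
        logger.warning("LLM 也异常，降级为无法识别")
        return {"l1": "闲聊", "l2": "无法识别", "confidence": 0.0,
                "reasoning": "分类器与 LLM 均不可用", "method": "rules_fallback"}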

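7. Chapter summary

Stage	Enterprise practice
Intent taxonomy	Multi-level tree (L1/L2), managed in a config file, defined jointly with the business team
Classification approach	Start with LLM structured output; once data accumulates, train an Embedding + classifier and keep the LLM as fallback
Few-shot	Dynamically retrieve the most similar samples, 2~3 real cases per intent, refreshed regularly from new data
Result validation	Check that L1/L2 exist in the configured taxonomy; fall back to "无法识别" when they do not
Logging	Log every classification in full, for monitoring and for building training data
Evolution path	LLM → Embedding + classifier → fine-tuned small model, cutting cost and improving quality phase by phase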
