RAG Study Notes 5 - Tech Stack

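This week I built a RAG (retrieval-augmented generation) question-answering service with multi-turn conversation support on top of the LangChain framework. The core RagService class ties together the Tongyi chat model, the DashScope embedding model, and a self-built vector store service. Model parameters are managed centrally in a configuration file, and the class initializes each of these base components.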

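I designed a dedicated prompt template: ChatPromptTemplate combined with MessagesPlaceholder reserves a slot for the conversation history, and the system message spells out the answering requirements, telling the model to base its answer primarily on the retrieved reference material. A custom function formats the content and metadata of the retrieved documents so they are presented consistently.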
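
To make the role of the history slot concrete, here is a minimal plain-Python sketch of the message list such a template produces. It deliberately avoids LangChain: `render_messages` and the tuple-based message format are hypothetical stand-ins, not the ChatPromptTemplate API, and the prompt wording is only illustrative.

```python
# Plain-Python stand-in for a chat prompt with a history placeholder.
# render_messages and the (role, content) tuples are illustrative only;
# they are not part of LangChain.

def render_messages(context, history, question):
    """Return the final message list: system prompts, prior turns, new question."""
    messages = [
        ("system", f"Answer using mainly the reference material. Reference material: {context}"),
        ("system", "The conversation history follows."),
    ]
    # The placeholder expands into zero or more earlier messages, in order.
    messages.extend(history)
    messages.append(("user", f"Please answer the user's question: {question}"))
    return messages


history = [
    ("user", "What is RAG?"),
    ("assistant", "Retrieval-augmented generation."),
]
messages = render_messages("Document chunk: ...", history, "Why use it?")
for role, content in messages:
    print(f"{role}: {content}")
```

With an empty history the slot simply contributes nothing, so the first turn of a conversation needs no special handling.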

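The pipeline is composed from LangChain Runnable components: RunnablePassthrough passes the input through unchanged, and RunnableLambda wraps custom data transformations. Together they connect document retrieval, data processing, prompt assembly, the model call, and result parsing into one chain, with StrOutputParser normalizing the output to a plain string.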
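
The data shapes at each stage of such a chain can be traced with ordinary functions. The sketch below is a LangChain-free approximation: `fake_retriever`, `parallel_step`, `flatten_for_prompt`, and the sample document are hypothetical names introduced for illustration, and documents are modeled as plain dicts rather than Document objects.

```python
# Plain-Python trace of the data shapes flowing through the chain.
# fake_retriever and the sample document are hypothetical stand-ins.

def fake_retriever(query):
    """Pretend vector search: returns documents as plain dicts."""
    return [{"page_content": f"Notes about {query}", "metadata": {"source": "demo.txt"}}]


def format_document(docs):
    """Join retrieved documents into one context string."""
    if not docs:
        return "No reference material available"
    return "".join(
        f"Document chunk: {d['page_content']}\nDocument metadata: {d['metadata']}\n"
        for d in docs
    )


def parallel_step(value):
    """Mimics the dict step: one branch passes the input through, one retrieves."""
    return {
        "input": value,  # passthrough branch keeps the whole input dict
        "context": format_document(fake_retriever(value["input"])),
    }


def flatten_for_prompt(value):
    """Lift the nested fields up to the keys the prompt template needs."""
    return {
        "input": value["input"]["input"],
        "context": value["context"],
        "history": value["input"]["history"],
    }


chain_input = {"input": "vector stores", "history": []}
after_parallel = parallel_step(chain_input)
prompt_vars = flatten_for_prompt(after_parallel)
print(sorted(prompt_vars))
```

The point of the trace is that the passthrough branch carries the entire input dict (question plus history), which is why a flattening step is needed before the prompt template can read `input`, `context`, and `history` as top-level keys.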

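Through this exercise I got a working grasp of LangChain's chained invocation and the full RAG workflow. RunnableWithMessageHistory hooks into the function that loads the session history, giving the question-answering flow contextual memory. Next steps are functional testing, improving the document formatting logic, and extending the service to more business scenarios.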

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableWithMessageHistory, RunnableLambda

from file_history_store import get_history
from vectoer_stores import VectorStoreService
from langchain_community.embeddings import DashScopeEmbeddings
import config_data as config
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_models.tongyi import ChatTongyi
from langchain_core.documents import Document

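class RagService:
    """RAG question-answering service with multi-turn conversation memory."""

    def __init__(self):
        # Vector store service backed by the DashScope embedding model
        self.vector_service = VectorStoreService(
            embedding=DashScopeEmbeddings(model=config.embedding_model_name)
        )

        # Prompt order: reference material, then the chat history, then the question
        self.prompt_template = ChatPromptTemplate.from_messages(
            [
                ("system", "Answer the user's question concisely and professionally, relying primarily on the reference material I provide. Reference material: {context}"),
                ("system", "The user's conversation history is as follows:"),
                MessagesPlaceholder("history"),
                ("user", "Please answer the user's question: {input}"),
            ]
        )

        self.chat_model = ChatTongyi(model=config.chat_model_name)

        self.chain = self.__get_chain()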

    def __get_chain(self):
        """Build the retrieval chain and wrap it with message-history support."""
        retriever = self.vector_service.get_retriever()

        def format_document(docs: list[Document]) -> str:
            # Render the retrieved documents as plain text for the prompt
            if not docs:
                return "No reference material available"

            formatted_str = ""
            for doc in docs:
                formatted_str += f"Document chunk: {doc.page_content}\nDocument metadata: {doc.metadata}\n"

            return formatted_str

        def format_for_retriever(value):
            # The chain input is a dict; the retriever only needs the question text
            return value["input"]

        def format_for_prompt_template(value):
            # Flatten the nested dict into the keys the prompt template expects
            return {
                "input": value["input"]["input"],
                "context": value["context"],
                "history": value["input"]["history"],
            }

        chain = (
            {
                "input": RunnablePassthrough(),
                "context": RunnableLambda(format_for_retriever) | retriever | format_document,
            }
            | RunnableLambda(format_for_prompt_template)
            | self.prompt_template
            | self.chat_model
            | StrOutputParser()
        )

        # The session's messages are fetched via get_history and injected under "history"
        conversation_chain = RunnableWithMessageHistory(
            chain,
            get_history,
            input_messages_key="input",
            history_messages_key="history",
        )

        return conversation_chain
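
Calling a chain wrapped in RunnableWithMessageHistory requires a session identifier in the config so the right history can be loaded; by default the key is `session_id` under `configurable`. The sketch below shows that call shape. `ask`, `EchoChain`, and the session ids are hypothetical: `EchoChain` is a stand-in so the example runs without a model or API key, and with the real service you would presumably pass `RagService().chain` instead.

```python
# Call shape for a history-aware chain. EchoChain is a hypothetical stand-in
# that records turns per session, so this runs without a model or API key.

def ask(chain, question, session_id):
    """Invoke the chain with the session id in the runnable config."""
    return chain.invoke(
        {"input": question},
        config={"configurable": {"session_id": session_id}},
    )


class EchoChain:
    """Keeps a separate list of questions for every session id."""

    def __init__(self):
        self.sessions = {}

    def invoke(self, inputs, config):
        session_id = config["configurable"]["session_id"]
        turns = self.sessions.setdefault(session_id, [])
        turns.append(inputs["input"])
        return f"[{session_id}] turn {len(turns)}: {inputs['input']}"


chain = EchoChain()  # with the real service: chain = RagService().chain
print(ask(chain, "What is RAG?", "user_001"))
print(ask(chain, "Why use it?", "user_001"))
print(ask(chain, "What is RAG?", "user_002"))
```

Only the `input` key is supplied by the caller; the history is filled in from the session store, so two users with different session ids never see each other's context.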