Retrieval-Augmented Generation for LargeLanguage Models: A Survey

标题:Retrieval-Augmented Generation for Large Language Models: A Survey

作者:Yunfan Gaoa , Yun Xiongb , Xinyu Gaob , Kangxiang Jiab , Jinliu Panb , Yuxi Bic , Yi Daia , Jiawei Suna , Meng Wangc , and Haofen Wang

  1. By referencing external knowledge, RAG effectively reduces the problem of generating factually incorrect content. Its integration into LLMs has resulted in widespread adoption, establishing RAG as a key technology in advancing chatbots and enhancing the suitability of LLMs for real-world applications

  2. The RAG research paradigm is continuously evolving, and we categorize it into three stages: Naive RAG, Advanced RAG, and Modular RAG

  3. The Naive RAG:

Indexing starts with the cleaning and extraction of raw data

Retrieval. Upon receipt of a user query, the RAG system employs the same encoding model utilized during the indexing phase to transform the query into a vector representation.

Generation. The posed query and selected documents are synthesized into a coherent prompt to which a large language model is tasked with formulating a response.

Advanced RAG introduces specific improvements to overcome the limitations of Naive RAG. Focusing on enhancing retrieval quality, it employs pre-retrieval and post-retrieval strategies.

Pre-retrieval process. In this stage, the primary focus is on optimizing the indexing structure and the original query. The goal of optimizing indexing is to enhance the quality of the content being indexed.

Post-Retrieval Process. Once relevant context is retrieved, it's crucial to integrate it effectively with the query

  1. Innovations such as the Rewrite-Retrieve-Read [7]model leverage the LLM's capabilities to refine retrieval queries through a rewriting module and a LM-feedback mechanism to update rewriting model

  2. RAG is often compared with Fine-tuning (FT) and prompt engineering. Each method has distinct characteristics as illustrated in Figure 4.

  3. In the context of RAG, it is crucial to efficiently retrieve relevant documents from the data source. There are several key issues involved, such as the retrieval source, retrieval granularity, pre-processing of the retrieval, and selection of the corresponding embedding model.

相关推荐
我爱一条柴ya4 分钟前
【AI大模型】神经网络反向传播:核心原理与完整实现
人工智能·深度学习·神经网络·ai·ai编程
y_y_liang28 分钟前
图生生AI商品换背景,高效商拍!
大数据·人工智能·ai·ai作画
敖行客 Allthinker2 小时前
云原生安全观察:零信任架构与动态防御的下一代免疫体系
安全·ai·云原生·架构·kubernetes·ebpf
带刺的坐椅5 小时前
Java MCP 实战:构建跨进程与远程的工具服务
java·ai·solon·mcp
小付爱coding5 小时前
SpringAIAlibaba正式版发布!
ai
令狐少侠20118 小时前
ai之对接电信ds后端服务,通过nginx代理转发https为http,对外请求,保持到达第三方后请求头不变
nginx·ai·https
产品经理独孤虾16 小时前
人工智能大模型如何助力电商产品经理打造高效的商品工业属性画像
人工智能·机器学习·ai·大模型·产品经理·商品画像·商品工业属性
海豚调度1 天前
Linux 基金会报告解读:开源 AI 重塑经济格局,有人失业,有人涨薪!
大数据·人工智能·ai·开源
令狐少侠20111 天前
ai之RAG本地知识库--基于OCR和文本解析器的新一代RAG引擎:RAGFlow 认识和源码剖析
人工智能·ai
小屁妞1 天前
Spring AI Alibaba智能测试用例生成
ai·测试用例生成·ai生成测试用例