Retrieval-Augmented Generation for Large Language Models: A Survey

Title: Retrieval-Augmented Generation for Large Language Models: A Survey

Authors: Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang

  1. By referencing external knowledge, RAG effectively reduces the problem of generating factually incorrect content. Its integration into LLMs has resulted in widespread adoption, establishing RAG as a key technology in advancing chatbots and enhancing the suitability of LLMs for real-world applications.

  2. The RAG research paradigm is continuously evolving, and the survey categorizes it into three stages: Naive RAG, Advanced RAG, and Modular RAG.

  3. Naive RAG follows a traditional index-retrieve-generate process (a minimal end-to-end sketch follows these three steps):

Indexing. Indexing starts with the cleaning and extraction of raw data, which is then segmented into chunks, encoded into vector representations with an embedding model, and stored for later lookup.

Retrieval. Upon receipt of a user query, the RAG system employs the same encoding model utilized during the indexing phase to transform the query into a vector representation. It then computes similarity scores between the query vector and the indexed chunk vectors and retrieves the top-K most similar chunks as context.

Generation. The posed query and selected documents are synthesized into a coherent prompt to which a large language model is tasked with formulating a response.
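Putting the three steps together, the sketch below is a minimal, illustrative Naive RAG loop. The toy corpus, the hashing-based `embed` helper, and the `call_llm` placeholder are assumptions made for this sketch; a real system would plug in an actual embedding model, vector store, and LLM API.

```python
# Minimal Naive RAG sketch: index -> retrieve -> generate.
# `embed` and `call_llm` are placeholders standing in for a real
# embedding model and LLM API; only the control flow is the point.
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words hashing embedder (stand-in for a real encoder)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call."""
    return f"[LLM answer conditioned on a prompt of {len(prompt)} chars]"

# 1. Indexing: chunked data is encoded with the same model that will
#    later encode queries.
corpus = [
    "RAG augments LLMs with retrieved external knowledge.",
    "Naive RAG follows an index-retrieve-generate pipeline.",
    "Advanced RAG adds pre-retrieval and post-retrieval steps.",
]
index = np.stack([embed(chunk) for chunk in corpus])

# 2. Retrieval: encode the query and take the top-k chunks by cosine
#    similarity (vectors are normalized, so a dot product suffices).
def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)
    top = np.argsort(-scores)[:k]
    return [corpus[i] for i in top]

# 3. Generation: synthesize query and retrieved chunks into one prompt.
def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return call_llm(prompt)

print(answer("What are the stages of Naive RAG?"))
```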

Advanced RAG introduces specific improvements to overcome the limitations of Naive RAG. Focusing on enhancing retrieval quality, it employs pre-retrieval and post-retrieval strategies.

Pre-retrieval process. In this stage, the primary focus is on optimizing the indexing structure and the original query. The goal of optimizing indexing is to enhance the quality of the content being indexed.

Post-Retrieval Process. Once relevant context is retrieved, it is crucial to integrate it effectively with the query; the main post-retrieval methods include re-ranking the retrieved chunks and compressing the context.
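As a concrete illustration of one post-retrieval strategy, the sketch below re-ranks retrieved chunks before they are placed in the prompt. The `score_pair` function is a hypothetical stand-in (a simple token-overlap score) for a real re-ranker such as a cross-encoder; it is not taken from the survey.

```python
# Hedged sketch of post-retrieval re-ranking: `score_pair` stands in for
# a real cross-encoder re-ranker; here it is a trivial token-overlap score.
def score_pair(query: str, chunk: str) -> float:
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / (len(q) or 1)

def rerank(query: str, chunks: list[str], keep: int = 3) -> list[str]:
    """Order retrieved chunks by query relevance and keep only the best."""
    ranked = sorted(chunks, key=lambda ch: score_pair(query, ch), reverse=True)
    return ranked[:keep]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Most relevant context first, then the question.
    context = "\n".join(rerank(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```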

  1. Innovations such as the Rewrite-Retrieve-Read [7] model leverage the LLM's capabilities to refine retrieval queries through a rewriting module and an LM-feedback mechanism that updates the rewriting model.
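Below is a minimal sketch of the Rewrite-Retrieve-Read control flow, assuming the `call_llm` and `retrieve` helpers from the earlier Naive RAG sketch; the LM-feedback training of the rewriter is only indicated in a comment, since it does not reduce to a few lines.

```python
# Sketch of Rewrite-Retrieve-Read: an LLM rewrites the raw query before
# retrieval, then reads the retrieved context to answer. `call_llm` and
# `retrieve` are assumed helpers, not the paper's exact implementation.
def rewrite_retrieve_read(user_query: str) -> str:
    # Rewrite: let the LLM turn the user query into a retrieval-friendly query.
    rewritten = call_llm(
        f"Rewrite this question as a concise search query:\n{user_query}"
    )
    # Retrieve: search with the rewritten query instead of the original.
    context = "\n".join(retrieve(rewritten))
    # Read: answer the original question from the retrieved context.
    answer = call_llm(
        f"Context:\n{context}\n\nQuestion: {user_query}\nAnswer:"
    )
    # In the paper, downstream answer quality is fed back to update the
    # rewriting model; that training loop is omitted in this sketch.
    return answer
```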

  2. RAG is often compared with Fine-tuning (FT) and prompt engineering. Each method has distinct characteristics as illustrated in Figure 4.

  3. In the context of RAG, it is crucial to efficiently retrieve relevant documents from the data source. There are several key issues involved, such as the retrieval source, retrieval granularity, pre-processing of the retrieval, and selection of the corresponding embedding model.
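Retrieval granularity, in particular, is largely decided at indexing time by how documents are chunked. The sliding-window chunker below is a hypothetical illustration of that choice; the chunk size and overlap values are arbitrary examples, not recommendations from the survey.

```python
# Illustrative sliding-window chunker: retrieval granularity is controlled
# by chunk size and overlap, both arbitrary example values here.
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-window chunks for indexing."""
    words = text.split()
    step = max(chunk_size - overlap, 1)
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks

# Coarser chunks carry more context per hit; finer chunks give more precise
# matches but may lose surrounding information.
```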
