Retrieval-Augmented Generation for LargeLanguage Models: A Survey

标题:Retrieval-Augmented Generation for Large Language Models: A Survey

作者:Yunfan Gaoa , Yun Xiongb , Xinyu Gaob , Kangxiang Jiab , Jinliu Panb , Yuxi Bic , Yi Daia , Jiawei Suna , Meng Wangc , and Haofen Wang

  1. By referencing external knowledge, RAG effectively reduces the problem of generating factually incorrect content. Its integration into LLMs has resulted in widespread adoption, establishing RAG as a key technology in advancing chatbots and enhancing the suitability of LLMs for real-world applications

  2. The RAG research paradigm is continuously evolving, and we categorize it into three stages: Naive RAG, Advanced RAG, and Modular RAG

  3. The Naive RAG:

Indexing starts with the cleaning and extraction of raw data

Retrieval. Upon receipt of a user query, the RAG system employs the same encoding model utilized during the indexing phase to transform the query into a vector representation.

Generation. The posed query and selected documents are synthesized into a coherent prompt to which a large language model is tasked with formulating a response.

Advanced RAG introduces specific improvements to overcome the limitations of Naive RAG. Focusing on enhancing retrieval quality, it employs pre-retrieval and post-retrieval strategies.

Pre-retrieval process. In this stage, the primary focus is on optimizing the indexing structure and the original query. The goal of optimizing indexing is to enhance the quality of the content being indexed.

Post-Retrieval Process. Once relevant context is retrieved, it's crucial to integrate it effectively with the query

  1. Innovations such as the Rewrite-Retrieve-Read [7]model leverage the LLM's capabilities to refine retrieval queries through a rewriting module and a LM-feedback mechanism to update rewriting model

  2. RAG is often compared with Fine-tuning (FT) and prompt engineering. Each method has distinct characteristics as illustrated in Figure 4.

  3. In the context of RAG, it is crucial to efficiently retrieve relevant documents from the data source. There are several key issues involved, such as the retrieval source, retrieval granularity, pre-processing of the retrieval, and selection of the corresponding embedding model.

相关推荐
3Cloudream15 分钟前
互联网大厂AI大模型应用开发工程师求职面试:技术点深度解析
ai·面试·大模型·应用开发·开发工程师·技术解析
道一云黑板报1 小时前
Spark云原生流处理实战与风控应用
大数据·ai·云原生·spark·kubernetes·ai编程
阿里-于怀10 小时前
携程旅游的 AI 网关落地实践
人工智能·网关·ai·旅游·携程·higress·ai网关
数巨小码人16 小时前
AI+数据库:国内DBA职业发展与国产化转型实践
数据库·人工智能·ai·dba
即兴小索奇1 天前
AI升级社区便民服务:AI办事小程序高效办证+应急系统秒响应,告别跑腿愁住得更安心
ai·商业·ai商业洞察·即兴小索奇
咕咚-萌西1 天前
搭建一个简单的Agent
ai·agent
TDengine (老段)1 天前
TDengine IDMP 应用场景:电动汽车
大数据·数据库·物联网·ai·时序数据库·iot·tdengine
即兴小索奇1 天前
2025年AI Agent规模化落地:企业级市场年增超60%,重构商业作业流程新路径
人工智能·ai·商业·ai商业洞察·即兴小索奇
亲爱的程序猿2 天前
2025 主流 BPM 系统 AI 融合实践全景:大模型适配、核心功能与特色解析
ai·bpm·流程管理软件
网络研究院2 天前
AI代理需要数据完整性
人工智能·ai·数据·代理·完整性