Retrieval-Augmented Generation for Large Language Models: A Survey

Title: Retrieval-Augmented Generation for Large Language Models: A Survey

Authors: Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang

  1. By referencing external knowledge, RAG effectively reduces the problem of generating factually incorrect content. Its integration into LLMs has resulted in widespread adoption, establishing RAG as a key technology in advancing chatbots and enhancing the suitability of LLMs for real-world applications.

  2. The RAG research paradigm is continuously evolving, and we categorize it into three stages: Naive RAG, Advanced RAG, and Modular RAG.

  3. The Naive RAG:

Indexing. Indexing starts with the cleaning and extraction of raw data.

Retrieval. Upon receipt of a user query, the RAG system employs the same encoding model utilized during the indexing phase to transform the query into a vector representation.

Generation. The posed query and selected documents are synthesized into a coherent prompt to which a large language model is tasked with formulating a response.
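The three Naive RAG steps above can be sketched in a few lines of Python. This is a minimal illustration, not the survey's implementation: `embed` is a toy bag-of-words encoder standing in for a real embedding model, all function names are hypothetical, and the final call to a language model is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy encoder (assumption): word-count vector instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(raw_docs: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: clean the raw text, then store each document with its vector."""
    cleaned = [" ".join(doc.split()) for doc in raw_docs]
    return [(doc, embed(doc)) for doc in cleaned if doc]


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Retrieval: encode the query with the same encoder and rank by similarity."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Generation (input side): merge the query and selected documents into one prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


docs = [
    "  RAG retrieves external   documents before generation. ",
    "Fine-tuning updates model weights on task data.",
    "Bananas are rich in potassium.",
]
question = "How does RAG use external documents?"
index = build_index(docs)
top = retrieve(question, index, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Using one `embed` function for both indexing and querying mirrors the point made in the retrieval step: query and documents must live in the same vector space for the similarity score to mean anything.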

Advanced RAG introduces specific improvements to overcome the limitations of Naive RAG. Focusing on enhancing retrieval quality, it employs pre-retrieval and post-retrieval strategies.

Pre-retrieval process. In this stage, the primary focus is on optimizing the indexing structure and the original query. The goal of optimizing indexing is to enhance the quality of the content being indexed.

Post-Retrieval Process. Once relevant context is retrieved, it's crucial to integrate it effectively with the query.
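As one possible reading of the post-retrieval step, the sketch below reranks retrieved chunks against the query and then keeps only as many as fit a size budget. The scoring rule (fraction of query terms present in a chunk) and the word-count budget are illustrative assumptions, and the names `rerank` and `fit_budget` are hypothetical; a real system would typically use a learned reranker and a token-based limit.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set used for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank(query: str, chunks: list[str]) -> list[str]:
    """Order chunks by the fraction of query terms they contain (assumed score)."""
    q = tokens(query)

    def score(chunk: str) -> float:
        return len(q & tokens(chunk)) / len(q) if q else 0.0

    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(chunks, key=score, reverse=True)


def fit_budget(chunks: list[str], max_words: int) -> list[str]:
    """Keep chunks in ranked order until the next one would exceed the word budget."""
    kept, used = [], 0
    for chunk in chunks:
        n = len(chunk.split())
        if used + n > max_words:
            break
        kept.append(chunk)
        used += n
    return kept


retrieved = [
    "Vector databases store embeddings.",
    "Reranking moves the most relevant chunks to the front of the prompt.",
    "Chunk reranking improves prompt relevance.",
]
query = "Why rerank chunks for the prompt?"
ordered = rerank(query, retrieved)
context = fit_budget(ordered, max_words=18)
print(context)
```

Placing the highest-scoring chunks first and dropping the tail keeps the prompt short while preserving the material most likely to matter for the query.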

  1. Innovations such as the Rewrite-Retrieve-Read [7] model leverage the LLM's capabilities to refine retrieval queries through a rewriting module and an LM-feedback mechanism that updates the rewriting model.

  2. RAG is often compared with Fine-tuning (FT) and prompt engineering. Each method has distinct characteristics as illustrated in Figure 4.

  3. In the context of RAG, it is crucial to efficiently retrieve relevant documents from the data source. There are several key issues involved, such as the retrieval source, retrieval granularity, pre-processing of the retrieval, and selection of the corresponding embedding model.
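To make the retrieval granularity issue concrete, the sketch below splits the same text at two granularities: one chunk per sentence, and fixed-size word windows with overlap. Both splitters are simplified assumptions (a regex sentence boundary and whitespace word counts), and the function names are hypothetical.

```python
import re


def sentence_chunks(text: str) -> list[str]:
    """Fine granularity: one chunk per sentence, split after ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def window_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Coarser granularity: windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


text = "RAG retrieves documents. It then builds a prompt. The model answers."
fine = sentence_chunks(text)
coarse = window_chunks(text, size=6, overlap=2)
print(fine)
print(coarse)
```

Smaller chunks tend to match a query more precisely but carry less surrounding context, while larger overlapping windows keep context at the cost of more redundant text in the prompt.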
