Retrieval-Augmented Generation for Large Language Models: A Survey

Title: Retrieval-Augmented Generation for Large Language Models: A Survey

Authors: Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang

  1. By referencing external knowledge, RAG effectively reduces the problem of generating factually incorrect content. Its integration into LLMs has resulted in widespread adoption, establishing RAG as a key technology in advancing chatbots and enhancing the suitability of LLMs for real-world applications.

  2. The RAG research paradigm is continuously evolving, and the survey categorizes it into three stages: Naive RAG, Advanced RAG, and Modular RAG.

  3. Naive RAG (a minimal end-to-end sketch follows the three steps below):

Indexing. The process starts with the cleaning and extraction of raw data, which is then segmented into chunks, encoded into vector representations with an embedding model, and stored in a vector database.

Retrieval. Upon receipt of a user query, the RAG system employs the same encoding model utilized during the indexing phase to transform the query into a vector representation. It then computes similarity scores between the query vector and the indexed chunk vectors and retrieves the top-K most similar chunks as context.

Generation. The posed query and selected documents are synthesized into a coherent prompt to which a large language model is tasked with formulating a response.
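A minimal end-to-end sketch of this indexing-retrieval-generation loop, assuming a toy hashed bag-of-words `embed()` in place of a real embedding model and a placeholder `llm()` standing in for any chat-completion API (both helpers are illustrative, not from the paper):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy hashed bag-of-words vector; a real system would use a sentence-embedding model."""
    vec = np.zeros(256)
    for token in text.lower().split():
        vec[hash(token) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def llm(prompt: str) -> str:
    """Placeholder for a large language model call."""
    return f"[answer conditioned on a prompt of {len(prompt)} characters]"

# Indexing: chunk the cleaned text, encode each chunk, store (chunk, vector) pairs.
def build_index(text: str, chunk_size: int = 200) -> list[tuple[str, np.ndarray]]:
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return [(c, embed(c)) for c in chunks]

# Retrieval: encode the query with the same model, rank chunks by cosine similarity.
def retrieve(query: str, index: list[tuple[str, np.ndarray]], k: int = 3) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: float(np.dot(q, item[1])), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Generation: synthesize the query and the selected chunks into one prompt.
def generate(query: str, index: list[tuple[str, np.ndarray]]) -> str:
    context = "\n".join(retrieve(query, index))
    prompt = f"Answer using the context below.\nContext:\n{context}\n\nQuestion: {query}"
    return llm(prompt)

index = build_index("RAG couples a retriever with a generator. External documents ground the answer.")
print(generate("What does RAG couple together?", index))
```

In practice the in-memory list would be a vector database, and the same embedding model must be used at both indexing and query time, which is exactly the constraint the Retrieval step above points out.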

Advanced RAG introduces specific improvements to overcome the limitations of Naive RAG. Focusing on enhancing retrieval quality, it employs pre-retrieval and post-retrieval strategies.

Pre-retrieval process. In this stage, the primary focus is on optimizing the indexing structure and the original query. The goal of optimizing indexing is to enhance the quality of the content being indexed.
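One common index-side optimization is overlapping (sliding-window) chunking plus simple metadata on each chunk, so that chunk boundaries are less likely to cut relevant context in half and metadata can be filtered on at query time; a small sketch, with window size, stride, and field names chosen purely for illustration:

```python
def sliding_window_chunks(text: str, window: int = 200, stride: int = 100) -> list[dict]:
    """Overlapping chunks, each carrying lightweight metadata for later filtering."""
    chunks = []
    for start in range(0, max(len(text) - window, 0) + 1, stride):
        chunks.append({
            "text": text[start:start + window],
            "start": start,          # character offset, handy for citing the source span
            "source": "doc-001",     # illustrative metadata field
        })
    return chunks

print(len(sliding_window_chunks("x" * 1000)))  # overlapping windows over a 1000-character document
```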

Post-Retrieval Process. Once relevant context is retrieved, it is crucial to integrate it effectively with the query; the main methods here are re-ranking the retrieved chunks and compressing the context.
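A minimal sketch of those two post-retrieval operations, assuming the retriever already returns (chunk, score) pairs and using a crude character budget where a real system would count tokens:

```python
def postprocess(retrieved: list[tuple[str, float]], char_budget: int = 600) -> str:
    """Re-rank chunks by relevance score, then keep only what fits the budget."""
    reranked = sorted(retrieved, key=lambda item: item[1], reverse=True)
    kept, used = [], 0
    for chunk, _score in reranked:
        if used + len(chunk) > char_budget:
            break
        kept.append(chunk)
        used += len(chunk)
    return "\n".join(kept)

candidates = [("chunk about indexing", 0.42),
              ("chunk about retrieval", 0.91),
              ("chunk about chunking", 0.63)]
print(postprocess(candidates, char_budget=50))  # keeps the two highest-scoring chunks that fit
```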

  1. Innovations such as the Rewrite-Retrieve-Read [7] model leverage the LLM's capabilities to refine retrieval queries through a rewriting module and an LM-feedback mechanism that updates the rewriting model.
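Structurally, this amounts to placing a query-rewriting step in front of the ordinary retrieve-then-read loop; the sketch below uses placeholder `rewrite()`, `retrieve()`, and `llm()` helpers and does not reproduce the trainable rewriter or the LM-feedback update described in [7]:

```python
def rewrite(query: str) -> str:
    """Placeholder rewriter; in [7] this is a small trainable model tuned with LM feedback."""
    return f"search query: {query}"

def retrieve(query: str) -> list[str]:
    """Placeholder retriever returning document snippets for the rewritten query."""
    return [f"snippet retrieved for '{query}'"]

def llm(prompt: str) -> str:
    """Placeholder reader LLM."""
    return f"[answer grounded in: {prompt}]"

def rewrite_retrieve_read(user_query: str) -> str:
    better_query = rewrite(user_query)            # Rewrite
    docs = retrieve(better_query)                 # Retrieve
    prompt = user_query + "\n" + "\n".join(docs)  # Read
    return llm(prompt)

print(rewrite_retrieve_read("when was the transformer architecture introduced?"))
```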

  2. RAG is often compared with Fine-tuning (FT) and prompt engineering. Each method has distinct characteristics as illustrated in Figure 4.

  3. In the context of RAG, it is crucial to efficiently retrieve relevant documents from the data source. There are several key issues involved, such as the retrieval source, retrieval granularity, pre-processing of the retrieval, and selection of the corresponding embedding model.
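Retrieval granularity, for instance, largely comes down to which text unit gets embedded and stored; a small sketch contrasting sentence-level and chunk-level units (the splitting rules are deliberately naive):

```python
import re

def sentence_units(text: str) -> list[str]:
    """Fine granularity: one retrieval unit per sentence (naive regex split)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_units(text: str, size: int = 120) -> list[str]:
    """Coarse granularity: fixed-size chunks keep more surrounding context per unit."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = ("RAG retrieves evidence before answering. "
       "Finer units are more precise. Coarser units keep more context.")
print(len(sentence_units(doc)), len(chunk_units(doc)))
```

Finer units tend to be more precise but carry less surrounding context, while coarser units keep context at the cost of noise, which is why granularity interacts with the pre-processing step and the choice of embedding model.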
