Elasticsearch: RAG without the "AG"?

Author: Gustavo Llermaly, Elastic

Learn how to use semantic search and ELSER to build a powerful, visually appealing Q&A experience without using LLMs.

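You have probably heard of RAG (Retrieval Augmented Generation), a common strategy for combining your own documents with an LLM to generate human-like answers. In this article, we explore how far you can get without an LLM, relying only on semantic search and ELSER.

If you are new to RAG, you can read this article about RAG on Azure and AI, this article about RAG with Mistral, or this article about building a RAG application with Amazon Bedrock.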

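Drawbacks of LLMs

While LLMs can answer in human-like language rather than just returning documents, there are several things to consider when implementing them:

Cost: LLM services charge per token, and running a model locally requires dedicated hardware
Latency: adding an LLM step increases response time
Privacy: with cloud-hosted models, your information is sent to a third party, which raises a range of privacy concerns
Management: adding an LLM means handling a new technical component, including different model providers and versions, prompt engineering, hallucinations, and so on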

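ELSER is all you need

If you think about it, a RAG system is only as good as the search engine behind it. Reading a fitting answer right after searching is nicer than getting a list of results, but the real value lies in:

Querying with a question instead of keywords, so the documents do not need to contain the exact same words, because the system "understands" the meaning.
Not having to read the whole text to get the information you need, because the LLM finds the answer in the context you provide and surfaces it.

With this in mind, we can get very good results by using ELSER for semantic search to retrieve the relevant information, and by structuring the documents so that, once the user types a question, the interface leads them straight to the answer without reading the full text.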

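To get these benefits, we will use the [semantic_text](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text "semantic_text") field, text chunking, and semantic highlighting. To learn about the latest semantic_text features, we recommend reading this article.

We will build an app with Streamlit to bring everything together. It should look like this:

The goal is to ask a question, get the sentence from the original documents that answers it, and view the context of that sentence with a button. To improve the user experience, we will also add some metadata, such as the article thumbnail, title, and link to the source. That way, the card that displays the answer will differ depending on the article.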

Requirements:

An Elastic Serverless instance. You can start a trial here
Python

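As for the document structure, we will index Wikipedia pages so that each section of an article becomes one Elasticsearch document, and each document is in turn chunked into sentences. This lets us pinpoint the sentence that answers the question and refer back to that sentence's context.

These are the steps to create and test the app:

Configure the inference endpoint
Configure the mappings
Upload documents
Create the app
Final test

This article only includes the main blocks; you can access the full code repository here.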

Configure the inference endpoint

Import dependencies

```python
from elasticsearch import Elasticsearch
import os

# Replace the placeholders with your own endpoint and API key
os.environ["ELASTIC_ENDPOINT"] = "your_es_endpoint"
os.environ["ELASTIC_API_KEY"] = "your_es_key"

es = Elasticsearch(
    os.environ["ELASTIC_ENDPOINT"],
    api_key=os.environ["ELASTIC_API_KEY"],
)

INDEX_NAME = "wikipedia"
```

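First, we configure the inference endpoint, where we define ELSER as our model and establish the chunking settings:

strategy: can be "sentence" or "word". We choose "sentence" so that every chunk contains complete sentences; highlights therefore cover phrases rather than single words, which makes answers read more fluently.
max_chunk_size: the maximum number of words in each chunk.
sentence_overlap: the number of overlapping sentences. It can be 0 or 1. We set it to 0 for better highlight precision. A value of 1 is recommended when you want to capture adjacent content.

You can read this article to learn more about chunking strategies.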
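To build intuition for these settings, here is a minimal pure-Python sketch of sentence-based chunking. It is not Elasticsearch's implementation: the regex sentence splitter, the function name `chunk_by_sentence`, and the rule for when an overlapping sentence is carried over are all assumptions made for illustration.

```python
import re


def chunk_by_sentence(text, max_chunk_size=25, sentence_overlap=0):
    """Group whole sentences into chunks of at most max_chunk_size words.

    Illustrative only: a naive regex stands in for real sentence detection,
    and a single sentence longer than max_chunk_size is kept whole.
    """
    if sentence_overlap not in (0, 1):
        raise ValueError("sentence_overlap must be 0 or 1")

    def words(sentence):
        return len(sentence.split())

    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current = [], []
    for sentence in sentences:
        total = sum(words(s) for s in current) + words(sentence)
        if current and total > max_chunk_size:
            chunks.append(" ".join(current))
            # With overlap 1, repeat the previous sentence in the next chunk,
            # but only if it still fits next to the new sentence (assumption).
            carried = current[-1:] if sentence_overlap == 1 else []
            if carried and words(carried[0]) + words(sentence) > max_chunk_size:
                carried = []
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "Messi was born in Rosario. He joined Barcelona at age 13. "
    "He made his competitive debut in October 2004."
)
for overlap in (0, 1):
    print(f"sentence_overlap={overlap}")
    for chunk in chunk_by_sentence(sample, max_chunk_size=15, sentence_overlap=overlap):
        print("  ", chunk)
```

With `sentence_overlap=0` every sentence lands in exactly one chunk, which is why a highlight maps cleanly to a single fragment. With `sentence_overlap=1` the middle sentence appears in both chunks, trading highlight precision for more surrounding context.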

```python
# Create the ELSER inference endpoint with sentence-based chunking
es.options(request_timeout=60, max_retries=3, retry_on_timeout=True).inference.put(
    task_type="sparse_embedding",
    inference_id="wiki-inference",
    body={
        "service": "elasticsearch",
        "service_settings": {
            "adaptive_allocations": {"enabled": True},
            "num_threads": 1,
            "model_id": ".elser_model_2",
        },
        "chunking_settings": {
            "strategy": "sentence",
            "max_chunk_size": 25,
            "sentence_overlap": 0,
        },
    },
)
```

Configure the mappings

We will configure the following fields to define our mappings:

```python
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "section_name": {"type": "text"},
            "content": {"type": "text", "copy_to": "semantic_content"},
            "wiki_link": {"type": "keyword"},
            "image_url": {"type": "keyword"},
            "section_order": {"type": "integer"},
            "semantic_content": {
                "type": "semantic_text",
                "inference_id": "wiki-inference",
            },
        }
    }
}

# Create the index with the mapping
if es.indices.exists(index=INDEX_NAME):
    es.indices.delete(index=INDEX_NAME)

es.indices.create(index=INDEX_NAME, body=mapping)
```

Make sure the content field is copied into our semantic_content field so that we can run semantic searches against it.
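As a quick sanity check before creating the index, the following sketch walks a mapping dictionary and reports any copy_to target that is not declared as a semantic_text field. The helper name `check_copy_to_targets` is hypothetical and specific to this setup (copy_to may legitimately point at other field types elsewhere); the mapping literal repeats only the relevant fields so the snippet runs on its own.

```python
def check_copy_to_targets(mapping):
    """Return (source, target) pairs whose copy_to target is not semantic_text."""
    properties = mapping["mappings"]["properties"]
    problems = []
    for name, definition in properties.items():
        targets = definition.get("copy_to", [])
        if isinstance(targets, str):
            targets = [targets]  # copy_to accepts a single name or a list
        for target in targets:
            if properties.get(target, {}).get("type") != "semantic_text":
                problems.append((name, target))
    return problems


# Reduced copy of the article's mapping, for illustration
demo_mapping = {
    "mappings": {
        "properties": {
            "content": {"type": "text", "copy_to": "semantic_content"},
            "semantic_content": {
                "type": "semantic_text",
                "inference_id": "wiki-inference",
            },
        }
    }
}

problems = check_copy_to_targets(demo_mapping)
print("copy_to problems:", problems)
```

An empty list means every copy_to target exists and is a semantic_text field, so the content will be chunked and embedded at index time.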

Upload documents

We will use the following script to upload documents from the Wikipedia page about Lionel Messi.

```python
# Define article metadata
title = "Lionel Messi"
wiki_link = "https://en.wikipedia.org/wiki/Lionel_Messi"
image_url = "https://upload.wikimedia.org/wikipedia/commons/b/b4/Lionel-Messi-Argentina-2022-FIFA-World-Cup_%28cropped%29.jpg"

# Define sections as array of objects
sections = [
    {
        "section_name": "Introduction",
        "content": """Lionel Andrés "Leo" Messi (Spanish pronunciation: [ljoˈnel anˈdɾes ˈmesi] ⓘ; born 24 June 1987) is an Argentine professional footballer who plays as a forward for and captains both Major League Soccer club Inter Miami and the Argentina national team. Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and eight times being named the world's best player by FIFA. He is the most decorated player in the history of professional football having won 45 team trophies, including twelve Big Five league titles, four UEFA Champions Leagues, two Copa Américas, and one FIFA World Cup. Messi holds the records for most European Golden Shoes (6), most goals in a calendar year (91), most goals for a single club (672, with Barcelona), most goals (474), hat-tricks (36) and assists (192) in La Liga, most assists (18) and goal contributions (32) in the Copa América, most goal contributions (21) in the World Cup, most international appearances (191) and international goals (112) by a South American male, and the second-most in the latter category outright. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country.""",
    },
    {
        "section_name": "Early Career at Barcelona",
        "content": """Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. He gradually established himself as an integral player for the club, and during his first uninterrupted season at age 22 in 2008--09 he helped Barcelona achieve the first treble in Spanish football. This resulted in Messi winning the first of four consecutive Ballons d'Or, and by the 2011--12 season he would set La Liga and European records for most goals in a season and establish himself as Barcelona's all-time top scorer. The following two seasons, he finished second for the Ballon d'Or behind Cristiano Ronaldo, his perceived career rival. However, he regained his best form during the 2014--15 campaign, where he became the all-time top scorer in La Liga, led Barcelona to a historic second treble, and won a fifth Ballon d'Or in 2015. He assumed Barcelona's captaincy in 2018 and won a record sixth Ballon d'Or in 2019. During his overall tenure at Barcelona, Messi won a club-record 34 trophies, including ten La Liga titles and four Champions Leagues, among others. Financial difficulties at Barcelona led to Messi signing with French club Paris Saint-Germain in August 2021, where he would win the Ligue 1 title during both of his seasons there. He joined Major League Soccer club Inter Miami in July 2023.""",
    },
    {
        "section_name": "International Career",
        "content": """An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler, drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor. At the youth level, he won the 2005 FIFA World Youth Championship and gold medal in the 2008 Summer Olympics. After his senior debut in 2005, Messi became the youngest Argentine to play and score in a World Cup in 2006. Assuming captaincy in 2011, he then led Argentina to three consecutive finals in the 2014 FIFA World Cup, the 2015 Copa América and the Copa América Centenario, all of which they would lose. After initially announcing his international retirement in 2016, he returned to help his country narrowly qualify for the 2018 FIFA World Cup, which they would exit early. Messi and the national team finally broke Argentina's 28-year trophy drought by winning the 2021 Copa América, which helped him secure his seventh Ballon d'Or that year. He then led Argentina to win the 2022 Finalissima, as well as the 2022 FIFA World Cup, his country's third overall world championship and first in 36 years. This followed with a record-extending eighth Ballon d'Or in 2023, and a victory in the 2024 Copa América.""",
    },
    # Add more sections as needed...
]

# Load each section as a separate document
for i, section in enumerate(sections):
    document = {
        "title": title,
        "section_name": section["section_name"],
        "content": section["content"],
        "wiki_link": wiki_link,
        "image_url": image_url,
        "section_order": i,
    }

    # Index the document
    es.index(index=INDEX_NAME, document=document)

# Refresh the index to make documents searchable immediately
es.indices.refresh(index=INDEX_NAME)
```
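The loop above sends one request per section, which is fine for three documents. For a full article with dozens of sections, a single bulk request is usually a better fit. The sketch below is one possible way to do that, not part of the original article: `build_actions` and `upload_sections` are hypothetical names, while `elasticsearch.helpers.bulk` is the Python client's bulk helper. The import sits inside `upload_sections` so the action-building part can be tried without a cluster.

```python
def build_actions(index_name, title, wiki_link, image_url, sections):
    """Yield one bulk action per article section."""
    for order, section in enumerate(sections):
        yield {
            "_index": index_name,
            "_source": {
                "title": title,
                "section_name": section["section_name"],
                "content": section["content"],
                "wiki_link": wiki_link,
                "image_url": image_url,
                "section_order": order,
            },
        }


def upload_sections(es, index_name, title, wiki_link, image_url, sections):
    """Index all sections with one bulk call, then refresh the index."""
    # Imported here so build_actions can be used without the client installed
    from elasticsearch import helpers

    actions = build_actions(index_name, title, wiki_link, image_url, sections)
    indexed, errors = helpers.bulk(es, actions)
    es.indices.refresh(index=index_name)
    return indexed, errors


# Dry run without a cluster: inspect the actions that would be sent
demo_sections = [
    {"section_name": "Introduction", "content": "First section text."},
    {"section_name": "International Career", "content": "Second section text."},
]
demo_actions = list(
    build_actions("wikipedia", "Lionel Messi", "wiki-link", "image-link", demo_sections)
)
for action in demo_actions:
    print(action["_source"]["section_order"], action["_source"]["section_name"])
```

With a live client, `upload_sections(es, INDEX_NAME, title, wiki_link, image_url, sections)` would replace the loop. Since inference runs at index time, large batches may need a longer request timeout.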

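Create the app

We will create an app where the user enters a question, the app searches Elasticsearch for the most relevant sentences, and highlighting is used to show the most relevant answer along with the section the sentence comes from. This way, the user can read the answer right away and then dig deeper, much like the citations included in LLM-generated answers.

Install dependencies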

```bash
pip install elasticsearch streamlit st-annotated-text
```

Let's start by creating a function that runs a semantic query for the question:

```python
# es.py
from elasticsearch import Elasticsearch
import os

# Replace the placeholders with your own endpoint and API key
os.environ["ELASTIC_ENDPOINT"] = "your_serverless_endpoint"
os.environ["ELASTIC_API_KEY"] = "your_search_key"

es = Elasticsearch(
    os.environ["ELASTIC_ENDPOINT"],
    api_key=os.environ["ELASTIC_API_KEY"],
)

INDEX_NAME = "wikipedia"


def ask(question):
    """Run a semantic query and return the best answer with its section."""
    print("asking question")
    print(question)
    response = es.search(
        index=INDEX_NAME,
        body={
            "size": 1,
            "query": {"semantic": {"field": "semantic_content", "query": question}},
            "highlight": {"fields": {"semantic_content": {}}},
        },
    )

    print("Hits", response)

    hits = response["hits"]["hits"]

    if not hits:
        print("No hits found")
        return None

    # Guard against a hit that comes back without highlight fragments
    highlights = hits[0].get("highlight", {}).get("semantic_content", [])
    if not highlights:
        print("No highlight found")
        return None

    answer = highlights[0]
    section = hits[0]["_source"]

    return {"answer": answer, "section": section}
```

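In the ask function, we return the first document, which corresponds to a full section, as the complete context, and the first fragment from the highlight as the answer. Both are sorted by _score, so the most relevant comes first.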
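To make the extraction concrete, here is an abridged, hand-written example of the kind of response the function reads from. The score and the shortened texts are invented for illustration (a real response carries more fields), and `extract_answer` is a hypothetical helper that mirrors the logic just described.

```python
# Abridged, hand-written response: the values are invented for illustration.
sample_response = {
    "hits": {
        "hits": [
            {
                "_score": 12.3,
                "_source": {
                    "title": "Lionel Messi",
                    "section_name": "Early Career at Barcelona",
                    "content": (
                        "Born in Rosario, Argentina, Messi relocated to Spain "
                        "to join Barcelona at age 13. He joined Major League "
                        "Soccer club Inter Miami in July 2023."
                    ),
                },
                "highlight": {
                    "semantic_content": [
                        "He joined Major League Soccer club Inter Miami in July 2023."
                    ]
                },
            }
        ]
    }
}


def extract_answer(response):
    """Return the top hit's first highlight and its section, or None."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    top_hit = hits[0]  # hits arrive sorted by _score, best first
    fragments = top_hit.get("highlight", {}).get("semantic_content", [])
    if not fragments:
        return None
    return {"answer": fragments[0], "section": top_hit["_source"]}


result = extract_answer(sample_response)
print(result["answer"])
print(result["section"]["section_name"])
```

The answer is the highlighted fragment, while the section's `content` provides the surrounding context that the UI can reveal on demand.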

Now we put everything together in a Streamlit app. To highlight the answer, we use annotated_text, a component that makes it easier to color and label the highlighted text.

```python
# ui.py
import streamlit as st
from es import ask
from annotated_text import annotated_text


def highlight_answer_in_section(section_text, answer):
    """Highlight the answer within the section text using annotated_text"""
    # If the answer is not found verbatim, show the section without a highlight
    if not answer or answer not in section_text:
        st.write(section_text)
        return

    before, after = section_text.split(answer, 1)

    # Render the text with the answer annotated
    annotated_text(
        before,
        (answer, "", "rgb(22 97 50)"),
        after,
    )


def main():
    st.title("Wikipedia Q&A System")
    question = st.text_input("Ask a question about Lionel Messi:")

    if question:
        try:
            # Get response from elasticsearch
            result = ask(question)

            if result and "section" in result:
                section = result["section"]
                answer = result["answer"]

                # Display article metadata
                col1, col2 = st.columns([1, 2])

                with col1:
                    st.image(
                        section["image_url"],
                        caption=section["title"],
                        use_container_width=True,
                    )

                with col2:
                    st.header(section["title"])
                    st.write(f"From section: {section['section_name']}")
                    st.write(f"[Read full article]({section['wiki_link']})")

                # Display the answer
                st.subheader("Answer:")
                st.markdown(answer)

                # Add toggle button for full context
                on = st.toggle("Show context")

                if on:
                    st.subheader("Full Context:")
                    highlight_answer_in_section(section["content"], answer)

            else:
                st.error("Sorry, I couldn't find a relevant answer to your question.")

        except Exception as e:
            st.error(f"An error occurred: {str(e)}")
            st.error("Please try again with a different question.")


if __name__ == "__main__":
    main()
```

Final test

To test, we just need to run the code and ask our question:

```bash
streamlit run ui.py
```

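Not bad! The quality of the answers will depend on our data and on how well the question correlates with the available sentences.

Conclusion

With a good document structure and a good semantic search model like ELSER, it is possible to build a Q&A experience that needs no LLM. It has its limitations, but it is an option worth trying, and it helps us understand our data better instead of handing everything to an LLM and hoping for the best.

In this article, we showed how semantic search, semantic highlighting, and a bit of Python code can get close to the results of a RAG system without an LLM and without its drawbacks, such as cost, latency, privacy, and management.

Elements like document structure and the user experience in the UI help compensate for the "human" touch you get from LLM-synthesized answers, while focusing on the vector database's ability to find the exact sentence that answers the question.

A possible next step is to add other data sources to create a richer experience, where Wikipedia is just one of several sources used to answer a question, similar to what Perplexity does.

That way, we can create an app that uses different platforms to provide a 360° view of the person or entity you searched for.

Are you ready to try it?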

Original article: RAG without "AG"? - Elasticsearch Labs