Elasticsearch: Intelligent Search - AI Builder and Skills

Imagine how we would answer a search request like the following:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

Requests like this are common in e-commerce site search, and they are a very practical way to search. There are several ways to implement this kind of search:

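  1. Implement the tools in Python and let the LLM call them. We call the LLM to extract the search parameters, and for precise results we then run the search through a search template. For details, see the article "统一 Elastic 向量数据库与 LLM 功能,实现智能查询".
  2. Create a custom MCP server in Python for this search and call it from a client. See the article "Elasticsearch:智能搜索的 MCP".
  3. Use AI Builder and Workflow. The workflow implements a templated search similar to a DSL search template, which gives precise results. For detailed instructions, see the article "Elasticsearch:智能搜索 - AI Builder 及 Workflow".
  4. Use AI Builder, Workflow, and Skills together. A geocoding workflow obtains the geographic location. For the implementation, see the article "Elasticsearch:智能搜索 - AI builder,workflow 及 skills".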

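Among the first three approaches above, the third is the simplest to implement because it requires no separate programming. We only need to create an agent and a workflow in Kibana, and maintenance is simple and direct. Is there an even more convenient way? Yes. In the upcoming Elastic Stack 9.4 (currently available in Elastic Serverless cloud), we can use skills to simplify things further.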

The fourth approach still uses a workflow to do the geocoding. Can we drop that step and do the geocoding inside a skill instead? Yes. The steps below show how.

Step 1: Ingest the data

We need to index the documents into Elasticsearch as described in the article "Elasticsearch:智能搜索的 MCP".
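The rest of this article relies on a handful of fields in the "properties" index. The sketch below lists them as a Python mapping dictionary. The field names are taken from the search template used later in this article. The types are assumptions for illustration only: the `geo_distance` filter needs a `geo_point` field, the `semantic` query needs a `semantic_text` field, and the range filters need numeric fields, but the choice of `float` and the constant names are hypothetical. The referenced article defines the real mapping.

```python
# Field names come from the search template in this article.
# Every type below is an assumption for illustration, not the
# mapping from the referenced article.
NUMERIC_FIELDS = (
    "tax",
    "maintenance_fee",
    "bathrooms",
    "bedrooms",
    "square_footage",
    "home_price",
)

PROPERTIES_MAPPING = {
    "properties": {
        "title": {"type": "text"},
        # Matched with a "match" query in the template
        "property_features": {"type": "text"},
        # Target of the geo_distance filter
        "location": {"type": "geo_point"},
        # Target of the "semantic" query
        "body_content_semantic_text": {"type": "semantic_text"},
        # Targets of the range filters (numeric type is assumed)
        **{name: {"type": "float"} for name in NUMERIC_FIELDS},
    }
}
```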

Step 2: Create property_search_skills

Configuration

  • ID: property_search_skills
  • Name: Property search skills
  • Description:
`

- Invoke python script to do geocoding
- Construct Elasticsearch queries using the DSL search template.

`
  • Instructions:
`

- Used to do the geocoding based on the queries
- How to construct DSL search template to search the "properties" index

`
  • Files
    • File name: property_search_skill

    • Folder path

      `./`
    • Content

      ``

      ---
      name: Property Search Skills
      description: Skills related to property search functionality, including geocoding and location-based queries.
      ---

      # Property Search Skills
      This document outlines the skills and tools used for implementing property search functionality, particularly focusing on geocoding and location-based queries.

      ## Geocoding Tool
      The geocoding tool is designed to convert addresses into geographic coordinates (latitude and longitude) using the Google Maps Geocoding API. This allows for location-based searches and queries in property search applications. The API to be used is the Google Maps Geocoding API, which requires an API key for authentication.

      ### Environment Variables
      To use the geocoding tool, the following environment variables need to be set:
      - `GOOGLE_MAPS_API_KEY`: Your Google Maps API key for accessing the Geocoding API.

      **geocode_tool.py is the script that implements the geocoding functionality. It takes an address as input and returns the corresponding latitude and longitude coordinates. The results can be stored in Elasticsearch for further querying and analysis in property search applications.**
      
      ```
      import os
      import sys
      import json
      import argparse
      import requests

      GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


      def geocode(address: str, api_key: str) -> dict:
          """Geocode an address and return a result dict with lat/lon keys."""
          params = {
              "address": address,
              "key": api_key
          }

          resp = requests.get(GEOCODE_URL, params=params, timeout=10)
          resp.raise_for_status()
          data = resp.json()

          # Any status other than "OK" is reported as a failure
          if data.get("status") != "OK":
              return {
                  "success": False,
                  "error": data.get("status"),
                  "raw": data
              }

          # Use the first match returned by the API
          result = data["results"][0]
          location = result["geometry"]["location"]

          return {
              "success": True,
              "formatted_address": result["formatted_address"],
              "location": {
                  "lat": location["lat"],
                  "lon": location["lng"]  # Google -> Elasticsearch format
              },
              "place_id": result.get("place_id"),
              "types": result.get("types")
          }


      def main():
          parser = argparse.ArgumentParser(description="Geocode an address")
          parser.add_argument("--address", help="Address to geocode")

          args = parser.parse_args()

          try:
              # Determine input source (stdin JSON takes priority).
              # An empty stdin falls back to --address instead of failing.
              address = None
              if not sys.stdin.isatty():
                  raw = sys.stdin.read().strip()
                  if raw:
                      address = json.loads(raw).get("address")
              if not address:
                  address = args.address

              if not address:
                  raise ValueError("Missing address (provide via --address or stdin JSON)")

              api_key = "<Your Google API Key>"

              # Alternative: read the key from the environment instead
              # api_key = os.environ.get("GOOGLE_MAPS_API_KEY")
              # if not api_key:
              #     raise ValueError("Missing GOOGLE_MAPS_API_KEY environment variable")

              result = geocode(address, api_key)

              print(json.dumps(result))
              sys.exit(0 if result["success"] else 1)

          except Exception as e:
              print(json.dumps({
                  "success": False,
                  "error": str(e)
              }))
              sys.exit(1)


      if __name__ == "__main__":
          main()
      ```
      
      ### Usage
      To run the geocoding tool, use the following command in your terminal:
      ```bash
      python geocode_tool.py --address "1600 Amphitheatre Parkway, Mountain View, CA"
      ```

      This command will geocode the provided address and return the corresponding latitude and longitude coordinates. The results can then be stored in Elasticsearch for further querying and analysis in property search applications.

      # DSL search templates
      In addition to geocoding, property search applications often require the ability to perform complex queries on the indexed property data. This can be achieved using Elasticsearch's Domain Specific Language (DSL) for searching and filtering data based on various criteria such as location, price range, property type, etc.

      The details for implementing a search template can be found at https://www.elastic.co/docs/solutions/search/search-templates.

      For our DSL search, we need to use the following search template:
      
      {
          "_source": false,
          "size": 5,
          "fields": ["title", "tax", "maintenance_fee", "bathrooms", "bedrooms", "square_footage", "home_price", "property_features"],
          "retriever": {
              "standard": {
                  "query": {
                      "semantic": {
                          "field": "body_content_semantic_text",
                          "query": "{{query}}"
                      }
                  },
                  "filter": {
                      "bool": {
                          "must": [
                              {{#distance}}{
                                  "geo_distance": {
                                      "distance": "{{distance}}",
                                      "location": {
                                          "lat": {{latitude}},
                                          "lon": {{longitude}}
                                      }
                                  }
                              }{{/distance}}
                              {{#bedrooms}}{{#distance}},{{/distance}}{
                                  "range": {
                                      "bedrooms": {
                                          "gte": {{bedrooms}}
                                      }
                                  }
                              }{{/bedrooms}}
                              {{#bathrooms}}{{#distance}}{{^bedrooms}},{{/bedrooms}}{{/distance}}{{#bedrooms}},{{/bedrooms}}{
                                  "range": {
                                      "bathrooms": {
                                          "gte": {{bathrooms}}
                                      }
                                  }
                              }{{/bathrooms}}
                              {{#tax}}{{#distance}}{{^bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{#bathrooms}},{{/bathrooms}}{
                                  "range": {
                                      "tax": {
                                          "lte": {{tax}}
                                      }
                                  }
                              }{{/tax}}
                              {{#maintenance}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{#tax}},{{/tax}}{
                                  "range": {
                                      "maintenance_fee": {
                                          "lte": {{maintenance}}
                                      }
                                  }
                              }{{/maintenance}}
                              {{#square_footage_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{#maintenance}},{{/maintenance}}{
                                  "range": {
                                      "square_footage": {
                                          "gte": {{#square_footage_min}}{{square_footage_min}}{{/square_footage_min}}{{^square_footage_min}}0{{/square_footage_min}},
                                          "lte": {{square_footage_max}}
                                      }
                                  }
                              }{{/square_footage_max}}
                              {{#home_price_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{#maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{#square_footage_max}},{{/square_footage_max}}{
                                  "range": {
                                      "home_price": {
                                          "gte": {{#home_price_min}}{{home_price_min}}{{/home_price_min}}{{^home_price_min}}0{{/home_price_min}},
                                          "lte": {{home_price_max}}
                                      }
                                  }
                              }{{/home_price_max}}
                              {{#feature}},{
                                  "bool": {
                                      "should": [
                                          {
                                              "match": {
                                                  "property_features": {
                                                      "query": "{{feature}}",
                                                      "operator": "or"
                                                  }
                                              }
                                          }
                                      ],
                                      "minimum_should_match": 1
                                  }
                              }{{/feature}}
                          ]
                      }
                  }
              }
          }
      }
      
      We need to use the "properties" index to do the search. **Please note the range searches for bedrooms and bathrooms.** For those, we want matches that are greater than or equal to the requested value. For the price, we want matches that are less than or equal to the requested value.

      ``

Note: set your own key in the `api_key = ""` line of the code above.

Then save the skill configuration above.
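The long chains of Mustache sections in the template exist only to place commas between whichever filters are present. As a reading aid, the sketch below builds the intended filter list in plain Python. It is an approximation, not the template itself: the function name `build_filters` is hypothetical, and Python truthiness is used in place of Mustache's section rules.

```python
from typing import Any


def build_filters(params: dict[str, Any]) -> list[dict[str, Any]]:
    """Build the filter clauses the search template is meant to render."""
    filters: list[dict[str, Any]] = []

    # Geo filter: only applied when a distance is supplied
    if params.get("distance"):
        filters.append({
            "geo_distance": {
                "distance": params["distance"],
                "location": {
                    "lat": params["latitude"],
                    "lon": params["longitude"],
                },
            }
        })

    # "At least" constraints: stored value >= requested value
    for name in ("bedrooms", "bathrooms"):
        if params.get(name):
            filters.append({"range": {name: {"gte": params[name]}}})

    # "At most" constraints: stored value <= requested value
    for name, field in (("tax", "tax"), ("maintenance", "maintenance_fee")):
        if params.get(name):
            filters.append({"range": {field: {"lte": params[name]}}})

    # Min/max ranges: the lower bound defaults to 0 when no minimum is given
    for prefix in ("square_footage", "home_price"):
        if params.get(f"{prefix}_max"):
            low = params.get(f"{prefix}_min") or 0
            filters.append(
                {"range": {prefix: {"gte": low, "lte": params[f"{prefix}_max"]}}}
            )

    # Feature match: any of the words in the feature string may match
    if params.get("feature"):
        filters.append({
            "bool": {
                "should": [{
                    "match": {
                        "property_features": {
                            "query": params["feature"],
                            "operator": "or",
                        }
                    }
                }],
                "minimum_should_match": 1,
            }
        })

    return filters
```

Building the list in code and serializing it with `json.dumps` avoids the comma bookkeeping entirely, which is the trade-off the template makes in exchange for being usable directly as a stored search template.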

Step 3: Create the property_search_skill agent

We create the agent with the following steps:

Configuration:

  • ID: property_search_skill

  • Custom Instructions

    `

    This agent is used to search for properties:

    # Step 1:
    You are an information extraction assistant.
            Extract real estate search parameters from the user query.

            Parameter descriptions:
            - bathrooms: Number of bathrooms
            - bedrooms: Number of bedrooms
            - tax: Real estate tax amount
            - maintenance: Maintenance fee amount
            - square_footage_min: Minimum property square footage. If only a max square footage is provided, set this to 0. Otherwise set this to the minimum square footage specified by the user.
            - square_footage_max: Maximum property square footage
            - home_price_min: Minimum home price. If only a max home price is provided, set this to 0. Otherwise set this to the minimum home price specified by the user.
            - home_price_max: Maximum home price
            - property_features: Home features such as AC, pool, updated kitchens, etc should be listed as a single string.
            - location: City, state, or full address if present.

            Rules:
            - Only include parameters explicitly mentioned.
            - property_features must be a single space-separated string.
            - Return ONLY a JSON object (not a string, no quotes, no extra text, no explanations).
            - Do not include explanations.

            Example JSON:
            {
              "query": "Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.",
              "bathrooms": 2,
              "bedrooms": 2,
              "home_price_min": 0,
              "home_price_max": 300000,
              "property_features": "central air tile floors",
              "location": "Miami, Florida"
            }

    # Step 2:
    - Use the above constructed JSON format, and do a DSL template search. If you need to convert it to ES|QL queries, please do follow exactly the DSL template search ranges:
    1. bathrooms is bigger or equal to the extracted one
    2. bedrooms is bigger or equal to the extracted one
    3. home_price is smaller or equal to the extracted one (home_price_max)

    - Before you do the searches, please DO refer to the requirements specified by the property search skills/property_search_skill.md.

    - Please print out the search template used for search, and then print out the top **4 results** for viewing.

    `
  • Display name: Property search skills

  • Display description: Search for property

Add skills:

We add the skill we created to the agent we created.
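The JSON extracted in Step 1 of the custom instructions and the parameters expected by the search template do not use exactly the same names: the extraction has `property_features` and `location`, while the template expects `feature`, `latitude`, `longitude`, and `distance`. In this setup the agent does that mapping itself. The sketch below only illustrates the data flow, and it rests on assumptions: the function name `to_template_params` is hypothetical, the distance string (for example "10mi") is passed in separately because it is not part of the extraction schema, and the coordinates in the usage example are illustrative values rather than a captured API response.

```python
from typing import Any, Optional

# Keys that the extraction JSON and the search template share unchanged
PASS_THROUGH = (
    "query", "bedrooms", "bathrooms", "tax", "maintenance",
    "square_footage_min", "square_footage_max",
    "home_price_min", "home_price_max",
)


def to_template_params(
    extracted: dict[str, Any],
    geocoded: dict[str, Any],
    distance: Optional[str] = None,
) -> dict[str, Any]:
    """Combine the extraction JSON and geocode_tool.py output into template params."""
    params = {key: extracted[key] for key in PASS_THROUGH if key in extracted}

    # The template calls this parameter "feature"
    if "property_features" in extracted:
        params["feature"] = extracted["property_features"]

    # Add the geo filter inputs only when geocoding succeeded
    if distance and geocoded.get("success"):
        params["distance"] = distance
        params["latitude"] = geocoded["location"]["lat"]
        params["longitude"] = geocoded["location"]["lon"]

    return params


# Usage with the Miami example (coordinates are illustrative)
extracted = {
    "query": "Find a home within 10 miles of Miami, Florida that has 2 bedrooms, "
             "2 bathrooms, central air, and tile floors, with a budget up to $300,000.",
    "bathrooms": 2,
    "bedrooms": 2,
    "home_price_min": 0,
    "home_price_max": 300000,
    "property_features": "central air tile floors",
    "location": "Miami, Florida",
}
geocoded = {"success": True, "location": {"lat": 25.7617, "lon": -80.1918}}
params = to_template_params(extracted, geocoded, distance="10mi")
```

If geocoding fails, the sketch simply leaves out the geo parameters, so the template's `geo_distance` section is not rendered and the remaining filters still apply.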

Testing

We test with the same test case as before:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

Here is a second example:

`Find a home within 10 miles of DeBary, Florida with 5 bedrooms, at least 2 bathrooms, central air, and a garage, with a budget up to $600,000.`
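When checking the agent's behavior on a query like this, it can help to validate the Step 1 extraction against the rules in the custom instructions. The sketch below is one way to do that. The function name `validate_extraction` is hypothetical, and the `expected` dictionary is what the rules would plausibly yield for the DeBary query, not captured agent output.

```python
from typing import Any

# Parameter names listed in the agent's custom instructions, plus "query"
ALLOWED_KEYS = {
    "query", "bathrooms", "bedrooms", "tax", "maintenance",
    "square_footage_min", "square_footage_max",
    "home_price_min", "home_price_max",
    "property_features", "location",
}
NUMERIC_KEYS = ALLOWED_KEYS - {"query", "property_features", "location"}


def validate_extraction(obj: Any) -> list[str]:
    """Return a list of rule violations; an empty list means the object looks valid."""
    if not isinstance(obj, dict):
        return ["not a JSON object"]

    problems = []
    for key in sorted(set(obj) - ALLOWED_KEYS):
        problems.append(f"unexpected key: {key}")
    for key in sorted(NUMERIC_KEYS & set(obj)):
        value = obj[key]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key} must be a number")
    if "property_features" in obj and not isinstance(obj["property_features"], str):
        problems.append("property_features must be a single string")
    return problems


# Plausible extraction for the DeBary query (an assumption, not agent output)
expected = {
    "query": "Find a home within 10 miles of DeBary, Florida with 5 bedrooms, "
             "at least 2 bathrooms, central air, and a garage, with a budget up to $600,000.",
    "bedrooms": 5,
    "bathrooms": 2,
    "home_price_min": 0,
    "home_price_max": 600000,
    "property_features": "central air garage",
    "location": "DeBary, Florida",
}
print(validate_extraction(expected))
```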

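Conclusion

This example shows how powerful skills are. We avoided both the tedious code and the workflow creation: a skill alone was enough. Skills are loaded only when they are needed, which saves memory.

Happy learning!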
