[Search Engine] Elasticsearch (Part 1): Index Creation, Data Insertion, and Request Examples

Contents

    • I. Index Design
      • 1. User index `user`
      • 2. Video index `video`
      • 3. Live room index `live_room`
    • II. Test Data
    • III. Search Examples and Expected Response Structure
      • 1. User search: sort by follower count, descending
      • 2. User search: sort by wealth level, ascending
      • 3. Video search: featured videos only, sorted by like count descending
      • 4. Video search: content match + featured first + sort by like count
      • 5. Live room search: match title "Tai Chi", sorted by streamer follower count descending
      • 6. Live room search: match streamer nickname "Zhang", sorted by live-streaming level ascending
      • 7. Combined sort example: users by follower count descending, then wealth level descending (multi-level sort)
    • IV. Cleaning Up the Indices

I. Index Design
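Every command in this post relies on two shell variables, `$ES` and `$AUTH`, that the text never defines. A minimal sketch of the setup follows. The URL and the credentials are placeholder assumptions, not values from the original, so substitute your own cluster address and account.

```bash
# Assumed placeholders: adjust both values to your own cluster before running anything.
ES="${ES:-http://localhost:9200}"   # base URL of the Elasticsearch node (assumption)
AUTH="${AUTH:-elastic:changeme}"    # user:password for HTTP basic auth (assumption)

echo "Elasticsearch endpoint: $ES"
```

Values already present in the environment are kept, so the snippet can be sourced repeatedly without overwriting a real configuration.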

1. User index user

New fields:

  • fans_count (long): follower count
  • wealth_level (integer): wealth level (1-100)
  • live_level (integer): live-streaming level (1-100)

Create the index:
curl -s -u "$AUTH" -X PUT "$ES/user" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "nickname_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "uid": { "type": "keyword" },
      "nickname": {
        "type": "text",
        "analyzer": "nickname_analyzer",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "avatar": { "type": "keyword", "index": false },
      "fans_count": { "type": "long" },
      "wealth_level": { "type": "integer" },
      "live_level": { "type": "integer" },
      "created_at": { "type": "date" }
    }
  }
}
' | jq .
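To see what `nickname_analyzer` actually produces, the index's `_analyze` endpoint can be queried once the index exists. The following is a sketch rather than part of the original post. It assumes `$ES` and `$AUTH` are set as in the other commands and only prints the request body otherwise. With the `standard` tokenizer and the `lowercase` filter, "Zhang Sanfeng" should come back as the tokens `zhang` and `sanfeng`, which is why a `match` query for "Zhang" finds this user regardless of case.

```bash
# Request body: run the custom analyzer of the user index over a sample nickname.
BODY='{ "analyzer": "nickname_analyzer", "text": "Zhang Sanfeng" }'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/user/_analyze" \
    -H 'Content-Type: application/json' -d "$BODY" | jq -r '.tokens[].token'
else
  echo "$BODY"
fi
```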

2. Video index video

New fields:

  • is_featured (boolean): whether the video is featured
  • like_count (long): cumulative like count (an existing field; make sure it is mapped as long)

Create the index:
curl -s -u "$AUTH" -X PUT "$ES/video" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "content_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "video_id": { "type": "keyword" },
      "title": { "type": "text", "analyzer": "content_analyzer" },
      "content": { "type": "text", "analyzer": "content_analyzer" },
      "author_uid": { "type": "keyword" },
      "duration": { "type": "integer" },
      "is_featured": { "type": "boolean" },
      "like_count": { "type": "long" },
      "publish_time": { "type": "date" }
    }
  }
}
' | jq .

3. Live room index live_room

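New fields:

  • anchor_fans_count (long): the streamer's follower count (stored redundantly to make sorting easy)
  • anchor_wealth_level (integer): the streamer's wealth level
  • anchor_live_level (integer): the streamer's live-streaming level

Create the index: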
curl -s -u "$AUTH" -X PUT "$ES/live_room" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "title_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "room_id": { "type": "keyword" },
      "title": {
        "type": "text",
        "analyzer": "title_analyzer",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 512 } }
      },
      "anchor_uid": { "type": "keyword" },
      "anchor_nickname": {
        "type": "text",
        "analyzer": "title_analyzer",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "status": { "type": "keyword" },
      "viewer_count": { "type": "integer" },
      "anchor_fans_count": { "type": "long" },
      "anchor_wealth_level": { "type": "integer" },
      "anchor_live_level": { "type": "integer" },
      "start_time": { "type": "date" }
    }
  }
}
' | jq .
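Because the streamer fields are copied into every live room document, they go stale when the corresponding user changes. One possible way to keep them in sync is the `_update_by_query` API. This is a sketch under that assumption, not something the original post specifies. It is meant to be run after the test data in the next section has been loaded, and the count used here is simply the current value from the user test data, so running it is harmless.

```bash
# Re-apply the follower count of streamer 1001 to all of that streamer's live rooms.
# In a real system the new value would come from the updated user document.
BODY='{
  "query": { "term": { "anchor_uid": "1001" } },
  "script": {
    "lang": "painless",
    "source": "ctx._source.anchor_fans_count = params.fans_count",
    "params": { "fans_count": 1500000 }
  }
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" -X POST "$ES/live_room/_update_by_query?refresh=true" \
    -H 'Content-Type: application/json' -d "$BODY" | jq .
else
  echo "$BODY"
fi
```

The trade-off of this design is that sorting stays a cheap single-index operation, at the cost of an extra write whenever a streamer's counters change.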

II. Test Data

User data (with follower counts and levels)

curl -s -u "$AUTH" -X POST "$ES/user/_doc" -H 'Content-Type: application/json' -d'
{"uid": "1001", "nickname": "Zhang Sanfeng", "avatar": "avatar1.jpg", "fans_count": 1500000, "wealth_level": 80, "live_level": 75, "created_at": "2024-01-01T00:00:00Z"}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/user/_doc" -H 'Content-Type: application/json' -d'
{"uid": "1002", "nickname": "Zhang Wuji", "avatar": "avatar2.jpg", "fans_count": 980000, "wealth_level": 92, "live_level": 88, "created_at": "2024-01-02T00:00:00Z"}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/user/_doc" -H 'Content-Type: application/json' -d'
{"uid": "1003", "nickname": "Linghu Chong", "avatar": "avatar3.jpg", "fans_count": 320000, "wealth_level": 65, "live_level": 70, "created_at": "2024-01-03T00:00:00Z"}
' | jq .

Video data (with featured flag and like counts)

curl -s -u "$AUTH" -X POST "$ES/video/_doc" -H 'Content-Type: application/json' -d'
{
  "video_id": "v001",
  "title": "Tai Chi Chuan Tutorial",
  "content": "Tai Chi is a traditional Chinese martial art that combines slow movements and deep breathing.",
  "author_uid": "1001",
  "duration": 600,
  "is_featured": true,
  "like_count": 125000,
  "publish_time": "2024-02-01T10:00:00Z"
}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/video/_doc" -H 'Content-Type: application/json' -d'
{
  "video_id": "v002",
  "title": "Python Programming for Beginners",
  "content": "Learn Python basics: variables, loops, functions, and data structures.",
  "author_uid": "1002",
  "duration": 1800,
  "is_featured": false,
  "like_count": 89000,
  "publish_time": "2024-02-10T14:30:00Z"
}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/video/_doc" -H 'Content-Type: application/json' -d'
{
  "video_id": "v003",
  "title": "Cooking Braised Pork Belly",
  "content": "Braised pork belly is a classic Chinese dish made with pork belly, sugar, and soy sauce.",
  "author_uid": "1003",
  "duration": 900,
  "is_featured": true,
  "like_count": 234000,
  "publish_time": "2024-02-15T18:00:00Z"
}
' | jq .

Live room data (with the streamer's follower count and levels)

curl -s -u "$AUTH" -X POST "$ES/live_room/_doc" -H 'Content-Type: application/json' -d'
{
  "room_id": "r001",
  "title": "Tai Chi Class with Zhang Sanfeng",
  "anchor_uid": "1001",
  "anchor_nickname": "Zhang Sanfeng",
  "status": "live",
  "viewer_count": 1234,
  "anchor_fans_count": 1500000,
  "anchor_wealth_level": 80,
  "anchor_live_level": 75,
  "start_time": "2024-03-01T20:00:00Z"
}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/live_room/_doc" -H 'Content-Type: application/json' -d'
{
  "room_id": "r002",
  "title": "Qian Kun Great Move by Zhang Wuji",
  "anchor_uid": "1002",
  "anchor_nickname": "Zhang Wuji",
  "status": "ended",
  "viewer_count": 5678,
  "anchor_fans_count": 980000,
  "anchor_wealth_level": 92,
  "anchor_live_level": 88,
  "start_time": "2024-03-02T19:00:00Z"
}
' | jq .

curl -s -u "$AUTH" -X POST "$ES/live_room/_doc" -H 'Content-Type: application/json' -d'
{
  "room_id": "r003",
  "title": "Linghu Chong Playing Guqin",
  "anchor_uid": "1003",
  "anchor_nickname": "Linghu Chong",
  "status": "live",
  "viewer_count": 890,
  "anchor_fans_count": 320000,
  "anchor_wealth_level": 65,
  "anchor_live_level": 70,
  "start_time": "2024-03-03T21:30:00Z"
}
' | jq .
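Newly indexed documents become searchable only after a refresh, which happens periodically rather than immediately. If the searches in the next section are run from a script right after the inserts, an explicit refresh avoids empty or partial results. A sketch, assuming the same `$ES` and `$AUTH` variables as the other commands:

```bash
INDICES="user,video,live_room"

# Send the requests only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  # Make all pending writes visible to search.
  curl -s -u "$AUTH" -X POST "$ES/$INDICES/_refresh" | jq .
  # Print the document count of each index.
  for idx in user video live_room; do
    curl -s -u "$AUTH" "$ES/$idx/_count" | jq -r --arg idx "$idx" '"\($idx): \(.count)"'
  done
else
  echo "would refresh: $INDICES"
fi
```

Each index should report 3 documents if every insert above was run exactly once. Because the inserts use `POST .../_doc`, running them again creates additional documents with new IDs.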

III. Search Examples and Expected Response Structure

1. User search: sort by follower count, descending

Request: search for users whose nickname contains "Zhang", sorted by follower count from high to low.

curl -s -u "$AUTH" "$ES/user/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "nickname": "Zhang" } },
  "sort": [
    { "fans_count": { "order": "desc" } }
  ]
}
' | jq .

Expected response structure (hits section, abbreviated):

{
  "hits": {
    "total": { "value": 2, "relation": "eq" },
    "max_score": null,
    "hits": [
      {
        "_index": "user",
        "_source": { "uid": "1001", "nickname": "Zhang Sanfeng", "fans_count": 1500000, ... },
        "sort": [1500000]
      },
      {
        "_index": "user",
        "_source": { "uid": "1002", "nickname": "Zhang Wuji", "fans_count": 980000, ... },
        "sort": [980000]
      }
    ]
  }
}
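The requirement quoted at the end of this post also asks for user search by uid, which none of the examples cover. Since `uid` is mapped as `keyword`, an exact `term` query is the natural fit. The sketch below is an addition, not the author's query, and the uid "1002" is an illustrative input taken from the test data.

```bash
# Exact lookup on the keyword field; no analysis is applied to the search term.
BODY='{
  "query": { "term": { "uid": "1002" } }
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/user/_search" \
    -H 'Content-Type: application/json' -d "$BODY" | jq '.hits.hits[]._source'
else
  echo "$BODY"
fi
```

With the test data above this should return the single user Zhang Wuji.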

2. User search: sort by wealth level, ascending

curl -s -u "$AUTH" "$ES/user/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "wealth_level": { "order": "asc" } }
  ]
}
' | jq .

3. Video search: featured videos only, sorted by like count descending

curl -s -u "$AUTH" "$ES/video/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "term": { "is_featured": true } },
  "sort": [
    { "like_count": { "order": "desc" } }
  ]
}
' | jq .

Expected result: v003 (like_count 234000) first, then v001 (like_count 125000).

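4. Video search: content match + featured first + sort by like count

Note that the `filter` clause in this query restricts the results to featured videos; it does not merely rank them ahead of non-featured ones.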

curl -s -u "$AUTH" "$ES/video/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": { "match": { "content": "Chinese" } },
      "filter": { "term": { "is_featured": true } }
    }
  },
  "sort": [
    { "like_count": { "order": "desc" } }
  ]
}
' | jq .
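If "featured first" is meant literally, so that non-featured videos still appear but after featured ones, one option is to drop the `filter` and sort on `is_featured` before `like_count`. The sketch below shows that variant. It is an alternative reading, not the original query. With the test data above it returns the same two videos, since only v001 and v003 mention "Chinese".

```bash
# Sort featured videos (true) ahead of non-featured ones, then by like count.
BODY='{
  "query": { "match": { "content": "Chinese" } },
  "sort": [
    { "is_featured": { "order": "desc" } },
    { "like_count":  { "order": "desc" } }
  ]
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/video/_search" \
    -H 'Content-Type: application/json' -d "$BODY" | jq .
else
  echo "$BODY"
fi
```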

5. Live room search: match title "Tai Chi", sorted by streamer follower count descending

curl -s -u "$AUTH" "$ES/live_room/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "title": "Tai Chi" } },
  "sort": [
    { "anchor_fans_count": { "order": "desc" } }
  ]
}
' | jq .
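A `match` query combines its terms with OR by default, so the query above also matches titles that contain only "Tai" or only "Chi". If both words are required, the `operator` parameter can be set to `and`. The sketch below additionally restricts results to rooms whose `status` is `live`, which is an assumption about what a live room search usually wants rather than something the original states.

```bash
# Require both terms in the title and keep only rooms that are currently live.
BODY='{
  "query": {
    "bool": {
      "must":   { "match": { "title": { "query": "Tai Chi", "operator": "and" } } },
      "filter": { "term": { "status": "live" } }
    }
  },
  "sort": [ { "anchor_fans_count": { "order": "desc" } } ]
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/live_room/_search" \
    -H 'Content-Type: application/json' -d "$BODY" | jq .
else
  echo "$BODY"
fi
```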

6. Live room search: match streamer nickname "Zhang", sorted by live-streaming level ascending

curl -s -u "$AUTH" "$ES/live_room/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "anchor_nickname": "Zhang" } },
  "sort": [
    { "anchor_live_level": { "order": "asc" } }
  ]
}
' | jq .
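The requirement asks for live room search by title, streamer uid, and streamer nickname together. One way to serve all three from a single search box is a `bool` query whose `should` clauses combine an exact `term` on `anchor_uid` with a `multi_match` over the two text fields. This is a sketch of one possible design, not the author's, and the keyword "1001" is an illustrative input.

```bash
# A single keyword is tried as an exact uid and as full text on title and nickname.
BODY='{
  "query": {
    "bool": {
      "should": [
        { "term": { "anchor_uid": "1001" } },
        { "multi_match": { "query": "1001", "fields": ["title", "anchor_nickname"] } }
      ],
      "minimum_should_match": 1
    }
  },
  "sort": [ { "anchor_live_level": { "order": "asc" } } ]
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/live_room/_search" \
    -H 'Content-Type: application/json' -d "$BODY" | jq .
else
  echo "$BODY"
fi
```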

7. Combined sort example: users by follower count descending, then wealth level descending (multi-level sort)

curl -s -u "$AUTH" "$ES/user/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "fans_count": { "order": "desc" } },
    { "wealth_level": { "order": "desc" } }
  ]
}
' | jq .
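Multi-level sorts pair naturally with `search_after` pagination, where the `sort` array of the last hit on one page becomes the cursor for the next page. The sketch below makes two assumptions beyond the original: a page size of 2, and `uid` added as a final tie-breaker so that the ordering is total. The cursor values are those of user 1002 from the test data, the last hit of the first page.

```bash
# Second page of the multi-level sort, resuming after user 1002.
BODY='{
  "size": 2,
  "query": { "match_all": {} },
  "sort": [
    { "fans_count":   { "order": "desc" } },
    { "wealth_level": { "order": "desc" } },
    { "uid":          { "order": "asc" } }
  ],
  "search_after": [980000, 92, "1002"]
}'

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" "$ES/user/_search" \
    -H 'Content-Type: application/json' -d "$BODY" | jq .
else
  echo "$BODY"
fi
```

With the three test users loaded once, this page should contain only user 1003.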

IV. Cleaning Up the Indices

curl -s -u "$AUTH" -X DELETE "$ES/user" | jq .
curl -s -u "$AUTH" -X DELETE "$ES/video" | jq .
curl -s -u "$AUTH" -X DELETE "$ES/live_room" | jq .
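The three deletions can also be issued as one request, since the delete index API accepts a comma-separated list of names. Adding `ignore_unavailable=true` keeps the call from failing when one of the indices has already been removed. A sketch under the same `$ES` and `$AUTH` assumptions:

```bash
INDICES="user,video,live_room"

# Send the request only when $ES is set, so the snippet can be dry-run safely.
if [ -n "${ES:-}" ]; then
  curl -s -u "$AUTH" -X DELETE "$ES/$INDICES?ignore_unavailable=true" | jq .
else
  echo "would delete: $INDICES"
fi
```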

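Recipe:

Requirement: I want to design a search engine that supports (1) searching users by uid and nickname; (2) searching videos by body text; (3) searching live rooms by title, streamer uid, and streamer nickname. Users and live rooms should be sortable by follower count, streamer wealth level, and streamer live-streaming level; videos should be sortable by featured flag and cumulative like count.

Question: how should the indices be designed?

PS: Also construct some test data, give the curl commands, and give the expected response structure (use the built-in analyzers).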

