[橘子ES] Aggregations: Preparation

1. The concept of aggregations

Aggregations documentation

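Aggregation is different from search. A search uses a set of conditions to retrieve matching documents from ES, whereas an aggregation goes one step further and processes the documents that the search matched.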

Put simply, an aggregation summarizes your data as metrics, statistics, or other analytics. Aggregations help answer questions such as:

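  • What is the average load time of my website?
  • Based on transaction volume, who are my most valuable customers?
  • What would be considered a large file on my network?
  • How many products are in each product category?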

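As these questions suggest, aggregation is a form of summary analysis, much like building a report. Report-style analysis relies on a few familiar operations: averages, maximums, and grouping by a field and then counting the records in each group. ES supports three kinds of aggregations to cover these needs.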

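  • Metric aggregations: calculate metrics, such as a sum or an average, from field values.
  • Bucket aggregations: group documents into buckets (also called bins) based on field values, ranges, or other criteria. You can think of this as the counterpart of count ... group by field1 in MySQL.
  • Pipeline aggregations: take their input from other aggregations rather than from documents or fields. This kind is a little harder than the first two. Metric and bucket aggregations analyze the matched documents directly, whereas a pipeline aggregation works on the results those aggregations produce. It is an aggregation built on top of other aggregations.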
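
To make the three kinds concrete, here is a minimal pure-Python analogy. It does not call Elasticsearch at all. The documents, the helper names `metric_avg` and `bucket_by`, and the choice of "highest average price" as the pipeline step are all assumptions made for illustration. In Elasticsearch the corresponding building blocks would be along the lines of the `avg`, `terms`, and `max_bucket` aggregations.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical clothing documents, reduced to two fields for illustration.
docs = [
    {"category": "T-shirt", "price": 19.99},
    {"category": "T-shirt", "price": 24.99},
    {"category": "Jeans", "price": 49.99},
    {"category": "Jeans", "price": 59.99},
    {"category": "Socks", "price": 4.99},
    {"category": "Socks", "price": 7.99},
]


def metric_avg(documents, field):
    """Metric-style step: reduce one field across documents to a single number."""
    return mean(d[field] for d in documents)


def bucket_by(documents, field):
    """Bucket-style step: group documents by the value of one field."""
    buckets = defaultdict(list)
    for d in documents:
        buckets[d[field]].append(d)
    return dict(buckets)


# Bucket: group by category, then count the documents in each bucket.
buckets = bucket_by(docs, "category")
doc_counts = {key: len(members) for key, members in buckets.items()}

# Metric inside each bucket: average price per category.
avg_price_per_bucket = {
    key: metric_avg(members, "price") for key, members in buckets.items()
}

# Pipeline-style step: the input is the per-bucket averages computed above,
# not the documents themselves. Here we pick the bucket with the highest average.
top_category = max(avg_price_per_bucket, key=avg_price_per_bucket.get)

print(doc_counts)
print(avg_price_per_bucket)
print(top_category)
```

The point of the last step is that `top_category` never looks at `docs`. It only reads the output of the earlier steps, which is what sets a pipeline aggregation apart from the other two.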
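
Next we will go through the three kinds of aggregations one by one, but first we need some data to work with. We will build a clothing index with the fields category, name, price, brand, description, and place of origin, and load 20 documents into it. Generating the sample data is a job you can hand straight to an LLM. For example:
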
PUT /clothes
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      "category":{
        "type": "keyword"
      },
      "name":{
        "type": "keyword",
        "fields": {
          "name_text":{
            "type": "keyword"
          }
        }
      },
      "price":{
        "type": "double"
      },
      "brand":{
        "type": "keyword"
      },
      "desc":{
        "type": "text"
      },
      "place_of_origin":{
        "type": "keyword"
      }
    }
  }
}

POST _bulk
{ "index" : { "_index" : "clothes", "_id" : "1" } }
{ "category" : "T-shirt", "name" : "Cotton T-shirt", "price" : 19.99, "brand" : "Brand A", "desc" : "A basic cotton T-shirt for everyday wear.", "place_of_origin" : "China" }
{ "index" : { "_index" : "clothes", "_id" : "2" } }
{ "category" : "Jeans", "name" : "Slim-fit jeans", "price" : 49.99, "brand" : "Brand B", "desc" : "Durable jeans with a slim fit.", "place_of_origin" : "Vietnam" }
{ "index" : { "_index" : "clothes", "_id" : "3" } }
{ "category" : "Dress", "name" : "Evening gown", "price" : 89.99, "brand" : "Brand C", "desc" : "An elegant evening gown for special occasions.", "place_of_origin" : "Italy" }
{ "index" : { "_index" : "clothes", "_id" : "4" } }
{ "category" : "Jacket", "name" : "Leather jacket", "price" : 129.99, "brand" : "Brand D", "desc" : "A stylish men's leather jacket.", "place_of_origin" : "USA" }
{ "index" : { "_index" : "clothes", "_id" : "5" } }
{ "category" : "Sweater", "name" : "Wool sweater", "price" : 39.99, "brand" : "Brand E", "desc" : "A warm wool sweater for winter.", "place_of_origin" : "Australia" }
{ "index" : { "_index" : "clothes", "_id" : "6" } }
{ "category" : "Skirt", "name" : "Pencil skirt", "price" : 29.99, "brand" : "Brand F", "desc" : "A classic pencil skirt for the office.", "place_of_origin" : "UK" }
{ "index" : { "_index" : "clothes", "_id" : "7" } }
{ "category" : "Shorts", "name" : "Casual shorts", "price" : 14.99, "brand" : "Brand G", "desc" : "Comfortable casual shorts for summer.", "place_of_origin" : "China" }
{ "index" : { "_index" : "clothes", "_id" : "8" } }
{ "category" : "Blouse", "name" : "Silk shirt", "price" : 59.99, "brand" : "Brand H", "desc" : "A soft silk shirt for women.", "place_of_origin" : "France" }
{ "index" : { "_index" : "clothes", "_id" : "9" } }
{ "category" : "Coat", "name" : "Winter coat", "price" : 199.99, "brand" : "Brand I", "desc" : "A thick winter coat for cold weather.", "place_of_origin" : "Canada" }
{ "index" : { "_index" : "clothes", "_id" : "10" } }
{ "category" : "Socks", "name" : "Cotton socks", "price" : 4.99, "brand" : "Brand J", "desc" : "A pack of comfortable cotton socks.", "place_of_origin" : "China" }
{ "index" : { "_index" : "clothes", "_id" : "11" } }
{ "category" : "T-shirt", "name" : "Graphic T-shirt", "price" : 24.99, "brand" : "Brand K", "desc" : "A T-shirt with a cool graphic.", "place_of_origin" : "Japan" }
{ "index" : { "_index" : "clothes", "_id" : "12" } }
{ "category" : "Jeans", "name" : "Ripped jeans", "price" : 59.99, "brand" : "Brand L", "desc" : "Jeans with stylish rips.", "place_of_origin" : "USA" }
{ "index" : { "_index" : "clothes", "_id" : "13" } }
{ "category" : "Dress", "name" : "Casual dress", "price" : 79.99, "brand" : "Brand M", "desc" : "A comfortable dress for everyday wear.", "place_of_origin" : "China" }
{ "index" : { "_index" : "clothes", "_id" : "14" } }
{ "category" : "Jacket", "name" : "Windbreaker", "price" : 69.99, "brand" : "Brand N", "desc" : "A lightweight windbreaker jacket.", "place_of_origin" : "Germany" }
{ "index" : { "_index" : "clothes", "_id" : "15" } }
{ "category" : "Sweater", "name" : "Knit sweater", "price" : 44.99, "brand" : "Brand O", "desc" : "A hand-knitted sweater.", "place_of_origin" : "UK" }
{ "index" : { "_index" : "clothes", "_id" : "16" } }
{ "category" : "Skirt", "name" : "Pleated skirt", "price" : 34.99, "brand" : "Brand P", "desc" : "A stylish pleated skirt.", "place_of_origin" : "China" }
{ "index" : { "_index" : "clothes", "_id" : "17" } }
{ "category" : "Shorts", "name" : "Denim shorts", "price" : 19.99, "brand" : "Brand Q", "desc" : "Denim shorts for casual wear.", "place_of_origin" : "USA" }
{ "index" : { "_index" : "clothes", "_id" : "18" } }
{ "category" : "Blouse", "name" : "Linen shirt", "price" : 54.99, "brand" : "Brand R", "desc" : "A lightweight linen shirt for summer.", "place_of_origin" : "Italy" }
{ "index" : { "_index" : "clothes", "_id" : "19" } }
{ "category" : "Coat", "name" : "Windbreaker", "price" : 149.99, "brand" : "Brand S", "desc" : "A classic windbreaker.", "place_of_origin" : "UK" }
{ "index" : { "_index" : "clothes", "_id" : "20" } }
{ "category" : "Socks", "name" : "Wool socks", "price" : 7.99, "brand" : "Brand T", "desc" : "Thick wool socks for winter.", "place_of_origin" : "Australia" }

Our data is now in place. We can modify it later as the examples require.
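
If you would rather regenerate or adjust the bulk payload with a script than edit it by hand, something like the following sketch could work. The helper name `build_bulk_body` and the trimmed two-field sample documents are assumptions for illustration. The sketch only builds the request body text: it pairs each document with an `index` action line, assigns sequential ids starting at 1, and ends the body with a newline, which the bulk API expects. Sending the body to the cluster is left out.

```python
import json


def build_bulk_body(index_name, documents):
    """Build an NDJSON bulk body: one action line, then one source line, per document."""
    lines = []
    for doc_id, source in enumerate(documents, start=1):
        action = {"index": {"_index": index_name, "_id": str(doc_id)}}
        # ensure_ascii=False keeps any non-ASCII field values readable.
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body must be newline-delimited and end with a final newline.
    return "\n".join(lines) + "\n"


# Hypothetical, trimmed documents; real ones would carry all six fields.
sample_docs = [
    {"category": "T-shirt", "price": 19.99},
    {"category": "Jeans", "price": 49.99},
]

body = build_bulk_body("clothes", sample_docs)
print(body, end="")
```

Keeping the documents in a plain list makes later changes, such as adding a field or tweaking prices, a matter of editing data rather than hand-maintaining paired lines.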

With that, the preparation is done. Next we will start working with the aggregations themselves.
