A Guide to Using the Elasticsearch nested Type

Characteristics of the nested type

  • Can store arrays of objects
  • Each object is indexed as a separate hidden sub-document

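Differences between nested and object, and when to use each

  • nested

    • Each object is indexed as a separate hidden sub-document
    • Suited to storing arrays of objects
    • Queries need the dedicated nested query syntax, and fields must be referenced by their full path (object.field)
    • Slightly slower, because the hidden sub-documents have to be maintained
  • object

    • Stored flattened: the same field across the objects in an array is merged into a single array of values
    • Suited to storing a single-level JSON object
    • Queries only need to reference object.field
    • Faster to query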
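
The practical consequence of the flattening described above is easiest to see with a small simulation. The sketch below is plain Python, not Elasticsearch code: the helper names (flatten_object_field, object_match, nested_match) are made up for illustration, and it only models exact-value matching. It shows how merged arrays let conditions match across different objects, while per-object matching does not.

```python
# Toy model of how object and nested fields behave at query time.
# Plain Python for illustration only; none of these helpers exist in Elasticsearch.

def flatten_object_field(path, objects):
    """Mimic object storage: merge each sub-field across the whole array."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat

def object_match(flat, conditions):
    """Each condition is checked against the merged values independently."""
    return all(value in flat.get(field, []) for field, value in conditions.items())

def nested_match(path, objects, conditions):
    """All conditions must hold within one and the same array element."""
    prefix_len = len(path) + 1  # strip "users." from "users.username"
    return any(
        all(obj.get(field[prefix_len:]) == value for field, value in conditions.items())
        for obj in objects
    )

users = [
    {"username": "user3", "age": 28, "sex": "male"},
    {"username": "user4", "age": 22, "sex": "female"},
]
# user3 is male, so no single user satisfies both conditions.
conditions = {"users.username": "user3", "users.sex": "female"}

flat = flatten_object_field("users", users)
print(flat)
print(object_match(flat, conditions))           # cross-object match
print(nested_match("users", users, conditions)) # per-object match
```

In this model the object-style check reports a match because "user3" and "female" both appear somewhere in the merged arrays, while the nested-style check does not, since no single user has both values.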

How to use the nested type

Index mapping

"properties" : {
       "create_time" : {
         "format" : "yyyy-MM-dd HH:mm:ss Z||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd HH:mm:ss.SSS Z||yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd HH:mm:ss,SSS||yyyy/MM/dd HH:mm:ss||yyyy-MM-dd HH:mm:ss,SSS Z||yyyy/MM/dd HH:mm:ss,SSS Z||epoch_millis||yyyy-MM-dd",
         "index" : true,
         "type" : "date",
         "doc_values" : true
       },
       "title" : {
         "index" : true,
         "type" : "text"
       },
       "users" : {
         "type" : "nested",
         "properties" : {
           "sex" : {
             "type" : "keyword"
           },
           "age" : {
             "type" : "integer"
           },
           "username" : {
             "type" : "keyword"
           }
         }
       }
     }

Inserting data

POST cn_taoym_json_to_nested/_bulk
{"index":{}}
{"create_time":"2023-10-01 10:00:00","title":"第一条数据","users":[{"username":"user1","age":25,"sex":"male"}]}
{"index":{}}
{"create_time":"2023-10-02 14:30:00","title":"第二条数据","users":[{"username":"user2","age":30,"sex":"female"}]}
{"index":{}}
{"create_time":"2023-10-03 09:15:00","title":"第三条数据","users":[{"username":"user3","age":28,"sex":"male"},{"username":"user4","age":22,"sex":"female"}]}
{"index":{}}
{"create_time":"1696300800000","title":"第四条数据(时间戳格式)","users":[{"username":"user5","age":35,"sex":"male"}]}
{"index":{}}
{"create_time":"2023/10/05 16:45:00","title":"第五条数据(不同日期格式)","users":[{"username":"user6","age":27,"sex":"female"}]}
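
The _bulk body above is newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line. If the documents come from application code, the body could be assembled along these lines. This is a minimal sketch using only the standard library; build_bulk_body is a hypothetical helper, and sending the result to the cluster is left out.

```python
import json

def build_bulk_body(docs):
    """Build an NDJSON body for _bulk: an action line, then the document, per entry."""
    lines = []
    for doc in docs:
        # Empty index action: the target index comes from the request URL
        # and the document id is auto-generated.
        lines.append(json.dumps({"index": {}}, separators=(",", ":")))
        # Each document must stay on a single line.
        lines.append(json.dumps(doc, ensure_ascii=False, separators=(",", ":")))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"

docs = [
    {"create_time": "2023-10-01 10:00:00", "title": "第一条数据",
     "users": [{"username": "user1", "age": 25, "sex": "male"}]},
    {"create_time": "2023-10-03 09:15:00", "title": "第三条数据",
     "users": [{"username": "user3", "age": 28, "sex": "male"},
               {"username": "user4", "age": 22, "sex": "female"}]},
]

body = build_bulk_body(docs)
print(body, end="")
```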

Querying for documents whose username is user3

GET cn_taoym_json_to_nested/_search
{
  "query": {
    "nested": {
      "path": "users",
      "query": {
        "term": {
          "users.username": {
            "value": "user3"
          }
        }
      },
      "inner_hits": {}
    }
  }
}

Result set

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.9808291,
    "hits" : [ {
      "_index" : "cn_taoym_json_to_nested",
      "_type" : "_doc",
      "_id" : "EKfHp5gBCQbF-O0GPRhA",
      "_score" : 0.9808291,
      "_source" : {
        "create_time" : "2023-10-03 09:15:00",
        "title" : "第三条数据",
        "users" : [ {
          "sex" : "male",
          "age" : 28,
          "username" : "user3"
        }, {
          "sex" : "female",
          "age" : 22,
          "username" : "user4"
        } ]
      },
      "inner_hits" : {
        "users" : {
          "hits" : {
            "hits" : [ {
              "_index" : "cn_taoym_json_to_nested",
              "_type" : "_doc",
              "_source" : {
                "sex" : "male",
                "age" : 28,
                "username" : "user3"
              },
              "_id" : "EKfHp5gBCQbF-O0GPRhA",
              "_nested" : {
                "field" : "users",
                "offset" : 0
              },
              "_score" : 0.9808291
            } ],
            "total" : 1,
            "max_score" : 0.9808291
          }
        }
      }
    } ]
  }
}

"inner_hits": {}的作用是将嵌套查询中复合预期的数据单独收集起来。source里面存储的是原数据,里面自然包含所有数据的