Elasticsearch 04: Advanced Operations
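1. Adding documents
(1) Generating the document ID
- If no ID is specified, Elasticsearch generates one automatically. The ID is the unique key of a document within an index.
- Syntax: POST /index/_doc
# Create the index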
PUT testid
# By default the ID is generated automatically
POST /testid/_doc
{
"test_field": "test"
}
# Response
{
"_index" : "testid",
"_type" : "_doc",
"_id" : "pyxXs5MBW--orQEFRZhH",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
- Use case: when data imported into ES should keep an externally defined ID, specify that ID when adding the document.
- Syntax: POST /index/_doc/id
# Specify the ID
POST /testid/_doc/1
{
"test_field": "testid"
}
# Response: _id is the specified value, 1
{
"_index" : "testid",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
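The two request shapes above differ only in the path. The following minimal sketch makes that explicit; `build_index_request` is a hypothetical helper written for this illustration, not part of any Elasticsearch client, and it only builds the request without sending it.

```python
import json


def build_index_request(index, document, doc_id=None):
    """Return (method, path, body) for indexing one document.

    Hypothetical helper: it mirrors the two console requests shown above.
    """
    if doc_id is None:
        # No ID given: Elasticsearch generates one.
        path = f"/{index}/_doc"
    else:
        # Externally defined ID goes into the path.
        path = f"/{index}/_doc/{doc_id}"
    return "POST", path, json.dumps(document)


print(build_index_request("testid", {"test_field": "test"}))
print(build_index_request("testid", {"test_field": "testid"}, doc_id=1))
```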
2. Querying documents by condition
(1) Preparing the data
# Create the index
PUT teacher
POST /teacher/_doc/001
{
"name":"zhangsan",
"nickname":"zhangsan",
"sex":"男",
"age":30
}
POST /teacher/_doc/002
{
"name":"lisi",
"nickname":"lisi",
"sex":"男",
"age":20
}
POST /teacher/_doc/003
{
"name":"wangwu1",
"nickname":"wangwu2",
"sex":"女",
"age":40
}
POST /teacher/_doc/004
{
"name":"zhaoliu1",
"nickname":"zhaoliu2",
"sex":"女",
"age":50
}
POST /teacher/_doc/005
{
"name":"zhangsan2",
"nickname":"zhangsan2",
"sex":"女",
"age":30
}
POST /teacher/_doc/006
{
"name":"zhangsan222",
"nickname":"zhangsan222",
"sex":"女",
"age":30
}
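(2) Match all documents (match_all)
- ES provides a powerful way to search data called Query DSL. Query DSL sends a JSON request body to ES over the REST API, and its rich syntax makes searches both more expressive and more concise.
- The _search endpoint runs a query.
# Request
GET /teacher/_search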
{
"query": {
"match_all": {}
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : 1.0,
"_source" : {
"name" : "lisi",
"nickname" : "lisi",
"sex" : "男",
"age" : 20
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "003",
"_score" : 1.0,
"_source" : {
"name" : "wangwu1",
"nickname" : "wangwu2",
"sex" : "女",
"age" : 40
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "004",
"_score" : 1.0,
"_source" : {
"name" : "zhaoliu1",
"nickname" : "zhaoliu2",
"sex" : "女",
"age" : 50
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "006",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan222",
"nickname" : "zhangsan222",
"sex" : "女",
"age" : 30
}
}
]
}
}
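- Fields in the response:
  - took: how long the query took to execute, in milliseconds (0 ms in this example).
  - timed_out: whether the query timed out; false means it did not.
  - _shards: information about the shards involved.
    - total: number of shards the query was sent to.
    - successful: number of shards that processed the query successfully.
    - skipped: number of shards that were skipped.
    - failed: number of shards that failed.
  - hits: information about the results.
    - total: number of matching documents. value is the count and relation says how to read it; eq here means the count is exact.
    - max_score: highest relevance score among the matching documents (1.0 here).
    - hits: the matching documents themselves, each with these fields:
      - _index: name of the index that holds the document.
      - _type: the document type (mapping types are deprecated as of Elasticsearch 7.x).
      - _id: unique identifier of the document.
      - _score: relevance score of the document.
      - _source: the original document, with all of its fields and values.
- Each element of the hits array is one document: its metadata (index name, type, ID, score) plus its content in _source. This example returns 6 documents, each with the fields name, nickname, sex, and age.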
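As a rough illustration of how a client might read these fields, the sketch below walks a trimmed copy of the response as a plain Python dict. The `summarize` function and its output keys are names invented for this example.

```python
# Trimmed copy of a _search response, as parsed JSON.
response = {
    "took": 0,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_index": "teacher", "_id": "001", "_score": 1.0,
             "_source": {"name": "zhangsan", "age": 30}},
            {"_index": "teacher", "_id": "002", "_score": 1.0,
             "_source": {"name": "lisi", "age": 20}},
        ],
    },
}


def summarize(resp):
    """Pull the commonly needed pieces out of a search response."""
    total = resp["hits"]["total"]
    return {
        "total": total["value"],
        # "eq" means the count is exact rather than a lower bound.
        "total_is_exact": total["relation"] == "eq",
        "healthy": not resp["timed_out"] and resp["_shards"]["failed"] == 0,
        # Pair each document ID with its original content.
        "docs": [(hit["_id"], hit["_source"]) for hit in resp["hits"]["hits"]],
    }


print(summarize(response))
```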
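(3) Full-text query (match)
The match query runs a full-text search on a single field. It processes the query text with the field's analyzer and then finds documents that contain the resulting tokens.
- Only fields of type text are analyzed (tokenized) this way.
# Request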
GET /teacher/_search
{
"query": {
"match": {
"name": "zhangsan"
}
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
}
]
}
}
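The single hit above follows from tokenization: "zhangsan2" is indexed as one token, which is not equal to the token "zhangsan". The sketch below imitates this with a regular expression. It is only a rough stand-in for the standard analyzer (an assumption made for illustration), and `analyze` and `match` are hypothetical names.

```python
import re


def analyze(text):
    """Very rough stand-in for the standard analyzer: lowercase, split into word tokens."""
    return re.findall(r"\w+", text.lower())


def match(values, query):
    """Return the values that share at least one token with the query."""
    query_tokens = set(analyze(query))
    return [value for value in values if query_tokens & set(analyze(value))]


names = ["zhangsan", "lisi", "wangwu1", "zhaoliu1", "zhangsan2", "zhangsan222"]
print(analyze("zhangsan2"))
print(match(names, "zhangsan"))
```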
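(4) Exact-value query (term)
The term query matches exact values and does not analyze the query text. It is meant for exact-value fields such as keyword, numeric, and date fields. On a text field it is compared against the indexed tokens, which is why the request below still matches: "zhangsan" is indexed as a single lowercase token.
# Request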
GET /teacher/_search
{
"query": {
"term": {
"name": {
"value": "zhangsan"
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
}
]
}
}
(5) Range query (range)
# Request
GET /teacher/_search
{
"query": {
"range": {
"age": {
"gte": 30,
"lte": 30
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "006",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan222",
"nickname" : "zhangsan222",
"sex" : "女",
"age" : 30
}
}
]
}
}
(6) Prefix query (prefix)
# Request
GET /teacher/_search
{
"query": {
"prefix": {
"name": {
"value": "zhangsan"
}
}
}
}
# Response
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "006",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan222",
"nickname" : "zhangsan222",
"sex" : "女",
"age" : 30
}
}
]
}
}
(7) Wildcard query (wildcard)
# Request
GET /teacher/_search
{
"query": {
"wildcard": {
"name": {
"value": "*san*"
}
}
}
}
# Response
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "006",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan222",
"nickname" : "zhangsan222",
"sex" : "女",
"age" : 30
}
}
]
}
}
(8) Query by multiple IDs (ids)
# Request
GET /teacher/_search
{
"query": {
"ids": {
"values": ["001","003","005"]
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "003",
"_score" : 1.0,
"_source" : {
"name" : "wangwu1",
"nickname" : "wangwu2",
"sex" : "女",
"age" : 40
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
}
]
}
}
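(9) Fuzzy query (fuzzy)
- The fuzzy query finds documents containing terms that are similar to the search term, even if they are not identical. It is useful for approximate input, for example typos or spelling variants of the same word.
- Key points about the fuzzy query:
  - Edit distance: similarity is measured by Levenshtein edit distance, the number of single-character changes needed to turn one term into another. A change is replacing, deleting, or inserting a character, or swapping two adjacent characters.
  - Creating variations: the query builds the possible variations of the search term within the allowed edit distance and returns the exact matches for each variation.
  - Parameters:
    - fuzziness: maximum edit distance allowed; defaults to AUTO. Accepts AUTO or an integer number of edits.
    - max_expansions: maximum number of variations created; defaults to 50. High values can hurt performance because many variations are checked.
    - prefix_length: number of leading characters left unchanged when creating variations; defaults to 0.
    - transpositions: whether swapping two adjacent characters counts as one edit; defaults to true.
    - rewrite: method used to rewrite the query; when fuzziness is not 0, the default is top_terms_blended_freqs_${max_expansions}.
  - Performance: a fuzzy query can generate many variations, especially with a high max_expansions and a prefix_length of 0, which slows the query down. Tune these parameters with care.
  - When to use it: fuzzy is the fuzzy counterpart of the term query. It is rarely used directly, but knowing how it works helps when enabling fuzzy matching in the higher-level match query.
  - No analysis: unlike match, fuzzy is a term-level query. It does not analyze the search term; it looks up the term dictionary directly for all terms within the given fuzziness.
# Request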
GET /teacher/_search
{
"query": {
"fuzzy": {
"name": "zhangsan"
}
}
}
# Response
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "005",
"_score" : 1.3478894,
"_source" : {
"name" : "zhangsan2",
"nickname" : "zhangsan2",
"sex" : "女",
"age" : 30
}
}
]
}
}
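The response above contains zhangsan2 but not zhangsan222. A small edit-distance calculation shows why. The sketch below is an independent illustration, not Elasticsearch's implementation: `edit_distance` counts insertions, deletions, substitutions, and adjacent transpositions, and `auto_fuzziness` follows the documented AUTO rule (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two).

```python
def edit_distance(a, b):
    """Edit distance with insert, delete, substitute, and adjacent-swap operations."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # delete
                          d[i][j - 1] + 1,         # insert
                          d[i - 1][j - 1] + cost)  # substitute or keep
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # swap neighbours
    return d[rows - 1][cols - 1]


def auto_fuzziness(term):
    """Edits allowed by fuzziness AUTO for a term of this length."""
    if len(term) <= 2:
        return 0
    return 1 if len(term) <= 5 else 2


def fuzzy_match(terms, query):
    """Return the terms within the AUTO edit distance of the query."""
    limit = auto_fuzziness(query)
    return [t for t in terms if edit_distance(query, t) <= limit]


names = ["zhangsan", "lisi", "wangwu1", "zhaoliu1", "zhangsan2", "zhangsan222"]
print(fuzzy_match(names, "zhangsan"))
```

"zhangsan" has 8 characters, so AUTO allows two edits. zhangsan2 is one insertion away and matches; zhangsan222 needs three insertions and does not.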
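(10) Compound query (bool)
- must: like && (AND); every clause must match.
- should: like || (OR); at least one clause must match when there is no must or filter clause. Next to a must clause, as in the request below, should clauses are optional and only raise the score of the documents that match them.
- must_not: like ! (NOT); a document that matches any of these clauses is excluded.
# Request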
GET /teacher/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "zhangsan"
}
}
],
"must_not": [
{
"match": {
"age": "40"
}
}
],
"should": [
{
"match": {
"sex": "男"
}
}
]
}
}
}
# Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.5700645,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 2.5700645,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
}
]
}
}
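The way the three clause types combine can be imitated over the sample data in plain Python. This is a simplified sketch: `bool_filter` is a hypothetical function, clauses are ordinary predicates, and the count of matching should clauses stands in for the score boost, which Elasticsearch computes quite differently.

```python
TEACHERS = [
    {"name": "zhangsan", "sex": "男", "age": 30},
    {"name": "lisi", "sex": "男", "age": 20},
    {"name": "wangwu1", "sex": "女", "age": 40},
    {"name": "zhaoliu1", "sex": "女", "age": 50},
    {"name": "zhangsan2", "sex": "女", "age": 30},
    {"name": "zhangsan222", "sex": "女", "age": 30},
]


def bool_filter(docs, must=(), must_not=(), should=()):
    """Return (name, matched_should_count) for every document that passes."""
    results = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # every must clause is required
        if any(clause(doc) for clause in must_not):
            continue  # any must_not match excludes the document
        matched_should = sum(1 for clause in should if clause(doc))
        if should and not must and matched_should == 0:
            continue  # without must, at least one should clause is required
        results.append((doc["name"], matched_should))
    return results


hits = bool_filter(
    TEACHERS,
    must=[lambda d: d["name"] == "zhangsan"],
    must_not=[lambda d: d["age"] == 40],
    should=[lambda d: d["sex"] == "男"],
)
print(hits)
```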
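(11) Multi-field query (multi_match)
The multi_match query runs a full-text search across several fields. Like match, it analyzes the query text, but it searches for that text in every listed field.
# Request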
GET /teacher/_search
{
"query": {
"multi_match": {
"query": "zhangsan",
"fields": ["name","nickname"]
}
}
}
# Response
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
}
]
}
}
(12) Multi-value query (terms)
- The terms query works like term but accepts several values to match, similar to IN in SQL.
# Request
GET /teacher/_search
{
"query": {
"terms": {
"name": ["zhangsan","lisi"]
}
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : 1.0,
"_source" : {
"name" : "lisi",
"nickname" : "lisi",
"sex" : "男",
"age" : 20
}
}
]
}
}
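(13) Complex query (query_string)
- The query_string query is a powerful full-text query that builds a complex search from a single query string. The string can name several fields and use boolean operators (AND, OR, NOT), wildcards, and more. It is flexible enough to express what other query types such as match and bool do.
- Key points about query_string:
  - Query syntax:
    - Boolean operators: AND, OR, NOT.
    - Wildcards: ? matches a single character and * matches zero or more characters; a trailing * gives a prefix search.
    - Phrase search: wrap the phrase in double quotes "".
    - Regular expressions: wrap the expression in forward slashes /.
  - Field selection: the query string can target one or more fields.
  - Analyzer: by default the query text is processed with the search analyzer of the target field; the analyzer parameter overrides it.
  - Default operator: the default boolean operator is OR, so terms without an explicit operator are combined with OR. Set default_operator to AND to require all of them.
  - Escaping: to use a reserved character literally in a field name or value, escape it with a backslash \.
  - Performance: because of its complexity, query_string can cost more than a simple match query, so use it with care in performance-sensitive applications.
# Request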
GET /teacher/_search
{
"query": {
"query_string": {
"query": "name:zhansan OR sex:男"
}
}
}
# Response
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0296195,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0296195,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : 1.0296195,
"_source" : {
"name" : "lisi",
"nickname" : "lisi",
"sex" : "男",
"age" : 20
}
}
]
}
}
- Using an analyzer: the following query processes the query string with the standard analyzer.
# Request
GET /teacher/_search
{
"query": {
"query_string": {
"query": "name:zhangsan",
"analyzer": "standard"
}
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
}
]
}
}
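Because terms without an explicit operator are combined with OR unless default_operator says otherwise, it can help to set that parameter explicitly. The sketch below only assembles a request body; `query_string_body` is a hypothetical helper, while query, default_field, and default_operator are parameters of the query_string query.

```python
import json


def query_string_body(query, default_field=None, default_operator="OR"):
    """Build a _search request body around a query_string query."""
    if default_operator not in ("OR", "AND"):
        raise ValueError("default_operator must be 'OR' or 'AND'")
    params = {"query": query, "default_operator": default_operator}
    if default_field is not None:
        # Field searched when the query string names none.
        params["default_field"] = default_field
    return {"query": {"query_string": params}}


# Require both terms in the name field instead of either one.
body = query_string_body("zhangsan lisi", default_field="name", default_operator="AND")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET /teacher/_search.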
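3. Processing query results
(1) Highlighting results (highlight)
- Highlighting marks the text fragments in the search results that match the query, so users can spot the relevant part quickly.
- Details:
  - Highlight parameters:
    - Elasticsearch offers several highlight settings, including per-field settings, custom tags, and fragment size.
    - By default, matched keywords are wrapped in <em></em> tags.
  - Custom highlight tags:
    - The default <em></em> tags can be replaced, for example with <b></b>.
    - The pre_tags and post_tags parameters set the tags placed before and after each highlighted match.
  - Highlighting multiple fields:
    - Highlighting can apply to one or more fields; add each field name under fields in the highlight section.
  - Performance:
    - The highlighter needs the actual field content. If the field is not stored (store is not set to true in the mapping), the value is loaded from _source and the field is extracted from it.
- Highlight query examples:
# Request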
GET /teacher/_search
{
"query": {
"term": {
"name": {
"value": "zhangsan"
}
}
},
"highlight": {
"fields": {
"*":{}
}
}
}
# Response
{
"took" : 33,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
},
"highlight" : {
"name" : [
"<em>zhangsan</em>"
]
}
}
]
}
}
# Request
GET /teacher/_search
{
"query": {
"term": {
"name": {
"value": "zhangsan"
}
}
},
"highlight": {
"pre_tags": "<font color='red'>",
"post_tags": "</font>",
"fields": {
"name": {}
}
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.540445,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.540445,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
},
"highlight" : {
"name" : [
"<font color='red'>zhangsan</font>"
]
}
}
]
}
}
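The effect of pre_tags and post_tags can be imitated with a few lines of Python. This is only a sketch of the idea, not how Elasticsearch's highlighters work internally; `highlight` is a hypothetical function that wraps whole-word occurrences of one term.

```python
import re


def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap every whole-word occurrence of term in the given tags."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)


# Default tags, then custom tags as in the second request above.
print(highlight("zhangsan", "zhangsan"))
print(highlight("zhangsan", "zhangsan", "<font color='red'>", "</font>"))
```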
(2) Limiting the number of results (size)
# Request
GET /teacher/_search
{
"query": {
"match_all": {}
},
"size": 2
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : 1.0,
"_source" : {
"name" : "lisi",
"nickname" : "lisi",
"sex" : "男",
"age" : 20
}
}
]
}
}
(3) Pagination (from)
- size: how many documents to return
- from: offset of the first document to return
- size and from together implement pagination
# Request
GET /teacher/_search
{
"query": {
"match_all": {}
},
"size": 2,
"from": 2
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "003",
"_score" : 1.0,
"_source" : {
"name" : "wangwu1",
"nickname" : "wangwu2",
"sex" : "女",
"age" : 40
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "004",
"_score" : 1.0,
"_source" : {
"name" : "zhaoliu1",
"nickname" : "zhaoliu2",
"sex" : "女",
"age" : 50
}
}
]
}
}
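Applications usually think in page numbers rather than offsets. The sketch below converts a 1-based page number into from and size and imitates the slice on the sample IDs; `page_params` is a hypothetical helper. Note that by default Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting), so deep pagination needs another approach.

```python
def page_params(page, page_size):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {"from": (page - 1) * page_size, "size": page_size}


ids = ["001", "002", "003", "004", "005", "006"]
params = page_params(page=2, page_size=2)
# Same slice the request above returns: skip 2 documents, take 2.
page_ids = ids[params["from"]:params["from"] + params["size"]]
print(params, page_ids)
```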
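(4) Sorting by a field (sort)
- By default, text fields cannot be used for operations that need per-document field data, such as aggregations and sorting. A text field is analyzed, and what gets indexed is its tokens rather than the complete original string.
# Request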
GET /teacher/_search
{
"query": {
"terms": {
"name":["zhangsan","lisi"]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
# Response
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : null,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan",
"sex" : "男",
"age" : 30
},
"sort" : [
30
]
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : null,
"_source" : {
"name" : "lisi",
"nickname" : "lisi",
"sex" : "男",
"age" : 20
},
"sort" : [
20
]
}
]
}
}
(5) Selecting returned fields (_source)
- includes: the fields to return
- excludes: the fields to leave out
# Request
GET /teacher/_search
{
"query": {
"terms": {
"name":["zhangsan","lisi"]
}
},
"_source": {
"includes": ["name","nickname"]
}
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "001",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"nickname" : "zhangsan"
}
},
{
"_index" : "teacher",
"_type" : "_doc",
"_id" : "002",
"_score" : 1.0,
"_source" : {
"name" : "lisi",
"nickname" : "lisi"
}
}
]
}
}
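The includes/excludes behaviour can be pictured as a filter over each document's fields. The sketch below is a simplification with exact field names only (Elasticsearch also accepts wildcard patterns, which are not handled here); `filter_source` is a hypothetical function.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep the included fields (all if None), then drop the excluded ones."""
    kept = {k: v for k, v in source.items() if includes is None or k in includes}
    return {k: v for k, v in kept.items() if not excludes or k not in excludes}


doc = {"name": "zhangsan", "nickname": "zhangsan", "sex": "男", "age": 30}
print(filter_source(doc, includes=["name", "nickname"]))
print(filter_source(doc, excludes=["sex"]))
```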
4. Aggregation queries
(1) Maximum value (max)
- aggs: declares the aggregations
- size: 0 returns only the aggregation results, without the matching documents
# Request
GET /teacher/_search
{
"aggs":{
"max_age":{
"max":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"max_age" : {
"value" : 50.0
}
}
}
(2) Minimum value (min)
# Request
GET /teacher/_search
{
"aggs":{
"min_age":{
"min":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"min_age" : {
"value" : 20.0
}
}
}
(3) Average value (avg)
# Request
GET /teacher/_search
{
"aggs":{
"avg_age":{
"avg":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"avg_age" : {
"value" : 33.333333333333336
}
}
}
(4) Sum (sum)
# Request
GET /teacher/_search
{
"aggs":{
"sum_age":{
"sum":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"sum_age" : {
"value" : 200.0
}
}
}
(5) Distinct value count (cardinality)
# Request
GET /teacher/_search
{
"aggs":{
"cardinality_age":{
"cardinality":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 10,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"cardinality_age" : {
"value" : 4
}
}
}
(6) All basic metrics at once (stats)
# Request
GET /teacher/_search
{
"aggs":{
"stats_age":{
"stats":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"stats_age" : {
"count" : 6,
"min" : 20.0,
"max" : 50.0,
"avg" : 33.333333333333336,
"sum" : 200.0
}
}
}
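The metric aggregations above can be cross-checked by computing the same numbers over the six sample ages in plain Python. This is only a sanity check on the sample data, not a picture of how Elasticsearch computes them (cardinality in particular is an approximate count on large data sets).

```python
# Ages of documents 001 to 006 in the teacher index.
ages = [30, 20, 40, 50, 30, 30]

stats = {
    "count": len(ages),
    "min": float(min(ages)),
    "max": float(max(ages)),
    "avg": sum(ages) / len(ages),
    "sum": float(sum(ages)),
}
# Equivalent of the cardinality aggregation on this small data set.
distinct_ages = len(set(ages))

print(stats)
print(distinct_ages)
```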
(7) Grouping (terms)
# Request
GET /teacher/_search
{
"aggs":{
"terms_age":{
"terms":{"field":"age"}
}
},
"size":0
}
# Response
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"terms_age" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 30,
"doc_count" : 3
},
{
"key" : 20,
"doc_count" : 1
},
{
"key" : 40,
"doc_count" : 1
},
{
"key" : 50,
"doc_count" : 1
}
]
}
}
}
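A terms aggregation is comparable to GROUP BY with a count. The sketch below rebuilds the buckets above from the sample ages; `terms_buckets` is a hypothetical function, and breaking ties by ascending key is an assumption chosen to match the order shown in the response.

```python
from collections import Counter


def terms_buckets(values):
    """Group equal values and order buckets by doc_count descending, then key ascending."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


ages = [30, 20, 40, 50, 30, 30]
print(terms_buckets(ages))
```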
(8) Grouping with a per-group metric (terms + aggs)
# Request
GET /teacher/_search?size=0
{
"aggs": {
"group_by_field": {
"terms": {
"field": "age"
},
"aggs": {
"average_value": {
"avg": {
"field": "age"
}
}
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_field" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 30,
"doc_count" : 3,
"average_value" : {
"value" : 30.0
}
},
{
"key" : 20,
"doc_count" : 1,
"average_value" : {
"value" : 20.0
}
},
{
"key" : 40,
"doc_count" : 1,
"average_value" : {
"value" : 40.0
}
},
{
"key" : 50,
"doc_count" : 1,
"average_value" : {
"value" : 50.0
}
}
]
}
}
}
(9) Grouping with a metric, sorted (terms + aggs + order)
GET /teacher/_search?size=0
{
"aggs": {
"group_by_field": {
"terms": {
"field": "age",
"order": {"average_value": "desc"}
},
"aggs": {
"average_value": {
"avg": {
"field": "age"
}
}
}
}
}
}
# Response
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_field" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 50,
"doc_count" : 1,
"average_value" : {
"value" : 50.0
}
},
{
"key" : 40,
"doc_count" : 1,
"average_value" : {
"value" : 40.0
}
},
{
"key" : 30,
"doc_count" : 3,
"average_value" : {
"value" : 30.0
}
},
{
"key" : 20,
"doc_count" : 1,
"average_value" : {
"value" : 20.0
}
}
]
}
}
}
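Nesting an avg aggregation inside terms and ordering the buckets by it can be imitated as follows. This is a plain-Python sketch over the sample ages; `terms_with_avg` is a hypothetical function, and the field names in its output simply mirror the response above.

```python
from collections import defaultdict


def terms_with_avg(values, descending=True):
    """Bucket equal values, attach each bucket's average, and sort by that average."""
    groups = defaultdict(list)
    for value in values:
        groups[value].append(value)
    buckets = [
        {"key": key, "doc_count": len(members), "average_value": sum(members) / len(members)}
        for key, members in groups.items()
    ]
    buckets.sort(key=lambda bucket: bucket["average_value"], reverse=descending)
    return buckets


ages = [30, 20, 40, 50, 30, 30]
for bucket in terms_with_avg(ages):
    print(bucket)
```

Since the grouping field and the averaged field are both age here, each bucket's average equals its key; averaging a different numeric field per group would be the more typical use.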