袁庭新ES系列12节 | Elasticsearch高级查询操作

前言

上篇文章讲了关于Elasticsearch的基本查询操作。接下来袁老师为大家带来Elasticsearch高级查询部分相关的内容。Elasticsearch是基于JSON提供完整的查询DSL(Domain Specific Language:领域特定语言)来定义查询。因此,我们有必要在专题模块来详细探讨Elasticsearch高级查询部分内容。

我们先来做个热身,了解下这一小节学习的目标,我将带领大家从以下五个模块来学习Elasticsearch的高级查询相关技术。

  • 结果过滤查询
  • 条件过滤查询
  • 结果排序
  • 分页查询
  • 高亮显示

一. 结果过滤查询

默认情况下,Elasticsearch在搜索的结果中,会把文档中保存在_source的所有字段都返回。

如果我们只想获取其中的部分字段,我们可以添加_source属性来进行过滤。

1.直接指定字段

演示示例:

复制代码
GET /yx/_search
{
  "_source": ["title", "price"],
  "query": {
    "term": {
      "price": 2699
    }
  }
}

语法说明:在查询结构中,通过_source属性来指定查询结果集中需要保留哪些字段信息。

响应结果:

复制代码
{
  "took": 90,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手机"
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

2.指定includes和excludes

我们也可以通过:

|----------|-------------|
| 属性 | 描述 |
| includes | 来指定想要显示的字段 |
| excludes | 来指定不想要显示的字段 |

注意:二者都是可选的。

演示示例:

复制代码
GET /yx/_search
{
  "_source": {
    "includes": ["title", "images"]
  },
  "query": {
    "term": {
      "price": 2699
    }
  }
}

响应结果:

复制代码
{
  "took": 148,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1,
        "_source": {
          "images": "http://image.yx.com/12479122.jpg",
          "title": "小米手机"
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

下面的示例与上面的结果将是一样的:

复制代码
GET /yx/_search
{
  "_source": {
    "excludes": ["price"]
  },
  "query": {
    "term": {
      "price": 2699
    }
  }
}

响应结果:

复制代码
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1,
        "_source": {
          "images": "http://image.yx.com/12479122.jpg",
          "title": "小米手机"
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

二. filter过滤

Elasticsearch使用的查询语言(DSL)拥有一套查询组件,这些组件可以以无限组合的方式进行搭配。这套组件可以在以下两种情况下使用:过滤情况(filtering context)和查询情况(query context)。

如何选择查询与过滤?通常的规则是,使用查询(query)语句来进行全文搜索或者其它任何需要影响相关性得分的搜索。 除此以外的情况都使用过滤(filters)。

1.条件查询中进行过滤

所有的查询都会影响到文档的评分及排名。如果我们需要在查询结果中进行过滤,并且不希望过滤条件影响评分,那么就不要把过滤条件作为查询条件来用。而是使用filter方式:

复制代码
GET /yx/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "小米手机"
        }
      },
      "filter": {
        "range": {
          "price": {
            "gt": 2000.00, 
            "lt": 3800.00
          }
        }
      }
    }
  }
}

响应结果:

复制代码
{
  "took": 71,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.1143606,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1.1143606,
        "_source": {
          "title": "小米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2699
        }
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "title": "大米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2899
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

2.无查询条件直接过滤

如果一次查询只有过滤,没有查询条件,不希望进行评分,我们可以使用constant_score取代只有filter语句的bool查询。在性能上是完全相同的,但对于提高查询简洁性和清晰度有很大帮助。

复制代码
GET /yx/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gt": 2000.00, 
            "lt": 3800.00
          }
        }
      }
    }
  }
}

响应结果:

复制代码
{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1,
        "_source": {
          "title": "小米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2699
        }
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "大米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2899
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

三. 结果排序

1.单字段排序

sort可以让我们按照不同的字段进行排序,并且通过order属性指定排序的方式。

|--------|--------|
| 属性 | 描述 |
| asc | 升序排序 |
| desc | 降序排序 |

演示案例:

复制代码
GET /yx/_search
{
  "query": {
    "match": {
      "title": "小米手机"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

响应结果:

复制代码
{
  "took": 31,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "4",
        "_score": null,
        "_source": {
          "title": "Apple手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 6899
        },
        "sort": [
          6899
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "IPhone手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 6299,
          "stock": 200,
          "saleable": true,
          "subTitle": "IPhone 15 Pro"
        },
        "sort": [
          6299
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "5",
        "_score": null,
        "_source": {
          "title": "小米电视4A",
          "images": "http://images.com",
          "price": 3999
        },
        "sort": [
          3999
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "1",
        "_score": null,
        "_source": {
          "title": "大米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2899
        },
        "sort": [
          2899
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": null,
        "_source": {
          "title": "小米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2699
        },
        "sort": [
          2699
        ]
      }
    ]
  }
}

2.多字段排序

假定我们想要结合使用price和_score(得分)进行查询,并且匹配的结果首先按照价格排序,然后再按照相关性得分降序排序:

复制代码
GET /yx/_search
{
  "query": {
    "bool": {
      "must": { 
        "match": { 
          "title": "小米手机" 
        }
      },
      "filter": {
        "range": {
          "price": {
            "gt": 2000,
            "lt": 3000
          }
        }
      }
    }
  },
  "sort": [
    { 
      "price": { 
        "order": "desc" 
      }
    },
    { 
      "_score": { 
        "order": "desc" 
      }
    }
  ]
}

响应结果:

复制代码
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "title": "大米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2899
        },
        "sort": [
          2899,
          0.2876821
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 1.1143606,
        "_source": {
          "title": "小米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2699
        },
        "sort": [
          2699,
          1.1143606
        ]
      }
    ]
  }
}

四. 分页查询

Elasticsearch中数据都存储在分片中,当执行搜索时每个分片独立搜索后,数据再经过整合返回。那么,如果要实现分页查询该怎么办呢?

Elasticsearch的分页与MySQL数据库非常相似,都是指定两个值:

|--------|-------------------------|
| 属性 | 描述 |
| from | 目标数据的偏移值(开始位置),默认from为0 |
| size | 每页大小 |

演示案例:

复制代码
GET /yx/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ],
  "from": 1,
  "size": 3
}

响应结果:

复制代码
{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "IPhone手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 6299,
          "stock": 200,
          "saleable": true,
          "subTitle": "IPhone 15 Pro"
        },
        "sort": [
          6299
        ]
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "3",
        "_score": null,
        "_source": {
          "title": "小米电视4A",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 3899
        },
        "sort": [
          3899
        ]
      }
    ]
  }
}

五. 高亮显示

1.高亮显示原理

高亮显示的原理介绍见下:

  • 服务端搜索数据,得到搜索结果。
  • 把搜索结果中,搜索关键字都加上约定好的标签。
  • 前端页面提前写好标签的CSS样式,即可高亮显示。

Elasticsearch中实现高亮的语法比较简单,高亮显示语法格式见下:

复制代码
GET /索引库名/_search
{
  "query": {
    "match": {
      "字段": "字段值"
    }
  },
  "highlight": {
    "pre_tags": "前置标签",
    "post_tags": "后置标签",
    "fields": {
      "高亮字段名": {}
    }
  }
}

在使用match查询的同时,加上一个highlight属性。highlight属性提供以下属性:

|-----------|----------------------------|
| 属性 | 描述 |
| pre_tags | 前置标签 |
| post_tags | 后置标签 |
| fields | 需要高亮的字段(例如这里声明title字段需要高亮) |

2.高亮显示案例

演示案例:

复制代码
GET /yx/_search
{
  "query": {
    "match": {
      "title": "手机"
    }
  },
  "highlight": {
    "pre_tags": "<span>",
    "post_tags": "</span>",
    "fields": {
      "title": {}
    }
  }
}

响应结果:

复制代码
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "title": "大米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2899
        },
        "highlight": {
          "title": [
            "大米<span>手机</span>"
          ]
        }
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "lNC7KYUB35ub5htYEZMU",
        "_score": 0.13353139,
        "_source": {
          "title": "小米手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 2699
        },
        "highlight": {
          "title": [
            "小米<span>手机</span>"
          ]
        }
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "2",
        "_score": 0.13353139,
        "_source": {
          "title": "IPhone手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 6299,
          "stock": 200,
          "saleable": true,
          "subTitle": "IPhone 15 Pro"
        },
        "highlight": {
          "title": [
            "IPhone<span>手机</span>"
          ]
        }
      },
      {
        "_index": "yx",
        "_type": "goods",
        "_id": "4",
        "_score": 0.13353139,
        "_source": {
          "title": "Apple手机",
          "images": "http://image.yx.com/12479122.jpg",
          "price": 6899
        },
        "highlight": {
          "title": [
            "Apple<span>手机</span>"
          ]
        }
      }
    ]
  }
}

运行上述代码响应结果见下:

六. 结语

关于Elasticsearch高级查询篇相关的内容我们就给大家介绍完了,来复习回顾下这一章节的主要内容。本文从结果过滤查询、结果排序、分页查询、检索查询、关键字查询、高亮显示、过滤查询等几个方面通过实例讲解了Elasticsearch的高级查询。如果还没有掌握的小伙伴,一定要通过文章中大量的案例来进行实操演练从而巩固这一部分知识。下一小节我们将为大家带来Elasticsearch中聚合操作相关的内容。

今天的内容就分享到这里吧。关注「袁庭新」,干货天天都不断!

相关推荐
不辉放弃18 分钟前
java连数据库
java·mysql
心碎土豆块1 小时前
MapReduce打包运行
大数据·mapreduce
VirusVIP1 小时前
Windows CMD通过adb检查触摸屏Linux驱动是否被编译
linux·运维·adb
chennalC#c.h.JA Ptho1 小时前
ubuntu studio 系统详解
linux·运维·服务器·经验分享·ubuntu·系统安全
元6335 小时前
Spark 缓存(Caching)
大数据·spark
yt948325 小时前
Docker-基础(数据卷、自定义镜像、Compose)
运维·docker·容器
麻芝汤圆6 小时前
MapReduce 入门实战:WordCount 程序
大数据·前端·javascript·ajax·spark·mapreduce
IvanCodes7 小时前
五、Hadoop集群部署:从零搭建三节点Hadoop环境(保姆级教程)
大数据·hadoop·分布式
水银嘻嘻7 小时前
web 自动化之 KDT 关键字驱动详解
运维·自动化
富能量爆棚8 小时前
spark-local模式
大数据