Converting complex string expressions built from "()", "&", and "|" into Elasticsearch queries

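Use case

The conditions "()", "&", and "|" are familiar to all of us. Their logic is clear, and a handful of characters can express deeply nested relationships, so they show up often in complex queries. I recently ran into exactly this problem. At first I assumed a tool for it would be easy to find, but after a long search I found nothing suitable, so I decided to write one myself.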

Overview

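The tricky part of this tool is that we cannot know what expression the people operating the system will type. The format is not fixed, so they may write fairly complex logic, or they may stop after a single level of nesting. Our code therefore has to be general.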
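Since the input is free-form, it can help to reject malformed expressions before parsing them. The following is only a minimal sketch of such a check, not part of the original tool: the class name ExpressionValidator and the method firstProblem are hypothetical, and rejecting every character other than letters, digits, spaces, parentheses, "&", and "|" is an assumption of this sketch.

```java
import java.util.Optional;

/** Hypothetical helper (not part of the original tool): pre-checks an expression before parsing. */
final class ExpressionValidator {

    /** Returns a description of the first problem found, or Optional.empty() if none is found. */
    static Optional<String> firstProblem(String expr) {
        int depth = 0;                 // number of currently open parentheses
        boolean expectOperand = true;  // true at the start, after '(' and after an operator
        boolean afterClose = false;    // true right after ')', ignoring spaces

        for (int i = 0; i < expr.length(); i++) {
            char ch = expr.charAt(i);
            if (ch == ' ') {
                continue; // spaces are allowed anywhere
            }
            if (ch == '(') {
                if (!expectOperand) return Optional.of("unexpected '(' at index " + i);
                depth++;
            } else if (ch == ')') {
                if (expectOperand || depth == 0) return Optional.of("unexpected ')' at index " + i);
                depth--;
            } else if (ch == '&' || ch == '|') {
                if (expectOperand) return Optional.of("operator without a left operand at index " + i);
                expectOperand = true;
            } else if (Character.isLetterOrDigit(ch)) {
                if (afterClose) return Optional.of("term directly after ')' at index " + i);
                expectOperand = false;
            } else {
                // Assumption of this sketch: anything else is treated as an error
                return Optional.of("unsupported character '" + ch + "' at index " + i);
            }
            afterClose = (ch == ')');
        }
        if (depth > 0) return Optional.of("missing ')'");
        if (expectOperand) return Optional.of("expression is empty or ends with an operator");
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstProblem("(a|b)&c")); // well-formed
        System.out.println(firstProblem("(a|b"));    // unbalanced
    }
}
```

Calling a check like this first lets the caller return a readable message instead of letting the parser fail on an empty stack.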

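Here is a brief explanation of how it works. It relies mainly on the stack data structure in Java: the tool parses the input logical query string, uses stacks to manage the operators and operands, builds the corresponding query tree, and then converts it into an Elasticsearch search across multiple fields (such as title, abstract, and body), so complex logical conditions are parsed automatically into an executable query.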
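To see the two-stack idea in isolation, here is a small sketch that has no Elasticsearch dependency. It is an illustration only: the class name TwoStackSketch is hypothetical, operands are plain strings rather than QueryBuilder objects, and unlike the real tool it skips spaces instead of keeping them inside a term. Each combined group is printed with explicit parentheses so the resulting tree shape is visible.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical demo class: the two-stack idea with strings standing in for queries. */
final class TwoStackSketch {

    static String parse(String expr) {
        Deque<Character> operators = new ArrayDeque<>(); // holds '(', '&', '|'
        Deque<String> operands = new ArrayDeque<>();     // holds terms and already-combined groups

        for (int i = 0; i < expr.length(); i++) {
            char ch = expr.charAt(i);
            if (ch == '(' || ch == '&' || ch == '|') {
                operators.push(ch);
            } else if (ch == ')') {
                // Combine everything back to the matching '('
                while (!operators.isEmpty() && operators.peek() != '(') {
                    combine(operators, operands);
                }
                operators.pop(); // discard the '('
            } else if (Character.isLetterOrDigit(ch)) {
                // Read one whole term (isLetterOrDigit also accepts CJK characters)
                int start = i;
                while (i < expr.length() && Character.isLetterOrDigit(expr.charAt(i))) {
                    i++;
                }
                operands.push(expr.substring(start, i));
                i--; // the for loop advances once more
            }
            // Any other character, such as a space, is skipped in this sketch
        }
        while (!operators.isEmpty()) {
            combine(operators, operands);
        }
        return operands.pop();
    }

    /** Pops one operator and two operands, then pushes the combined group. */
    private static void combine(Deque<Character> operators, Deque<String> operands) {
        char op = operators.pop();
        String right = operands.pop();
        String left = operands.pop();
        operands.push("(" + left + op + right + ")");
    }

    public static void main(String[] args) {
        System.out.println(parse("(a|b|(c&d))&e")); // prints ((a|(b|(c&d)))&e)
    }
}
```

The real tool follows the same flow, except that combining two operands produces a bool query instead of a string.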

Every part of the code below is commented, so it should not be hard to follow.

Code

package com.sinosoft.springbootplus.lft.business.dispatch.publicopinion.util;

import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.util.Stack;

/**
 * Builds complex ES query conditions from expressions containing parentheses, logical operators, and operands
 *
 * @author zzt
 * @date 2024-05-28
 */
public class ESQueryParserUtil {

    /**
     * Parses the input string and converts it into an Elasticsearch SearchSourceBuilder
     *
     * @param query the input query string
     * @return a SearchSourceBuilder wrapping the resulting QueryBuilder
     */
    public static SearchSourceBuilder parseQuery(String query) {
        // Stack that holds the operators
        Stack<Character> operators = new Stack<>();
        // Stack that holds the operands
        Stack<QueryBuilder> operands = new Stack<>();

        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);

            if (ch == '(' || ch == '&' || ch == '|') {
                // On a left parenthesis or an operator, push it onto the operator stack
                operators.push(ch);
            } else if (ch == ')') {
                // On a right parenthesis, pop and apply operators until the matching left parenthesis
                while (!operators.isEmpty() && operators.peek() != '(') {
                    char operator = operators.pop();
                    QueryBuilder right = operands.pop();
                    QueryBuilder left = operands.pop();
                    operands.push(applyOperator(left, right, operator));
                }
                operators.pop(); // discard the left parenthesis
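            } else if (Character.isLetterOrDigit(ch) || ch == ' ') {
                // On a letter, digit, or space, read the whole search term
                // (Character.isLetterOrDigit also accepts CJK characters)
                StringBuilder sb = new StringBuilder();
                while (i < query.length() && (Character.isLetterOrDigit(query.charAt(i)) || query.charAt(i) == ' ')) {
                    sb.append(query.charAt(i));
                    i++;
                }
                i--; // step back one character, because the outer for loop advances by one
                String term = sb.toString().trim();
                // Skip whitespace-only runs, e.g. the spaces in "(a) & (b)", so they do not become empty operands
                if (!term.isEmpty()) {
                    // These are the three fields searched in my ES index; change them to suit your own
                    operands.push(QueryBuilders.multiMatchQuery(term, "title", "sysAbstract", "content"));
                }
            }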
        }

        // Apply the remaining operators
        while (!operators.isEmpty()) {
            char operator = operators.pop();
            QueryBuilder right = operands.pop();
            QueryBuilder left = operands.pop();
            operands.push(applyOperator(left, right, operator));
        }

        return new SearchSourceBuilder().query(operands.pop());
    }

    /**
     * Combines two operands into a single QueryBuilder according to the operator
     *
     * @param left     the left operand
     * @param right    the right operand
     * @param operator the operator
     * @return the combined QueryBuilder
     */
    private static QueryBuilder applyOperator(QueryBuilder left, QueryBuilder right, char operator) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        if (operator == '&') {
            boolQuery.must(left).must(right);
        } else if (operator == '|') {
            boolQuery.should(left).should(right);
        }
        return boolQuery;
    }

    public static void main(String[] args) {
        String query = "((北京|天津|(河北&石家庄))&(打架|辱骂|违法))&(中国)";
        SearchSourceBuilder searchSourceBuilder = parseQuery(query);
        System.out.println(searchSourceBuilder);
    }
}
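One behavior of the class above is worth knowing: within a group it applies the pending operators from right to left and gives "&" and "|" the same priority, so a&b|c is grouped as a&(b|c). If the conventional reading is wanted, where "&" binds tighter than "|", the operator branch can reduce pending operators before pushing a new one. The sketch below shows that variant using strings in place of QueryBuilder objects; the class name PrecedenceSketch and the precedence values are assumptions made for illustration, not part of the original tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical variant: two-stack parsing where '&' binds tighter than '|'. */
final class PrecedenceSketch {

    private static int precedence(char op) {
        return op == '&' ? 2 : 1; // assumption: '&' has higher precedence than '|'
    }

    static String parse(String expr) {
        Deque<Character> operators = new ArrayDeque<>();
        Deque<String> operands = new ArrayDeque<>();

        for (int i = 0; i < expr.length(); i++) {
            char ch = expr.charAt(i);
            if (ch == '(') {
                operators.push(ch);
            } else if (ch == '&' || ch == '|') {
                // First reduce pending operators of equal or higher precedence (left-associative)
                while (!operators.isEmpty() && operators.peek() != '('
                        && precedence(operators.peek()) >= precedence(ch)) {
                    combine(operators, operands);
                }
                operators.push(ch);
            } else if (ch == ')') {
                while (!operators.isEmpty() && operators.peek() != '(') {
                    combine(operators, operands);
                }
                operators.pop(); // discard the '('
            } else if (Character.isLetterOrDigit(ch)) {
                int start = i;
                while (i < expr.length() && Character.isLetterOrDigit(expr.charAt(i))) {
                    i++;
                }
                operands.push(expr.substring(start, i));
                i--; // the for loop advances once more
            }
            // Other characters (e.g. spaces) are skipped in this sketch
        }
        while (!operators.isEmpty()) {
            combine(operators, operands);
        }
        return operands.pop();
    }

    private static void combine(Deque<Character> operators, Deque<String> operands) {
        char op = operators.pop();
        String right = operands.pop();
        String left = operands.pop();
        operands.push("(" + left + op + right + ")");
    }

    public static void main(String[] args) {
        System.out.println(parse("a&b|c")); // prints ((a&b)|c)
        System.out.println(parse("a|b&c")); // prints (a|(b&c))
    }
}
```

Fully parenthesized input, such as the example in main above, gives the same logical result either way, so the difference only matters when users mix "&" and "|" without parentheses.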

Generated query

Because my example uses fairly deep nesting, the generated query is quite long. Feel free to try it yourself if you are interested.

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "北京",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "天津",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "bool": {
                              "must": [
                                {
                                  "multi_match": {
                                    "query": "河北",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                },
                                {
                                  "multi_match": {
                                    "query": "石家庄",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                }
                              ],
                              "adjust_pure_negative": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              },
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "打架",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "辱骂",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "multi_match": {
                              "query": "违法",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        },
        {
          "multi_match": {
            "query": "中国",
            "fields": [
              "content^1.0",
              "sysAbstract^1.0",
              "title^1.0"
            ],
            "type": "best_fields",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
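Part of the length comes from every binary operator producing its own bool wrapper, so a chain such as 北京|天津|(河北&石家庄) ends up as a should clause nested inside another should clause. One possible way to shorten the output is to merge a child group into its parent when both use the same operator. The sketch below shows the idea on a small hypothetical tree type (FlattenSketch and Node are illustration names, not part of the original tool). The flattened form is logically equivalent, though relevance scoring in Elasticsearch may differ from the nested form.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: merging chains of the same operator into one group. */
final class FlattenSketch {

    /** A leaf holds a term; a group holds an operator and its children. */
    static final class Node {
        final char op;        // '&' or '|' for a group, 0 for a leaf
        final String term;    // only set for leaves
        final List<Node> children = new ArrayList<>();

        Node(String term) { this.op = 0; this.term = term; }
        Node(char op) { this.op = op; this.term = null; }

        @Override
        public String toString() {
            if (op == 0) {
                return term;
            }
            StringBuilder sb = new StringBuilder(op == '&' ? "AND(" : "OR(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Combines two operands, absorbing any operand that is already a group with the same operator. */
    static Node combine(Node left, Node right, char op) {
        Node group = new Node(op);
        addFlattened(group, left);
        addFlattened(group, right);
        return group;
    }

    private static void addFlattened(Node group, Node child) {
        if (child.op == group.op) {
            group.children.addAll(child.children); // same operator: lift the grandchildren
        } else {
            group.children.add(child);             // leaf or different operator: keep as is
        }
    }

    public static void main(String[] args) {
        Node inner = combine(new Node("河北"), new Node("石家庄"), '&');
        Node rest = combine(new Node("天津"), inner, '|');
        System.out.println(combine(new Node("北京"), rest, '|')); // prints OR(北京, 天津, AND(河北, 石家庄))
    }
}
```

The same merging rule could in principle be applied inside applyOperator, by adding clauses to an existing bool query instead of wrapping it again.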