Converting complex expressions built from "()", "&", and "|" into Elasticsearch queries

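Use case

The "()", "&", and "|" operators are familiar to all of us. Their logic is clear, and they can express deeply nested relationships in very few characters, so they come up often in complex queries. I recently ran into exactly this kind of problem. At first I assumed a tool for it would be easy to find, but after searching for quite a while I found nothing suitable, so I decided to write one myself.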

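Overview

The tricky part of this tool is that we cannot know in advance what expression the people operating the system will enter. The format is not fixed, so they may write fairly complex logic, or they may stop after a single level of nesting. The code therefore has to be general.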
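
Since the input is free-form, it may be worth rejecting malformed expressions before parsing them. Below is a minimal sketch of such a check, assuming terms consist only of letters, digits, and spaces. The class name `ExpressionValidator` and the method `isWellFormed` are my own hypothetical names, not part of the utility in this post.

```java
class ExpressionValidator {

    /** Returns true if parentheses are balanced and every operator sits between two operands. */
    static boolean isWellFormed(String query) {
        int depth = 0;
        char prev = 0; // last non-space character seen; 0 means "nothing yet"
        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);
            if (ch == ' ') {
                continue; // spaces are allowed anywhere
            }
            // True when the previous token completed an operand (a term or a closed group)
            boolean afterOperand = prev == ')' || Character.isLetterOrDigit(prev);
            if (ch == '(') {
                if (afterOperand) return false;               // e.g. "a(b)"
                depth++;
            } else if (ch == ')') {
                if (!afterOperand || depth == 0) return false; // e.g. "()", "(a&)", or a stray ")"
                depth--;
            } else if (ch == '&' || ch == '|') {
                if (!afterOperand) return false;              // e.g. "&a" or "a&|b"
            } else if (Character.isLetterOrDigit(ch)) {
                if (prev == ')') return false;                // e.g. "(a) b"
            } else {
                return false;                                 // unsupported character
            }
            prev = ch;
        }
        // The expression must end on an operand with every "(" closed
        return depth == 0 && (prev == ')' || Character.isLetterOrDigit(prev));
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("((a|b)&c)")); // true
        System.out.println(isWellFormed("(a|b"));      // false
    }
}
```

A caller could run this check first and return a friendly error message instead of letting the parser fail on an empty stack.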

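Here is a brief explanation of how it works. It mainly relies on stacks in Java: the tool parses the input logical query string, uses one stack for the operators and one for the operands to build the corresponding query tree, and then converts that tree into an Elasticsearch search query over multiple fields (such as title, abstract, and body), so that complex logical conditions are parsed automatically into a query that is ready to be executed.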
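
To make the two-stack idea concrete without pulling in Elasticsearch, here is a small sketch that follows the same steps but builds plain strings instead of query objects. The class name `TwoStackSketch` and the `AND(...)`/`OR(...)` output format are only illustrative assumptions, and unlike the real utility this sketch ignores spaces.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TwoStackSketch {

    /** Parses the expression with two stacks and returns a prefix-style string. */
    static String parse(String query) {
        Deque<Character> operators = new ArrayDeque<>(); // holds '(', '&', '|'
        Deque<String> operands = new ArrayDeque<>();     // terms and combined sub-expressions

        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);
            if (ch == '(' || ch == '&' || ch == '|') {
                operators.push(ch);
            } else if (ch == ')') {
                // Combine everything back to the matching '('
                while (!operators.isEmpty() && operators.peek() != '(') {
                    reduce(operators, operands);
                }
                operators.pop(); // discard the '('
            } else if (Character.isLetterOrDigit(ch)) {
                int start = i;
                while (i < query.length() && Character.isLetterOrDigit(query.charAt(i))) {
                    i++;
                }
                operands.push(query.substring(start, i));
                i--; // the for loop advances once more
            }
        }
        // Combine whatever operators are left
        while (!operators.isEmpty()) {
            reduce(operators, operands);
        }
        return operands.pop();
    }

    /** Pops one operator and two operands, then pushes the combined expression. */
    private static void reduce(Deque<Character> operators, Deque<String> operands) {
        char op = operators.pop();
        String right = operands.pop();
        String left = operands.pop();
        operands.push((op == '&' ? "AND(" : "OR(") + left + ", " + right + ")");
    }

    public static void main(String[] args) {
        System.out.println(parse("(a|b)&c")); // AND(OR(a, b), c)
        System.out.println(parse("a|b|c"));   // OR(a, OR(b, c))
    }
}
```

The second call shows that a chain of the same operator is nested from the right, which is why a chain of "|" terms ends up as nested bool queries rather than one flat list.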

All of the code below is commented, so it should not be hard to follow.

Code

package com.sinosoft.springbootplus.lft.business.dispatch.publicopinion.util;

import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.util.Stack;

/**
 * Builds complex ES query conditions from expressions made of parentheses,
 * the logical operators & and |, and search terms
 *
 * @author zzt
 * @date 2024-05-28
 */
public class ESQueryParserUtil {

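    /**
     * Parses the input string and converts it into an Elasticsearch query.
     * Operators are only combined at a closing parenthesis or at the end of the input,
     * from right to left and without precedence, so parenthesize any mix of & and |.
     *
     * @param query the input query string
     * @return a SearchSourceBuilder wrapping the resulting QueryBuilder
     */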
    public static SearchSourceBuilder parseQuery(String query) {
        // Stack holding the operators
        Stack<Character> operators = new Stack<>();
        // Stack holding the operands
        Stack<QueryBuilder> operands = new Stack<>();

        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);

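            if (ch == '(' || ch == '&' || ch == '|') {
                // On a left parenthesis or an operator, push it onto the operator stack
                operators.push(ch);
            } else if (ch == ')') {
                // On a right parenthesis, pop and apply operators until the left parenthesis is reached
                while (!operators.isEmpty() && operators.peek() != '(') {
                    char operator = operators.pop();
                    QueryBuilder right = operands.pop();
                    QueryBuilder left = operands.pop();
                    operands.push(applyOperator(left, right, operator));
                }
                operators.pop(); // Pop the left parenthesis
            } else if (Character.isLetterOrDigit(ch) || ch == ' ') {
                // On a letter, digit (CJK characters count as letters), or space, read the whole search term
                StringBuilder sb = new StringBuilder();
                while (i < query.length() && (Character.isLetterOrDigit(query.charAt(i)) || query.charAt(i) == ' ')) {
                    sb.append(query.charAt(i));
                    i++;
                }
                i--; // Step back one character, because the outer for loop advances by one
                String term = sb.toString().trim();
                // Skip whitespace-only runs (e.g. the spaces in ") & (") so no empty operand is pushed
                if (!term.isEmpty()) {
                    // These are the three fields I run the fuzzy search on in my ES index; change them to your own
                    operands.push(QueryBuilders.multiMatchQuery(term, "title", "sysAbstract", "content"));
                }
            }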
        }

        // Apply the remaining operators
        while (!operators.isEmpty()) {
            char operator = operators.pop();
            QueryBuilder right = operands.pop();
            QueryBuilder left = operands.pop();
            operands.push(applyOperator(left, right, operator));
        }

        return new SearchSourceBuilder().query(operands.pop());
    }

    /**
     * Combines two operands into one QueryBuilder according to the operator
     *
     * @param left     the left operand
     * @param right    the right operand
     * @param operator the operator
     * @return the combined QueryBuilder
     */
    private static QueryBuilder applyOperator(QueryBuilder left, QueryBuilder right, char operator) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        if (operator == '&') {
            boolQuery.must(left).must(right);
        } else if (operator == '|') {
            boolQuery.should(left).should(right);
        }
        return boolQuery;
    }

    public static void main(String[] args) {
        String query = "((北京|天津|(河北&石家庄))&(打架|辱骂|违法))&(中国)";
        SearchSourceBuilder searchSourceBuilder = parseQuery(query);
        System.out.println(searchSourceBuilder);
    }
}
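
One thing to keep in mind: the parser above only combines operators when it meets a closing parenthesis or reaches the end of the input, so an unparenthesized mix such as a&b|c is grouped from the right as a&(b|c). If you would rather have "&" bind tighter than "|", one possible adjustment is to reduce the waiting operators of equal or higher precedence before pushing a new one. The sketch below shows that idea on plain strings. `PrecedenceSketch` and `precedence` are hypothetical names of mine, and this is a variation, not what the utility above does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PrecedenceSketch {

    /** Assumed precedence: '&' binds tighter than '|'. */
    private static int precedence(char op) {
        return op == '&' ? 2 : 1;
    }

    static String parse(String query) {
        Deque<Character> operators = new ArrayDeque<>();
        Deque<String> operands = new ArrayDeque<>();

        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);
            if (ch == '(') {
                operators.push(ch);
            } else if (ch == '&' || ch == '|') {
                // First reduce waiting operators that bind at least as tightly as the new one
                while (!operators.isEmpty() && operators.peek() != '('
                        && precedence(operators.peek()) >= precedence(ch)) {
                    reduce(operators, operands);
                }
                operators.push(ch);
            } else if (ch == ')') {
                while (!operators.isEmpty() && operators.peek() != '(') {
                    reduce(operators, operands);
                }
                operators.pop(); // discard the '('
            } else if (Character.isLetterOrDigit(ch)) {
                int start = i;
                while (i < query.length() && Character.isLetterOrDigit(query.charAt(i))) {
                    i++;
                }
                operands.push(query.substring(start, i));
                i--; // the for loop advances once more
            }
        }
        while (!operators.isEmpty()) {
            reduce(operators, operands);
        }
        return operands.pop();
    }

    /** Pops one operator and two operands, then pushes the combined expression. */
    private static void reduce(Deque<Character> operators, Deque<String> operands) {
        char op = operators.pop();
        String right = operands.pop();
        String left = operands.pop();
        operands.push((op == '&' ? "AND(" : "OR(") + left + ", " + right + ")");
    }

    public static void main(String[] args) {
        System.out.println(parse("a&b|c")); // OR(AND(a, b), c)
        System.out.println(parse("a|b&c")); // OR(a, AND(b, c))
    }
}
```

With this change a chain of the same operator nests from the left instead of the right, so the generated JSON would be shaped differently from the output shown in this post even though it matches the same documents.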

Generated query

Because my example uses fairly deep nesting, the generated query is quite long. Feel free to try it yourself if you are interested.

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "北京",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "天津",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "bool": {
                              "must": [
                                {
                                  "multi_match": {
                                    "query": "河北",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                },
                                {
                                  "multi_match": {
                                    "query": "石家庄",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                }
                              ],
                              "adjust_pure_negative": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              },
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "打架",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "辱骂",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "multi_match": {
                              "query": "违法",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        },
        {
          "multi_match": {
            "query": "中国",
            "fields": [
              "content^1.0",
              "sysAbstract^1.0",
              "title^1.0"
            ],
            "type": "best_fields",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}