Tmall Rebate App Search System Optimization: Designing an Elasticsearch-Based Product Shopping-Guide Engine

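Hi everyone, I'm Ake (阿可), a developer of the 省赚客 APP! In the 省赚客 APP (juwatech.cn), letting users quickly find high-rebate Tmall products by keyword is a core scenario. Faced with product data on the order of hundreds of millions of items, millisecond-level response requirements, and complex ranking strategies (such as "high commission + high sales + low price"), we built a high-performance product shopping-guide engine on Elasticsearch. This article covers index design, query construction, relevance tuning, and engineering integration, with code for the key pieces.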

Product Index Design: Multi-Fields and Nested Attributes

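We create a tmall_products index for products that supports full-text search, exact filtering, and aggregations. The mapping assumes the IK and pinyin analysis plugins are installed, since the ik_max_word tokenizer, the ik_smart analyzer, and the pinyin token filter all come from plugins: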

PUT /tmall_products
{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "ik_max_word_pinyin": {
          "tokenizer": "ik_max_word",
          "filter": ["pinyin_filter"]
        }
      },
      "filter": {
        "pinyin_filter": {
          "type": "pinyin",
          "keep_full_pinyin": true,
          "keep_original": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "itemId": { "type": "keyword" },
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart",
        "fields": {
          "pinyin": { "type": "text", "analyzer": "ik_max_word_pinyin" }
        }
      },
      "categoryId": { "type": "keyword" },
      "brandId": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "salesCount": { "type": "integer" },
      "commissionRate": { "type": "scaled_float", "scaling_factor": 10000 },
      "rebateAmount": { "type": "scaled_float", "scaling_factor": 100 },
      "tags": { "type": "keyword" },
      "specs": {
        "type": "nested",
        "properties": {
          "key": { "type": "keyword" },
          "value": { "type": "keyword" }
        }
      },
      "status": { "type": "byte" }
    }
  }
}
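
The scaling_factor values in the mapping decide how much precision each scaled_float field keeps: Elasticsearch indexes the value multiplied by the factor and rounded to a long. The plain-Java sketch below imitates that arithmetic to show why commissionRate uses 10000 rather than 100. The class and method names are illustrative only and are not part of any Elasticsearch API.

```java
// Illustrative only: mimics how a scaled_float value is rounded for indexing.
// ScaledFloatDemo, encode and decode are hypothetical names.
final class ScaledFloatDemo {

    // Value as stored in the index: round(value * scalingFactor).
    static long encode(double value, int scalingFactor) {
        return Math.round(value * scalingFactor);
    }

    // Value as seen by sorting, range queries and aggregations.
    static double decode(long stored, int scalingFactor) {
        return stored / (double) scalingFactor;
    }

    public static void main(String[] args) {
        System.out.println(encode(19.99, 100));                   // price with factor 100
        System.out.println(decode(encode(0.0525, 10000), 10000)); // rate with factor 10000
        System.out.println(decode(encode(0.0525, 100), 100));     // rate with factor 100
    }
}
```

With a factor of 100, a commission rate of 0.0525 would be indexed as 5 and read back as 0.05, while a factor of 10000 keeps all four decimal places. Prices and rebate amounts only need two decimals, so 100 is enough for them.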

Java Entity and Repository Encapsulation

We define the document model with Spring Data Elasticsearch:

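package juwatech.cn.entity;

import java.math.BigDecimal;
import java.util.List;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.*;

// Document model for the tmall_products index; field types mirror the mapping.
@Document(indexName = "tmall_products")
public class TmallProductDoc {
    @Id
    private String itemId;

    // Indexed with ik_max_word, searched with ik_smart
    @Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_smart")
    private String title;

    @Field(type = FieldType.Keyword)
    private String categoryId;

    @Field(type = FieldType.Scaled_Float, scalingFactor = 100)
    private BigDecimal price;

    @Field(type = FieldType.Integer)
    private Integer salesCount;

    @Field(type = FieldType.Scaled_Float, scalingFactor = 10000)
    private BigDecimal commissionRate;

    @Field(type = FieldType.Nested)
    private List<Spec> specs;

    // Assumed minimal shape of Spec, mirroring the key/value sub-fields of the specs mapping
    public static class Spec {
        @Field(type = FieldType.Keyword)
        private String key;

        @Field(type = FieldType.Keyword)
        private String value;
    }

    // The remaining mapped fields (brandId, rebateAmount, tags, status) follow the same pattern
    // getters/setters
}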
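
The specs field is mapped as nested rather than as a plain object array, and the reason is easiest to see with a small simulation. The sketch below uses no Elasticsearch code; it only contrasts the matching semantics of the two mappings, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: contrasts object-array (flattened) matching with nested matching.
final class NestedSpecDemo {

    static final class Spec {
        final String key;
        final String value;
        Spec(String key, String value) { this.key = key; this.value = value; }
    }

    // Flattened semantics: keys and values are indexed as independent arrays,
    // so the key and the value may come from different spec entries.
    static boolean flattenedMatch(List<Spec> specs, String key, String value) {
        boolean keyFound = specs.stream().anyMatch(s -> s.key.equals(key));
        boolean valueFound = specs.stream().anyMatch(s -> s.value.equals(value));
        return keyFound && valueFound;
    }

    // Nested semantics: each spec is indexed as its own hidden document,
    // so key and value must match within the same entry.
    static boolean nestedMatch(List<Spec> specs, String key, String value) {
        return specs.stream().anyMatch(s -> s.key.equals(key) && s.value.equals(value));
    }

    public static void main(String[] args) {
        List<Spec> specs = Arrays.asList(new Spec("color", "red"), new Spec("size", "XL"));
        System.out.println(flattenedMatch(specs, "color", "XL")); // cross-entry false positive
        System.out.println(nestedMatch(specs, "color", "XL"));
        System.out.println(nestedMatch(specs, "color", "red"));
    }
}
```

A product with color=red and size=XL would wrongly match a filter for color=XL under the flattened semantics. The nested mapping avoids that, at the cost of one extra hidden document per spec entry.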

Building Multi-Condition Combined Queries

We use BoolQueryBuilder to combine keyword, category, price-range, and spec filters:

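package juwatech.cn.service;

import java.util.Collections;
import java.util.Map;

import org.apache.commons.lang3.StringUtils; // assumes Apache Commons Lang 3
import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;
import org.elasticsearch.search.sort.ScriptSortBuilder.ScriptSortType;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHitSupport;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.SearchPage;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Service;

import juwatech.cn.entity.TmallProductDoc;

// SearchRequest is the application's own request DTO, not the Elasticsearch client class.
@Service
public class ProductSearchService {

    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    public SearchPage<TmallProductDoc> search(SearchRequest req) {
        NativeSearchQueryBuilder builder = new NativeSearchQueryBuilder();

        // Collect every condition in one bool query. withFilter() keeps only the last
        // filter passed to it, so calling it once per condition would drop the earlier ones.
        BoolQueryBuilder root = QueryBuilders.boolQuery();

        // 1. Full-text search (Chinese + pinyin)
        if (StringUtils.isNotBlank(req.getKeyword())) {
            BoolQueryBuilder textQuery = QueryBuilders.boolQuery()
                .should(QueryBuilders.matchQuery("title", req.getKeyword()))
                .should(QueryBuilders.matchQuery("title.pinyin", req.getKeyword()));
            root.must(textQuery);
        }

        // 2. Category filter
        if (req.getCategoryId() != null) {
            root.filter(QueryBuilders.termQuery("categoryId", req.getCategoryId()));
        }

        // 3. Price range
        if (req.getMinPrice() != null || req.getMaxPrice() != null) {
            RangeQueryBuilder priceRange = QueryBuilders.rangeQuery("price");
            if (req.getMinPrice() != null) priceRange.gte(req.getMinPrice());
            if (req.getMaxPrice() != null) priceRange.lte(req.getMaxPrice());
            root.filter(priceRange);
        }

        // 4. Spec filters (nested query): every requested spec must match
        if (req.getSpecFilters() != null) {
            for (Map.Entry<String, String> spec : req.getSpecFilters().entrySet()) {
                root.filter(QueryBuilders.nestedQuery("specs",
                    QueryBuilders.boolQuery()
                        .must(QueryBuilders.termQuery("specs.key", spec.getKey()))
                        .must(QueryBuilders.termQuery("specs.value", spec.getValue())),
                    ScoreMode.None));
            }
        }
        builder.withQuery(root);

        // 5. Sort strategy: weighted rebate and sales. This replaces ordering by _score.
        builder.withSort(SortBuilders.scriptSort(
            new Script(
                ScriptType.INLINE,
                "painless",
                "doc['rebateAmount'].value * 0.6 + doc['salesCount'].value / 10000.0 * 0.4",
                Collections.emptyMap()
            ),
            ScriptSortType.NUMBER
        ).order(SortOrder.DESC));

        Pageable pageable = PageRequest.of(req.getPage(), req.getSize());
        builder.withPageable(pageable);

        // search() returns SearchHits; wrap it to get the SearchPage this method declares
        SearchHits<TmallProductDoc> hits =
            elasticsearchTemplate.search(builder.build(), TmallProductDoc.class);
        return SearchHitSupport.searchPageFor(hits, pageable);
    }
}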
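
To get a feel for how the Painless sort script weighs rebate against sales, the same arithmetic can be run in plain Java. This is only a sketch: the two products and their numbers are made up for illustration, and SortScoreDemo is a hypothetical name.

```java
// Illustrative only: the same arithmetic as the Painless sort script, in plain Java.
final class SortScoreDemo {

    // rebateAmount * 0.6 + salesCount / 10000.0 * 0.4
    static double sortScore(double rebateAmount, int salesCount) {
        return rebateAmount * 0.6 + salesCount / 10000.0 * 0.4;
    }

    public static void main(String[] args) {
        // Hypothetical products, not real data.
        double highRebate = sortScore(12.5, 30_000);  // high rebate, moderate sales
        double highSales = sortScore(5.0, 200_000);   // low rebate, very high sales
        System.out.println(highRebate);
        System.out.println(highSales);
        System.out.println(highSales > highRebate);
    }
}
```

The first product scores about 8.7 and the second about 11.0, so the best seller ranks first. Under these weights one unit of rebateAmount counts the same as 15,000 units of salesCount, which is a useful number to keep in mind when tuning the coefficients.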

Custom Relevance Scoring: Blending in Business Weights

The default BM25 text relevance alone does not fit a shopping-guide scenario, so we use function_score to blend text matching with business metrics:

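import org.elasticsearch.common.lucene.search.function.CombineFunction;
import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;

// textQuery is the keyword bool query built in ProductSearchService.search()
FunctionScoreQueryBuilder functionScore = QueryBuilders.functionScoreQuery(
    textQuery, // base query
    new FunctionScoreQueryBuilder.FilterFunctionBuilder[] {
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            // missing(0) avoids an error on documents that lack the field
            ScoreFunctionBuilders.fieldValueFactorFunction("rebateAmount").factor(1.0f).missing(0)
        ),
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            ScoreFunctionBuilders.fieldValueFactorFunction("salesCount").factor(0.0001f).missing(0)
        )
    }
)
// Sum the two functions: under the default score_mode (multiply), salesCount = 0 zeroes the whole score
.scoreMode(FunctionScoreQuery.ScoreMode.SUM)
.boostMode(CombineFunction.MULTIPLY);

// Takes effect only when results are ordered by _score, i.e. without the script sort above
builder.withQuery(functionScore);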
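
How the two field_value_factor functions are combined matters more than it may seem. The sketch below reproduces the function_score arithmetic in plain Java for two choices of score_mode (multiply, which is the default when none is set, and sum), both with boost_mode multiply. It has no Elasticsearch dependency, the names are hypothetical, and the sample numbers are invented.

```java
// Illustrative only: plain-Java arithmetic for function_score with two
// field_value_factor functions (no modifier). All names here are hypothetical.
final class FunctionScoreDemo {

    // field_value_factor without a modifier evaluates to factor * fieldValue.
    static double fieldValueFactor(double fieldValue, double factor) {
        return factor * fieldValue;
    }

    // score_mode = multiply (the default), boost_mode = multiply.
    static double withMultiply(double queryScore, double rebateAmount, long salesCount) {
        double combined = fieldValueFactor(rebateAmount, 1.0)
                        * fieldValueFactor(salesCount, 0.0001);
        return queryScore * combined;
    }

    // score_mode = sum, boost_mode = multiply.
    static double withSum(double queryScore, double rebateAmount, long salesCount) {
        double combined = fieldValueFactor(rebateAmount, 1.0)
                        + fieldValueFactor(salesCount, 0.0001);
        return queryScore * combined;
    }

    public static void main(String[] args) {
        // Hypothetical new listing: decent text match, rebate of 8.00, no sales yet.
        System.out.println(withMultiply(3.2, 8.0, 0));
        System.out.println(withSum(3.2, 8.0, 0));
    }
}
```

Under multiply, a product with no sales yet ends up with a final score of zero no matter how well its title matches or how high its rebate is. Under sum it keeps a positive score driven by the rebate. Which behaviour is wanted is a product decision, so the choice is worth making explicitly rather than relying on the default.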

Data Synchronization: Near-Real-Time Updates

Product data changes are captured by Canal listening to the MySQL binlog and written to ES asynchronously through Kafka:

@KafkaListener(topics = "tmall_product_update")
public void handleProductUpdate(ConsumerRecord<String, String> record) {
    // Deserialize the change event, convert it to the ES document, and upsert it by itemId
    TmallProduct product = JsonUtil.fromJson(record.value(), TmallProduct.class);
    TmallProductDoc doc = convertToDoc(product);
    elasticsearchTemplate.save(doc);
}

Delisted products are deleted immediately:

// status 0 marks a delisted product; run this check before save() so the item is not re-indexed
if (product.getStatus() == 0) {
    elasticsearchTemplate.delete(doc.getItemId(), TmallProductDoc.class);
}
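
The upsert and delete branches can be read as one routing rule per change event. The sketch below shows that rule with an in-memory map standing in for the Elasticsearch index, so it runs without Kafka or ES. ProductSyncDemo, ProductEvent, and apply are hypothetical names, and the rule that status 0 means delisted is taken from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: an in-memory map stands in for the Elasticsearch index.
final class ProductSyncDemo {

    static final class ProductEvent {
        final String itemId;
        final int status;   // 0 = delisted, anything else = listed
        final String title;
        ProductEvent(String itemId, int status, String title) {
            this.itemId = itemId;
            this.status = status;
            this.title = title;
        }
    }

    // Delete delisted products, upsert everything else.
    static void apply(Map<String, String> index, ProductEvent event) {
        if (event.status == 0) {
            index.remove(event.itemId);
        } else {
            index.put(event.itemId, event.title);
        }
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<>();
        apply(index, new ProductEvent("1001", 1, "sample product")); // listed: indexed
        apply(index, new ProductEvent("1001", 0, "sample product")); // delisted: removed
        apply(index, new ProductEvent("1001", 0, "sample product")); // replay: no effect
        System.out.println(index.containsKey("1001"));
    }
}
```

Because both branches are keyed by itemId, applying the same event twice leaves the index in the same state. That property is useful here because a Kafka consumer may receive a message more than once after a rebalance or retry.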

Copyright of this article belongs to the 聚娃科技 省赚客 app developer team. Please credit the source when reposting!
