MongoDB Aggregation Framework: Data Aggregation with the Java Driver

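Overview of the MongoDB Aggregation Framework

The MongoDB aggregation framework processes data through a pipeline: an ordered list of stages (filtering, grouping, sorting, and so on), each of which consumes the output of the previous one. In the Java driver, MongoCollection.aggregate() takes the list of stages and returns a com.mongodb.client.AggregateIterable, the interface through which the results are read. The builder classes used in the examples (Aggregates, Filters, Accumulators, Sorts, Projections, Indexes) live in the com.mongodb.client.model package.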

Basic Aggregation Operations

1. Match stage ($match)

Filters documents, using the same conditions as a regular query:

collection.aggregate(Arrays.asList(
    Aggregates.match(Filters.eq("status", "A"))
));

2. Group stage ($group)

Groups documents by a field and computes an aggregate value for each group:

collection.aggregate(Arrays.asList(
    Aggregates.group("$category", 
        Accumulators.sum("total", "$quantity"))
));
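
To make the result shape concrete, the following is a plain-Java analogy of what a $group stage with a $sum accumulator computes. It is only a sketch: it uses Java streams on in-memory data rather than the MongoDB driver, and the Item class and sample values are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogy of $group with $sum; no MongoDB driver involved.
class GroupSumSketch {

    // Hypothetical stand-in for a document with "category" and "quantity" fields.
    static final class Item {
        final String category;
        final int quantity;

        Item(String category, int quantity) {
            this.category = category;
            this.quantity = quantity;
        }
    }

    // One entry per distinct category (the group key), holding the summed quantity ("total").
    static Map<String, Integer> totalByCategory(List<Item> items) {
        return items.stream().collect(
            Collectors.groupingBy((Item i) -> i.category, TreeMap::new,
                Collectors.summingInt((Item i) -> i.quantity)));
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
            new Item("books", 2), new Item("toys", 5), new Item("books", 3));
        System.out.println(totalByCategory(items));
    }
}
```

Each key of the map corresponds to the _id of one output document of the stage, and each value to its total field.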

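3. Sort stage ($sort)

Orders documents by one or more fields (here total, for example as produced by an earlier $group stage):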

collection.aggregate(Arrays.asList(
    Aggregates.sort(Sorts.descending("total"))
));

Advanced Aggregation Features

1. Combining multiple stages

Chain several stages to express more complex logic:

collection.aggregate(Arrays.asList(
    Aggregates.match(Filters.gt("price", 100)),
    Aggregates.group("$product", 
        Accumulators.avg("avgPrice", "$price")),
    Aggregates.sort(Sorts.ascending("avgPrice"))
));
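
The data flow of such a three-stage pipeline can be sketched in plain Java. This is an in-memory analogy under stated assumptions, not driver code: the Sale class, the averagePriceAbove helper, and the sample prices are hypothetical and exist only to show what each stage passes on to the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java analogy of $match -> $group ($avg) -> $sort; no MongoDB driver involved.
class PipelineSketch {

    // Hypothetical stand-in for a document with "product" and "price" fields.
    static final class Sale {
        final String product;
        final double price;

        Sale(String product, double price) {
            this.product = product;
            this.price = price;
        }
    }

    // Returns (product, avgPrice) pairs in ascending avgPrice order.
    static List<Map.Entry<String, Double>> averagePriceAbove(List<Sale> sales, double minPrice) {
        Map<String, Double> avgByProduct = sales.stream()
            // $match: only documents with price > minPrice reach the next stage
            .filter((Sale s) -> s.price > minPrice)
            // $group: one result per product, averaging the surviving prices
            .collect(Collectors.groupingBy((Sale s) -> s.product,
                Collectors.averagingDouble((Sale s) -> s.price)));

        // $sort: ascending by the computed average
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(avgByProduct.entrySet());
        sorted.sort(Map.Entry.comparingByValue());
        return sorted;
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
            new Sale("laptop", 1200.0), new Sale("laptop", 800.0),
            new Sale("mouse", 50.0), new Sale("mouse", 150.0),
            new Sale("pen", 5.0));
        for (Map.Entry<String, Double> e : averagePriceAbove(sales, 100.0)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Note that the filter runs before the grouping: the 50.0 mouse sale never contributes to the mouse average, and a product with no sale above the threshold (pen) produces no group at all.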

2. Reshaping documents with $project

Selects existing fields or computes new ones:

collection.aggregate(Arrays.asList(
    Aggregates.project(Projections.fields(
        Projections.include("name"),
        // Express $multiply as a raw Document expression: price * 0.9
        Projections.computed("discountedPrice",
            new Document("$multiply", Arrays.asList("$price", 0.9)))
    ))
));
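
As a rough illustration of the output shape, here is a plain-Java sketch of what such a projection does to a single document. It is an analogy only (a Map stands in for a document, and the project helper is hypothetical), and it assumes the name and price fields are present and that price is numeric.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java analogy of $project with one included and one computed field.
class ProjectSketch {

    // Keeps _id (retained by $project unless explicitly excluded) and name,
    // drops every other field, and adds discountedPrice = price * 0.9.
    static Map<String, Object> project(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("_id", doc.get("_id"));
        out.put("name", doc.get("name"));
        double price = ((Number) doc.get("price")).doubleValue();
        out.put("discountedPrice", price * 0.9);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.<String, Object>of(
            "_id", 1, "name", "pen", "price", 100.0, "stock", 7);
        System.out.println(project(doc));
    }
}
```

The original price and stock fields do not appear in the result; only what the projection names or computes survives.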

3. Joining collections ($lookup)

Performs a left outer join against another collection, similar to a SQL JOIN:

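collection.aggregate(Arrays.asList(
    // Arguments: from, localField, foreignField, as.
    // Assumes collection holds users and each order stores the user's _id in userId.
    Aggregates.lookup("orders", "_id", "userId", "orderDetails")
));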
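
The join semantics can be sketched in plain Java. The following is an in-memory analogy, not driver code: the lookup helper and the sample users and orders are hypothetical, and it only covers scalar field values (the real stage has additional rules for arrays and missing fields).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Plain-Java analogy of $lookup (left outer join); no MongoDB driver involved.
class LookupSketch {

    // For each input document, attaches under the key `as` the list of documents
    // in `from` whose foreignField equals the input's localField.
    // Input documents without a match are kept and get an empty list.
    static List<Map<String, Object>> lookup(List<Map<String, Object>> input,
                                            List<Map<String, Object>> from,
                                            String localField,
                                            String foreignField,
                                            String as) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> doc : input) {
            List<Map<String, Object>> matches = new ArrayList<>();
            for (Map<String, Object> other : from) {
                if (Objects.equals(doc.get(localField), other.get(foreignField))) {
                    matches.add(other);
                }
            }
            Map<String, Object> joined = new LinkedHashMap<>(doc);
            joined.put(as, matches);
            result.add(joined);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> users = List.of(
            Map.<String, Object>of("_id", 1, "name", "Ann"),
            Map.<String, Object>of("_id", 2, "name", "Bob"));
        List<Map<String, Object>> orders = List.of(
            Map.<String, Object>of("_id", 10, "userId", 1),
            Map.<String, Object>of("_id", 11, "userId", 1));
        System.out.println(lookup(users, orders, "_id", "userId", "orderDetails"));
    }
}
```

Unlike an inner join, every input document appears in the output exactly once; the matched documents are nested in an array field rather than producing one row per match.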

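Performance Tips

Filter with $match as early in the pipeline as possible, so that later stages have less data to process.

Create indexes on the fields used by the leading $match and $sort stages (indexes only help stages at the start of the pipeline), for example: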

collection.createIndex(Indexes.ascending("status", "price"));

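Enable the allowDiskUse option for large data sets, so that stages exceeding the memory limit can spill to temporary files instead of failing:

```java
collection.aggregate(pipeline)
    .allowDiskUse(true);
```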

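Exception Handling

Catch MongoException to handle aggregation errors. Because aggregate() is lazy and the pipeline only runs once the result is iterated, the iteration itself has to be inside the try block:

try {
    // The pipeline is sent to the server when iteration starts
    for (Document doc : collection.aggregate(pipeline)) {
        System.out.println(doc.toJson());
    }
} catch (MongoException e) {
    System.err.println("Aggregation failed: " + e.getMessage());
}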

Processing Results

Iterate over the aggregation results, or collect them into a list:

// try-with-resources closes the cursor once iteration is done
try (MongoCursor<Document> cursor = collection.aggregate(pipeline).iterator()) {
    while (cursor.hasNext()) {
        System.out.println(cursor.next().toJson());
    }
}

// Or collect the results directly into a List
List<Document> results = collection.aggregate(pipeline).into(new ArrayList<>());

With a well-designed pipeline and the Java driver's builder API, complex data analysis can be expressed concisely and run efficiently.