MongoDB(96)如何使用MongoDB的高级聚合功能?

MongoDB 的高级聚合框架(Aggregation Framework)提供了一种强大且灵活的方法来处理和分析数据。通过使用聚合管道(Aggregation Pipeline),你可以执行复杂的数据处理操作,如过滤、排序、分组和计算聚合结果等。

一、聚合管道简介

聚合管道由一系列阶段(stage)组成,每个阶段将输入文档转换为输出文档,然后将输出文档传递给下一个阶段。以下是一些常见的聚合阶段:

  1. $match:过滤文档(相当于 SQL 中的 WHERE 子句)。
  2. $group:分组文档并计算聚合结果(相当于 SQL 中的 GROUP BY 子句)。
  3. $project:重新格式化文档,选择或计算新字段。
  4. $sort:排序文档。
  5. $limit:限制返回的文档数。
  6. $skip:跳过指定数量的文档。
  7. $lookup:联表操作(相当于 SQL 中的 JOIN)。
  8. $unwind:展开数组字段。

二、示例数据集

假设我们有以下示例数据集:

json 复制代码
[
  { "_id": 1, "name": "Alice", "age": 25, "city": "New York", "hobbies": ["reading", "hiking"], "score": 85 },
  { "_id": 2, "name": "Bob", "age": 30, "city": "San Francisco", "hobbies": ["cooking", "gaming"], "score": 90 },
  { "_id": 3, "name": "Charlie", "age": 35, "city": "New York", "hobbies": ["hiking", "traveling"], "score": 75 },
  { "_id": 4, "name": "David", "age": 40, "city": "San Francisco", "hobbies": ["reading", "cooking"], "score": 88 },
  { "_id": 5, "name": "Eve", "age": 45, "city": "New York", "hobbies": ["gaming", "traveling"], "score": 92 }
]

三、常见聚合操作示例

1. 使用 $match 过滤文档

过滤出住在 New York 的用户:

javascript 复制代码
db.collection.aggregate([
  { $match: { city: "New York" } }
])

结果:

json 复制代码
[
  { "_id": 1, "name": "Alice", "age": 25, "city": "New York", "hobbies": ["reading", "hiking"], "score": 85 },
  { "_id": 3, "name": "Charlie", "age": 35, "city": "New York", "hobbies": ["hiking", "traveling"], "score": 75 },
  { "_id": 5, "name": "Eve", "age": 45, "city": "New York", "hobbies": ["gaming", "traveling"], "score": 92 }
]

2. 使用 $group 进行分组和聚合

按城市分组,并计算每个城市的用户平均年龄和总分:

javascript 复制代码
db.collection.aggregate([
  { $group: {
      _id: "$city",
      avgAge: { $avg: "$age" },
      totalScore: { $sum: "$score" }
  }}
])

结果:

json 复制代码
[
  { "_id": "New York", "avgAge": 35, "totalScore": 252 },
  { "_id": "San Francisco", "avgAge": 35, "totalScore": 178 }
]

3. 使用 $project 选择和重命名字段

选择用户的姓名和年龄,并多添加一个新字段 isAdult 表明用户是否成年(假设成年年龄为18岁):

javascript 复制代码
db.collection.aggregate([
  { $project: {
      _id: 0,
      name: 1,
      age: 1,
      isAdult: { $gte: ["$age", 18] }
  }}
])

结果:

json 复制代码
[
  { "name": "Alice", "age": 25, "isAdult": true },
  { "name": "Bob", "age": 30, "isAdult": true },
  { "name": "Charlie", "age": 35, "isAdult": true },
  { "name": "David", "age": 40, "isAdult": true },
  { "name": "Eve", "age": 45, "isAdult": true }
]

4. 使用 $sort 排序文档

按分数降序排序:

javascript 复制代码
db.collection.aggregate([
  { $sort: { score: -1 } }
])

结果:

json 复制代码
[
  { "_id": 5, "name": "Eve", "age": 45, "city": "New York", "hobbies": ["gaming", "traveling"], "score": 92 },
  { "_id": 2, "name": "Bob", "age": 30, "city": "San Francisco", "hobbies": ["cooking", "gaming"], "score": 90 },
  { "_id": 4, "name": "David", "age": 40, "city": "San Francisco", "hobbies": ["reading", "cooking"], "score": 88 },
  { "_id": 1, "name": "Alice", "age": 25, "city": "New York", "hobbies": ["reading", "hiking"], "score": 85 },
  { "_id": 3, "name": "Charlie", "age": 35, "city": "New York", "hobbies": ["hiking", "traveling"], "score": 75 }
]

5. 使用 $limit$skip 实现分页

获取分数最高的前 3 名用户:

javascript 复制代码
db.collection.aggregate([
  { $sort: { score: -1 } },
  { $limit: 3 }
])

结果:

json 复制代码
[
  { "_id": 5, "name": "Eve", "age": 45, "city": "New York", "hobbies": ["gaming", "traveling"], "score": 92 },
  { "_id": 2, "name": "Bob", "age": 30, "city": "San Francisco", "hobbies": ["cooking", "gaming"], "score": 90 },
  { "_id": 4, "name": "David", "age": 40, "city": "San Francisco", "hobbies": ["reading", "cooking"], "score": 88 }
]

跳过前 2 名用户,获取接下来的 2 名用户:

javascript 复制代码
db.collection.aggregate([
  { $sort: { score: -1 } },
  { $skip: 2 },
  { $limit: 2 }
])

结果:

json 复制代码
[
  { "_id": 4, "name": "David", "age": 40, "city": "San Francisco", "hobbies": ["reading", "cooking"], "score": 88 },
  { "_id": 1, "name": "Alice", "age": 25, "city": "New York", "hobbies": ["reading", "hiking"], "score": 85 }
]

6. 使用 $lookup 进行联表操作

假设有另一个名为 orders 的集合,包含用户的订单信息:

json 复制代码
[
  { "_id": 1, "userId": 1, "total": 100 },
  { "_id": 2, "userId": 2, "total": 200 },
  { "_id": 3, "userId": 1, "total": 150 },
  { "_id": 4, "userId": 3, "total": 250 }
]

使用 $lookup 将用户和订单信息联接起来:

javascript 复制代码
db.collection.aggregate([
  { $lookup: {
      from: "orders",
      localField: "_id",
      foreignField: "userId",
      as: "orders"
  }}
])

结果:

json 复制代码
[
  { "_id": 1, "name": "Alice", "age": 25, "city": "New York", "hobbies": ["reading", "hiking"], "score": 85, "orders": [{ "_id": 1, "userId": 1, "total": 100 }, { "_id": 3, "userId": 1, "total": 150 }] },
  { "_id": 2, "name": "Bob", "age": 30, "city": "San Francisco", "hobbies": ["cooking", "gaming"], "score": 90, "orders": [{ "_id": 2, "userId": 2, "total": 200 }] },
  { "_id": 3, "name": "Charlie", "age": 35, "city": "New York", "hobbies": ["hiking"]}
相关推荐
JustHappy6 小时前
古法编程秘籍(二):什么是代码模块化?别背概念,把房间收拾明白就够了
前端·后端
小江的记录本6 小时前
【JVM虚拟机】堆内存分代模型:年轻代(Eden+Survivor)、老年代、元空间Metaspace(附《思维导图》+《面试高频考点清单》)
java·前端·jvm·后端·python·spring·面试
IT_陈寒9 小时前
Python闭包里藏的这个坑,差点让我加班到凌晨
前端·人工智能·后端
IT_陈寒9 小时前
Java注解空指针?这个坑我踩得莫名其妙
前端·人工智能·后端
土狗TuGou10 小时前
SQL内功笔记 · 第8篇:事务的四大特性与隔离级别
数据库·笔记·后端·sql·mysql·oracle
ZengLiangYi10 小时前
React Query + REST API 最佳实践
javascript·后端·react.js
星浩AI10 小时前
项目实战:合同智能审批 · LangGraph + HITL 人机协同方案 [有源码]
后端·langchain·agent
JavaGuide10 小时前
Codex 接入第三方模型 DeepSeek、GLM、Kimi 教程:CC-Switch 和 Codex++ 两种方案对比
后端·ai编程
ZengLiangYi10 小时前
Fastify 加 Electron:把 Web 服务嵌进桌面应用
前端·javascript·后端
李白你好11 小时前
页面资产梳理 · 技术指纹识别 · Spring 端点探测
java·后端·spring