MongoDB - Aggregation Stages $match, $sort, and $limit

Table of contents

- 1. $match aggregation stage
  - 1. Building the test data
  - 2. $match example: equality match
  - 3. $match example: counting matched documents
- 2. $sort aggregation stage
  - 1. Sort consistency
  - 2. $sort example
- 3. $limit aggregation stage

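1. $match aggregation stage

$match takes a document that specifies the query conditions and passes only the matching documents to the next pipeline stage.

Syntax of the $match stage: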

{ $match: { <query> } }

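The $match query syntax is identical to the read operation query syntax, so $match does not accept raw aggregation expressions. To include an aggregation expression in $match, wrap it in the $expr query operator: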

{ $match: { $expr: { <aggregation expression> } } }
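
As a minimal sketch of $expr, the pipeline below keeps the documents whose views value is more than ten times their score, that is, it compares two fields of the same document. The threshold, the sample documents, and the names pipeline, sampleDocs, and matchesExpr are illustrative assumptions. The in-memory filter only mirrors what the stage would select so that the snippet runs without a database; in mongosh you would pass pipeline to db.articles.aggregate().

```javascript
// Pipeline you could pass to db.articles.aggregate(pipeline) in mongosh.
// $expr lets $match evaluate an aggregation expression for each document.
const pipeline = [
  { $match: { $expr: { $gt: ["$views", { $multiply: ["$score", 10] }] } } }
];

// Sample documents (hypothetical subset of the article data).
const sampleDocs = [
  { author: "dave", score: 80, views: 100 },
  { author: "ahn", score: 60, views: 1000 },
  { author: "li", score: 55, views: 5000 }
];

// In-memory equivalent of the predicate above, for illustration only.
const matchesExpr = (doc) => doc.views > doc.score * 10;

const matched = sampleDocs.filter(matchesExpr).map((doc) => doc.author);
console.log(matched); // [ 'ahn', 'li' ]
```

The dave document is dropped because 100 is not greater than 80 * 10.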

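Place $match as early in the aggregation pipeline as possible. Because $match reduces the number of documents that flow through the pipeline, an early $match minimizes the work done by later stages. If $match is the first stage, it can also take advantage of indexes, like a regular find() query.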
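
To make the ordering advice concrete, here is a minimal sketch that filters before sorting and limiting. The score threshold, the numeric _id values, and the variable names are illustrative assumptions. The array operations only imitate the three stages in memory so that the snippet runs without a database; in mongosh the same pipeline would be passed to db.articles.aggregate().

```javascript
// Pipeline with the filter first: later stages only see matching documents.
const pipeline = [
  { $match: { score: { $gte: 80 } } },
  { $sort: { views: -1, _id: 1 } },
  { $limit: 2 }
];

// Simplified article documents (the numeric _id values are an assumption).
const docs = [
  { _id: 1, author: "dave", score: 80, views: 100 },
  { _id: 2, author: "dave", score: 85, views: 521 },
  { _id: 3, author: "ahn", score: 60, views: 1000 },
  { _id: 4, author: "li", score: 55, views: 5000 },
  { _id: 5, author: "annT", score: 55, views: 50 },
  { _id: 6, author: "li", score: 94, views: 999 },
  { _id: 7, author: "ty", score: 95, views: 1000 }
];

// In-memory imitation of the three stages, for illustration only.
const afterMatch = docs.filter((d) => d.score >= 80);
const afterSort = [...afterMatch].sort((a, b) => b.views - a.views || a._id - b._id);
const afterLimit = afterSort.slice(0, 2);

console.log(afterMatch.length);               // 4, so only 4 of 7 documents are sorted
console.log(afterLimit.map((d) => d.author)); // [ 'ty', 'li' ]
```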

1. Building the test data

db.articles.drop()

db.articles.insertMany([
    {
        "_id": ObjectId("512bc95fe835e68f199c8686"),
        "author": "dave",
        "score": 80,
        "views": 100
    },
    {
        "_id": ObjectId("512bc962e835e68f199c8687"),
        "author": "dave",
        "score": 85,
        "views": 521
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b257"),
        "author": "ahn",
        "score": 60,
        "views": 1000
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b258"),
        "author": "li",
        "score": 55,
        "views": 5000
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b259"),
        "author": "annT",
        "score": 60,
        "views": 50
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25a"),
        "author": "li",
        "score": 94,
        "views": 999
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25b"),
        "author": "ty",
        "score": 95,
        "views": 1000
    }
])

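2. $match example: equality match

The following operation uses $match to perform a simple equality match. The $match stage selects the documents whose author field equals dave, and the aggregation returns the following: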

db.articles.aggregate(
    [ { $match : { author : "dave" } } ]
);

// 1
{
    "_id": ObjectId("512bc95fe835e68f199c8686"),
    "author": "dave",
    "score": 80,
    "views": 100
}

// 2
{
    "_id": ObjectId("512bc962e835e68f199c8687"),
    "author": "dave",
    "score": 85,
    "views": 521
}

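Implementation with Spring Boot and Spring Data MongoDB:

import java.util.List;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.aggregation.MatchOperation;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.test.context.junit4.SpringRunner;

@SpringBootTest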
@RunWith(SpringRunner.class)
public class BeanLoadServiceTest {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Test
    public void aggregateTest() {
        // $match stage: author equals "dave"
        Criteria criteria = Criteria.where("author").is("dave");
        MatchOperation match = Aggregation.match(criteria);

        Aggregation aggregation = Aggregation.newAggregation(match);

        // Run the pipeline against Article's collection and map the output to Article
        AggregationResults<Article> results
                = mongoTemplate.aggregate(aggregation, Article.class, Article.class);
        List<Article> mappedResults = results.getMappedResults();

        // Print the results
        mappedResults.forEach(System.out::println);
        //Article(id=512bc95fe835e68f199c8686, author=dave, score=80, views=100)
        //Article(id=512bc962e835e68f199c8687, author=dave, score=85, views=521)
    }
}

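Here the output documents are mapped directly onto Article.class. You can also define a separate entity class to receive the fields of the output documents:

import lombok.Data;
import org.springframework.data.annotation.Id;

@Data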
public class AggregationResult {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

@SpringBootTest
@RunWith(SpringRunner.class)
public class BeanLoadServiceTest {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Test
    public void aggregateTest() {
        // $match stage: author equals "dave"
        Criteria criteria = Criteria.where("author").is("dave");
        MatchOperation match = Aggregation.match(criteria);

        Aggregation aggregation = Aggregation.newAggregation(match);

        // Run the pipeline against Article's collection and map the output to AggregationResult
        AggregationResults<AggregationResult> results
                = mongoTemplate.aggregate(aggregation, Article.class, AggregationResult.class);
        List<AggregationResult> mappedResults = results.getMappedResults();

        // Print the results
        mappedResults.forEach(System.out::println);
        //AggregationResult(id=512bc95fe835e68f199c8686, author=dave, score=80, views=100)
        //AggregationResult(id=512bc962e835e68f199c8687, author=dave, score=85, views=521)
    }
}

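3. $match example: counting matched documents

The following pipeline uses the $match stage to select the documents to process, then pipes the result into the $group stage to count them: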

db.articles.aggregate( [
  // Stage 1
  { $match: { $or: [ { score: { $gt: 70, $lt: 90 } }, { views: { $gte: 1000 } } ] } },
  // Stage 2
  { $group: { _id: null, count: { $sum: 1 } } }
] );

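Stage 1:

The $match stage selects the documents whose score is greater than 70 and less than 90, or whose views is greater than or equal to 1000.

Stage 2:

The documents selected by the $match stage are piped into the $group stage, which counts them. With the test data above, five documents match: the two dave articles (by score), plus the ahn article, the li article with 5000 views, and the ty article (by views).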

// 1
{
    "_id": null,
    "count": 5
}

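Implementation with Spring Boot and Spring Data MongoDB:

import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

@Data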
@Document(collection = "articles")
public class Article {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

@Data
public class AggregationResult {
    private String id;
    private Integer count;
}

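import java.util.List;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.aggregation.GroupOperation;
import org.springframework.data.mongodb.core.aggregation.MatchOperation;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.test.context.junit4.SpringRunner;

@SpringBootTest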
@RunWith(SpringRunner.class)
public class BeanLoadServiceTest {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Test
    public void aggregateTest() {
        // Stage 1: (70 < score < 90) OR (views >= 1000)
        Criteria criteria = new Criteria().orOperator(
                Criteria.where("score").gt(70).lt(90),
                Criteria.where("views").gte(1000));
        MatchOperation match = Aggregation.match(criteria);

        // Stage 2: put all documents in one group (_id: null) and count them
        GroupOperation group = Aggregation.group().count().as("count");

        // Combine the two stages into one pipeline
        Aggregation aggregation = Aggregation.newAggregation(match, group);

        // Run the aggregation pipeline
        AggregationResults<AggregationResult> results
                = mongoTemplate.aggregate(aggregation, Article.class, AggregationResult.class);
        List<AggregationResult> mappedResults = results.getMappedResults();

        // Print the results
        mappedResults.forEach(System.out::println);
        // AggregationResult(id=null, count=5)
    }
}

2. $sort aggregation stage

$sort sorts all input documents and passes them to the next pipeline stage in sorted order.

{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }

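$sort takes a document that specifies the fields to sort by and the sort order of each field. A sort order of 1 sorts ascending; a sort order of -1 sorts descending.

When sorting on multiple fields, the sort keys are evaluated from left to right. In the syntax above, for example, documents are first sorted by field1; documents with the same field1 value are then further sorted by field2.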

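1. Sort consistency

MongoDB does not store documents in a collection in any particular order. When you sort on a field that contains duplicate values, documents that share a value may be returned in any order. If you need a consistent sort order, include at least one field with unique values in the sort. Consider the following restaurants collection: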

db.restaurants.drop()

db.restaurants.insertMany( [
   { "_id" : 1, "name" : "Central Park Cafe", "borough" : "Manhattan"},
   { "_id" : 2, "name" : "Rock A Feller Bar and Grill", "borough" : "Queens"},
   { "_id" : 3, "name" : "Empire State Pub", "borough" : "Brooklyn"},
   { "_id" : 4, "name" : "Stan's Pizzaria", "borough" : "Manhattan"},
   { "_id" : 5, "name" : "Jane's Deli", "borough" : "Brooklyn"},
] )

The following command uses the $sort stage to sort on the borough field:

db.restaurants.aggregate(
   [
     { $sort : { borough : 1 } }
   ]
)

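In this example the sort order may be inconsistent, because the borough field contains duplicate values for both Manhattan and Brooklyn. Documents are returned in alphabetical order by borough, but the order of documents that share the same borough value may differ across multiple executions of the same sort.

To achieve a consistent sort, add a field that contains only unique values to the sort. The following command uses the $sort stage to sort on both the borough field and the _id field: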

db.restaurants.aggregate(
   [
     { $sort : { borough : 1, _id: 1 } }
   ]
)

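Because the _id field is always guaranteed to contain unique values, the returned sort order is always the same across multiple executions of the same sort.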
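
The effect of the unique tie-breaker can be sketched in plain JavaScript. The comparator below is an in-memory imitation of { $sort: { borough: 1, _id: 1 } } on the restaurants documents above, not how MongoDB sorts internally; the names restaurants and byBoroughThenId are illustrative assumptions.

```javascript
// Same documents as the restaurants collection above.
const restaurants = [
  { _id: 1, name: "Central Park Cafe", borough: "Manhattan" },
  { _id: 2, name: "Rock A Feller Bar and Grill", borough: "Queens" },
  { _id: 3, name: "Empire State Pub", borough: "Brooklyn" },
  { _id: 4, name: "Stan's Pizzaria", borough: "Manhattan" },
  { _id: 5, name: "Jane's Deli", borough: "Brooklyn" }
];

// Compare by borough first, then by the unique _id as a tie-breaker.
function byBoroughThenId(a, b) {
  if (a.borough !== b.borough) return a.borough < b.borough ? -1 : 1;
  return a._id - b._id;
}

// Whatever order the documents arrive in, the sorted order is the same.
const forward = [...restaurants].sort(byBoroughThenId).map((d) => d._id);
const reversed = [...restaurants].reverse().sort(byBoroughThenId).map((d) => d._id);

console.log(forward);  // [ 3, 5, 1, 4, 2 ]
console.log(reversed); // [ 3, 5, 1, 4, 2 ]
```

Because no two documents compare as equal, the result does not depend on the order in which the documents are read.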

2. $sort example

db.articles.drop()

db.articles.insertMany([
    {
        "_id": ObjectId("512bc95fe835e68f199c8686"),
        "author": "dave",
        "score": 80,
        "views": 100
    },
    {
        "_id": ObjectId("512bc962e835e68f199c8687"),
        "author": "dave",
        "score": 85,
        "views": 521
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b257"),
        "author": "ahn",
        "score": 60,
        "views": 1000
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b258"),
        "author": "li",
        "score": 55,
        "views": 5000
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b259"),
        "author": "annT",
        "score": 55,
        "views": 50
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25a"),
        "author": "li",
        "score": 94,
        "views": 999
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25b"),
        "author": "ty",
        "score": 95,
        "views": 1000
    }
])

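For each field to sort by, set the sort order to 1 or -1 to specify an ascending or descending sort, respectively. The following pipeline sorts by score in descending order, then by views in ascending order: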

db.articles.aggregate(
   [
     { $sort : { score : -1, views: 1 } }
   ]
)

// 1
{
    "_id": ObjectId("55f5a1d3d4bede9ac365b25b"),
    "author": "ty",
    "score": 95,
    "views": 1000
}

// 2
{
    "_id": ObjectId("55f5a1d3d4bede9ac365b25a"),
    "author": "li",
    "score": 94,
    "views": 999
}

// 3
{
    "_id": ObjectId("512bc962e835e68f199c8687"),
    "author": "dave",
    "score": 85,
    "views": 521
}

// 4
{
    "_id": ObjectId("512bc95fe835e68f199c8686"),
    "author": "dave",
    "score": 80,
    "views": 100
}

// 5
{
    "_id": ObjectId("55f5a192d4bede9ac365b257"),
    "author": "ahn",
    "score": 60,
    "views": 1000
}

// 6
{
    "_id": ObjectId("55f5a1d3d4bede9ac365b259"),
    "author": "annT",
    "score": 55,
    "views": 50
}

// 7
{
    "_id": ObjectId("55f5a192d4bede9ac365b258"),
    "author": "li",
    "score": 55,
    "views": 5000
}

Implementation with Spring Boot and Spring Data MongoDB:

@Data
@Document(collection = "articles")
public class Article {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

@Data
public class AggregationResult {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

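import java.util.List;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.aggregation.SortOperation;
import org.springframework.test.context.junit4.SpringRunner;

@SpringBootTest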
@RunWith(SpringRunner.class)
public class BeanLoadServiceTest {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Test
    public void aggregateTest() {
        // $sort stage: score descending, then views ascending
        SortOperation sortOperation = Aggregation
                .sort(Sort.by(Sort.Direction.DESC, "score"))
                .and(Sort.by(Sort.Direction.ASC, "views"));

        Aggregation aggregation = Aggregation.newAggregation(sortOperation);

        // Run the aggregation pipeline
        AggregationResults<AggregationResult> results
                = mongoTemplate.aggregate(aggregation, Article.class, AggregationResult.class);
        List<AggregationResult> mappedResults = results.getMappedResults();

        // Print the results
        mappedResults.forEach(System.out::println);
        //AggregationResult(id=55f5a1d3d4bede9ac365b25b, author=ty, score=95, views=1000)
        //AggregationResult(id=55f5a1d3d4bede9ac365b25a, author=li, score=94, views=999)
        //AggregationResult(id=512bc962e835e68f199c8687, author=dave, score=85, views=521)
        //AggregationResult(id=512bc95fe835e68f199c8686, author=dave, score=80, views=100)
        //AggregationResult(id=55f5a192d4bede9ac365b257, author=ahn, score=60, views=1000)
        //AggregationResult(id=55f5a1d3d4bede9ac365b259, author=annT, score=55, views=50)
        //AggregationResult(id=55f5a192d4bede9ac365b258, author=li, score=55, views=5000)
    }
}

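3. $limit aggregation stage

$limit limits the number of documents passed to the next stage in the pipeline. It takes a positive integer that specifies the maximum number of documents to pass along: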

{ $limit: <positive 64-bit integer> }

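If you use the $limit stage together with any of the following:

- the $sort aggregation stage
- the sort() method
- the sort field of a command such as findAndModify

make sure the sort includes at least one field that contains unique values before the results are passed to the $limit stage. Otherwise, documents with equal sort values may be ordered differently between runs, so $limit may return a different set of documents each time.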
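
As a minimal sketch of this advice, the pipeline below sorts by score and breaks ties on _id before applying $limit. The sample documents, the limit of 1, and the helper name sortThenLimit are illustrative assumptions. The function imitates $sort followed by $limit in memory so that the snippet runs without a database; in mongosh you would pass pipeline to db.articles.aggregate().

```javascript
// Pipeline: sort by score ascending, break ties on the unique _id, keep one document.
const pipeline = [
  { $sort: { score: 1, _id: 1 } },
  { $limit: 1 }
];

// Two sample articles share the lowest score (55), so score alone cannot order them.
const docs = [
  { _id: "55f5a192d4bede9ac365b257", author: "ahn", score: 60 },
  { _id: "55f5a192d4bede9ac365b258", author: "li", score: 55 },
  { _id: "55f5a1d3d4bede9ac365b259", author: "annT", score: 55 }
];

// In-memory imitation of $sort followed by $limit, for illustration only.
function sortThenLimit(input, n) {
  const byScoreThenId = (a, b) =>
    a.score - b.score || (a._id < b._id ? -1 : a._id > b._id ? 1 : 0);
  return [...input].sort(byScoreThenId).slice(0, n);
}

// The same document wins regardless of the input order.
const first = sortThenLimit(docs, 1);
const second = sortThenLimit([...docs].reverse(), 1);
console.log(first[0].author, second[0].author); // li li
```

Without the _id key, either of the two documents with score 55 could end up in the limited result.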

// Rebuild the test data, then return at most five documents
db.articles.drop()

db.articles.insertMany([
    {
        "_id": ObjectId("512bc95fe835e68f199c8686"),
        "author": "dave",
        "score": 80,
        "views": 100
    },
    {
        "_id": ObjectId("512bc962e835e68f199c8687"),
        "author": "dave",
        "score": 85,
        "views": 521
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b257"),
        "author": "ahn",
        "score": 60,
        "views": 1000
    },
    {
        "_id": ObjectId("55f5a192d4bede9ac365b258"),
        "author": "li",
        "score": 55,
        "views": 5000
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b259"),
        "author": "annT",
        "score": 55,
        "views": 50
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25a"),
        "author": "li",
        "score": 94,
        "views": 999
    },
    {
        "_id": ObjectId("55f5a1d3d4bede9ac365b25b"),
        "author": "ty",
        "score": 95,
        "views": 1000
    }
])

db.articles.aggregate([
   { $limit : 5 }
]);

// 1
{
    "_id": ObjectId("512bc95fe835e68f199c8686"),
    "author": "dave",
    "score": 80,
    "views": 100
}

// 2
{
    "_id": ObjectId("512bc962e835e68f199c8687"),
    "author": "dave",
    "score": 85,
    "views": 521
}

// 3
{
    "_id": ObjectId("55f5a192d4bede9ac365b257"),
    "author": "ahn",
    "score": 60,
    "views": 1000
}

// 4
{
    "_id": ObjectId("55f5a192d4bede9ac365b258"),
    "author": "li",
    "score": 55,
    "views": 5000
}

// 5
{
    "_id": ObjectId("55f5a1d3d4bede9ac365b259"),
    "author": "annT",
    "score": 55,
    "views": 50
}

Implementation with Spring Boot and Spring Data MongoDB:

@Data
@Document(collection = "articles")
public class Article {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

@Data
public class AggregationResult {
    @Id
    private String id;
    private String author;
    private int score;
    private int views;
}

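import java.util.List;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.aggregation.LimitOperation;
import org.springframework.test.context.junit4.SpringRunner;

@SpringBootTest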
@RunWith(SpringRunner.class)
public class BeanLoadServiceTest {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Test
    public void aggregateTest() {
        // $limit stage: pass at most five documents
        LimitOperation limitOperation = Aggregation.limit(5);

        Aggregation aggregation = Aggregation.newAggregation(limitOperation);

        // Run the aggregation pipeline
        AggregationResults<AggregationResult> results
                = mongoTemplate.aggregate(aggregation, Article.class, AggregationResult.class);
        List<AggregationResult> mappedResults = results.getMappedResults();

        // Print the results
        mappedResults.forEach(System.out::println);
        //AggregationResult(id=512bc95fe835e68f199c8686, author=dave, score=80, views=100)
        //AggregationResult(id=512bc962e835e68f199c8687, author=dave, score=85, views=521)
        //AggregationResult(id=55f5a192d4bede9ac365b257, author=ahn, score=60, views=1000)
        //AggregationResult(id=55f5a192d4bede9ac365b258, author=li, score=55, views=5000)
        //AggregationResult(id=55f5a1d3d4bede9ac365b259, author=annT, score=55, views=50)
    }
}