MongoDB聚合:$graphLookup

$graphLookup聚合阶段在一个集合中执行递归搜索,可以使用选项来控制递归搜索的深度和条件。

$graphLookup搜索过程总结如下:

  1. 输入文档进入$graphLookup聚合阶段。
  2. $graphLookup的搜索目标是from参数指定的集合(搜索参数的完整列表见下文)。
  3. 对于每个输入文档,搜索都从startWith指定的值开始。
  4. graphLookup使用startWith的值匹配由from指定的集合和connectToField指定的字段的值。
  5. 对于每个匹配文档,$graphLookupconnectFromField的值来检查每个from参数指定的集合下的connectToField参数指定的字段的值,然后将匹配上的from集合的文档放到由as参数指定的数组中。
    然后该步骤继续递归直到没有匹配的文档或操作达到由maxDepth参数指定的递归深度。然后$graphLookup把数组字段添加到输入文档。在完成所有的文档搜索后返回结果。

语法

js 复制代码
{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}

参数字段解释:

字段 描述
from $graphLookup操作搜索的目标集合,递归匹配connectFromFieldconnnectToField字段的值,from指定的集合必须与当前集合在同一个数据库,并且不可以是同一个集合
startWith 可选,表达式,connectFromField字段进行递归搜索的起始值。startWith的值也可以是数组,其每个值都会被遍历处理
connectFromField 指定一个字段名,其值用于递归搜索匹配。与集合中其他文档connectToField相对应,如果其值是数组,则会在遍历时单独处理每个元素
connectToField 其他文档中的字段名称,用于匹配connectFromField参数指定的字段值
as 添加到每个输出文档中的数组字段名称。包含在$graphLook阶段遍历的所有文档(注意,数组元素的顺序不保证)
maxDepth 可选,正整数,指定最大的递归深度
depthField 可选,要添加到搜索路径中每个遍历文档的字段名称。该字段的值是文档的递归深度,长整数。递归深度值从零开始,因此第一次查找对应的深度为零
restrictSearchWithMatch 可选,文档类型。为递归搜索指定额外的条件,其语法与查询过滤语法相同。可以在过滤条件中使用所有的聚合表达式,如:{ lastName: { $ne: "$lastName" } },该表达式无法在该上下文中查找lastName值与输入文档的lastName值不同的文档,因为"$lastName"将充当字符串文本,而不是字段路径

使用

分片集合

从MongoDB 5.1开始,可以在from参数中指定分片集合,但不能在事务中使用分片集合。

最大递归深度

maxDepth字段设置为0相当于一个非递归的$graphLookup搜索阶段

内存

$graphLookup阶段有100M内存的限制,如果想突破这个限制,可以为聚合指定allowDiskUse: true,该设置也会影响到$graphLookup中使用的其他聚合阶段。

视图和集合

如果执行涉及多个视图的聚合,如使用$lookup$graphLookup,视图必须有相同的集合。

举例

单个集合

employees集合有下面的文档:

js 复制代码
{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }

下面的$graphLookup递归匹配employees集合中reportsToname字段,返回每个人的报告层次结构:

js 复制代码
db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )

操作返回下面的结果:

json 复制代码
{
   "_id" : 1,
   "name" : "Dev",
   "reportingHierarchy" : [ ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "reportsTo" : "Dev",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 3,
   "name" : "Ron",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 4,
   "name" : "Andrew",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 5,
   "name" : "Asya",
   "reportsTo" : "Ron",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
   ]
}
{
   "_id" : 6,
   "name" : "Dan",
   "reportsTo" : "Andrew",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
   ]
}

下表显示了文件的遍历路径:

{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

起始值 文档reportsTo的值
{ ... "reportsTo" : "Ron" }
深度0 { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
深度1 { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
深度2 { "_id" : 1, "name" : "Dev" }

输出结果生成的层次结构Asya -> Ron -> Eliot -> Dev

多个集合

$lookup类似,$graphLookup可以跨同一数据库的集合

例如,在同一数据库中分别创建两个集合:

  • airports集合有下列文档:
js 复制代码
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
  • travelers集合有以下文档:
js 复制代码
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )

对于travelers集合中的每个文档,下面的聚合操作会查找airports集合中nearestAirport的值,并递归匹配connects字段和airport字段。该操作指定的最大递归深度为2

js 复制代码
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )

操作返回下面的结果:

json 复制代码
{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) } ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 4,
        "airport" : "LHR",
        "connects" : [ "PWM" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(0) }
   ]
}

下表显示了递归搜索遍历的路径,最大深度为2,开始的airportJFK

开始值 travelers集合中nearestAirport的值
{ ... "nearestAirport" : "JFK" }
深度0 { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
深度1 { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
深度2 { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }

查询条件

下面的示例使用了一个包含一组文档的集合,文档中包含人名及其朋友和爱好的数组。聚合操作会找到一个特定的人,并遍历她的社交网络,找到爱好为golf的人。

集合people包含了下列文档:

json 复制代码
{
  "_id" : 1,
  "name" : "Tanya Jordan",
  "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
  "hobbies" : [ "tennis", "unicycling", "golf" ]
}
{
  "_id" : 2,
  "name" : "Carole Hale",
  "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
  "hobbies" : [ "archery", "golf", "woodworking" ]
}
{
  "_id" : 3,
  "name" : "Terry Hawkins",
  "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
  "hobbies" : [ "knitting", "frisbee" ]
}
{
  "_id" : 4,
  "name" : "Joseph Dennis",
  "friends" : [ "Angelo Ward", "Carole Hale" ],
  "hobbies" : [ "tennis", "golf", "topiary" ]
}
{
  "_id" : 5,
  "name" : "Angelo Ward",
  "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
  "hobbies" : [ "travel", "ceramics", "golf" ]
}
{
   "_id" : 6,
   "name" : "Shirley Soto",
   "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
   "hobbies" : [ "frisbee", "set theory" ]
 }

下面的聚合操作使用了3个阶段:

  • $match匹配name字段包含字符串"Tanya Jordan"的文档,返回一个输出文档。

  • $graphLookup将输出文档的friends字段与集合中其他文档的name字段连接起来,以遍历Tanya Jordan的社交网络。此阶段使用restrictSearchWithMatch参数只查找爱好数组中包含golf的文档。返回一个输出文档。

  • $project 塑造输出文档。列出的connections who play golf的名字取自输入文档的golfers数组。

js 复制代码
db.people.aggregate( [
  { $match: { "name": "Tanya Jordan" } },
  { $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "golfers",
      restrictSearchWithMatch: { "hobbies" : "golf" }
    }
  },
  { $project: {
      "name": 1,
      "friends": 1,
      "connections who play golf": "$golfers.name"
    }
  }
] )

操作返回下面的文档:

json 复制代码
{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [
      "Shirley Soto",
      "Terry Hawkins",
      "Carole Hale"
   ],
   "connections who play golf" : [
      "Joseph Dennis",
      "Tanya Jordan",
      "Angelo Ward",
      "Carole Hale"
   ]
}
相关推荐
王会举1 小时前
让SQL飞起来:搭建企业AI应用的SQL性能优化实战
数据库·人工智能·ai·性能优化
bing_1581 小时前
在 Spring Boot 项目中,如何进行高效的数据库 Schema 设计?
数据库·spring boot·后端·数据库schema设计
听雪楼主.2 小时前
Oracle补丁安装工具opatch更新报错处理
数据库·oracle
不吃元西2 小时前
对于客户端数据存储方案——SQLite的思考
数据库·sqlite
rgb0f02 小时前
MySQL视图相关
数据库·mysql·oracle
编程、小哥哥2 小时前
oracle值sql记录
数据库·sql·oracle
三千花灯2 小时前
jmeter提取返回值到文件
数据库·jmeter
萧离1952 小时前
超细的Linux安装minio教学
数据库
小吕学编程2 小时前
基于Canal+Spring Boot+Kafka的MySQL数据变更实时监听实战指南
数据库·后端·mysql·spring·kafka
一个小白5552 小时前
Linux,redis群集模式,主从复制,读写分离
linux·运维·数据库·centos