MongoDB Aggregation: $graphLookup

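The $graphLookup aggregation stage performs a recursive search on a collection, with options for restricting the search by recursion depth and by query filter.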

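The $graphLookup search process is summarized below:

  1. Input documents flow into the $graphLookup stage of the aggregation.
  2. $graphLookup targets the search at the collection designated by the from parameter (the full list of parameters is given below).
  3. For each input document, the search begins with the value designated by startWith.
  4. $graphLookup matches the startWith value against the field designated by connectToField in the documents of the from collection.
  5. For each matching document, $graphLookup takes the value of connectFromField and checks every document in the from collection for a matching connectToField value. Each match is added to the array field named by the as parameter.
    This step continues recursively until no more matching documents are found or the operation reaches the recursion depth specified by maxDepth. $graphLookup then appends the array field to the input document, and returns the results after completing its search on all input documents.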
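
The steps above can be modelled in a few lines of plain JavaScript. This is only an in-memory sketch for building intuition, not how MongoDB implements the stage: graphLookupSketch and the three staff documents are made up for illustration, and startWith is simplified to a plain field name instead of an aggregation expression.

```javascript
// Illustrative in-memory model of the traversal; NOT MongoDB's implementation.
// graphLookupSketch is a hypothetical helper. Unlike the real stage, startWith
// here is a plain field name rather than an expression such as "$reportsTo".
function graphLookupSketch(inputDocs, fromDocs, options) {
  const { startWith, connectFromField, connectToField, as, maxDepth = Infinity } = options;
  // Treat a missing value as "no values" and a scalar as a one-element list.
  const toArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);

  return inputDocs.map((input) => {
    const matched = new Map();                 // _id -> document already collected
    let frontier = toArray(input[startWith]);  // step 3: values to search for
    for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
      const next = [];
      for (const doc of fromDocs) {            // step 2: search the "from" collection
        if (matched.has(doc._id)) continue;    // each document is collected once
        // step 4: compare connectToField with the current search values
        if (!toArray(doc[connectToField]).some((v) => frontier.includes(v))) continue;
        matched.set(doc._id, doc);
        // step 5: this document's connectFromField values drive the next round
        next.push(...toArray(doc[connectFromField]));
      }
      frontier = next;
    }
    return { ...input, [as]: [...matched.values()] }; // append the array field
  });
}

// Made-up sample data: C reports to B, B reports to A.
const staff = [
  { _id: 1, name: "A" },
  { _id: 2, name: "B", reportsTo: "A" },
  { _id: 3, name: "C", reportsTo: "B" },
];
const spec = { startWith: "reportsTo", connectFromField: "reportsTo", connectToField: "name", as: "chain" };

const unlimited = graphLookupSketch(staff, staff, spec);
const shallow = graphLookupSketch(staff, staff, { ...spec, maxDepth: 0 });
const chainOfC = unlimited[2].chain.map((d) => d.name).sort();
const shallowChainOfC = shallow[2].chain.map((d) => d.name);
```

With no depth limit, the sketch collects both A and B for document C; with maxDepth set to 0 it stops after the first lookup and collects only B.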

Syntax

{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}

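Parameters:

Field: Description
from: The target collection for the $graphLookup operation to search, recursively matching connectFromField to connectToField. The from collection must be in the same database as the other collections used in the operation. It can be the same collection that the aggregation runs on.
startWith: Required. An expression that specifies the value of connectFromField with which to start the recursive search. startWith may also evaluate to an array, in which case each value is followed individually through the traversal.
connectFromField: The name of the field whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is followed individually through the traversal.
connectToField: The name of the field in other documents against which to match the value of the field specified by connectFromField.
as: The name of the array field added to each output document. It contains the documents traversed in the $graphLookup stage. The order of the array elements is not guaranteed.
maxDepth: Optional. A non-negative integer specifying the maximum recursion depth.
depthField: Optional. The name of the field to add to each traversed document in the search path. Its value is the recursion depth for the document, represented as a NumberLong. Depth starts at zero, so the first lookup corresponds to depth zero.
restrictSearchWithMatch: Optional. A document specifying additional conditions for the recursive search, using the same syntax as a query filter. Aggregation expressions cannot be used in this filter. For example, { lastName: { $ne: "$lastName" } } will not find documents whose lastName differs from the lastName of the input document, because "$lastName" acts as a string literal in this context, not as a field path.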
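
To make the required and optional parameters concrete, here is a small sketch that assembles a $graphLookup stage document and rejects obviously invalid input before it is sent to the server. buildGraphLookupStage is a hypothetical helper written for this article, not part of MongoDB or its drivers, and the checks cover only the rules listed in the table above.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $graphLookup stage document
// and enforces only the rules described in the parameter table.
function buildGraphLookupStage(options) {
  const required = ["from", "startWith", "connectFromField", "connectToField", "as"];
  for (const key of required) {
    if (options[key] === undefined) {
      throw new Error(`$graphLookup requires '${key}'`);
    }
  }
  // maxDepth is optional, but when present it must be a non-negative integer.
  if (options.maxDepth !== undefined &&
      !(Number.isInteger(options.maxDepth) && options.maxDepth >= 0)) {
    throw new Error("maxDepth must be a non-negative integer");
  }
  return { $graphLookup: { ...options } };
}

// A stage using every parameter; the collection and field names are examples only.
const stage = buildGraphLookupStage({
  from: "airports",
  startWith: "$nearestAirport",
  connectFromField: "connects",
  connectToField: "airport",
  as: "destinations",
  maxDepth: 2,
  depthField: "numConnections",
  restrictSearchWithMatch: { airport: { $ne: "LHR" } },
});
```

The returned object can be placed directly in a pipeline array passed to aggregate().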

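Usage Notes

Sharded Collections

Starting in MongoDB 5.1, the collection specified in from can be sharded. However, $graphLookup cannot be used within a transaction while targeting a sharded collection.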

Max Depth

Setting maxDepth to 0 is equivalent to a non-recursive $graphLookup search stage.

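Memory

The $graphLookup stage must stay within a 100 MB memory limit. Specifying allowDiskUse: true for the aggregation does not lift this limit: $graphLookup ignores the option, although it still takes effect for the other stages in the same pipeline.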

Views and Collation

If an aggregation involves multiple views, for example through $lookup or $graphLookup, the views must have the same collation.

Examples

Within a Single Collection

A collection named employees has the following documents:

{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }

The following $graphLookup operation recursively matches the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:

db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )

The operation returns the following results:

{
   "_id" : 1,
   "name" : "Dev",
   "reportingHierarchy" : [ ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "reportsTo" : "Dev",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 3,
   "name" : "Ron",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 4,
   "name" : "Andrew",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 5,
   "name" : "Asya",
   "reportsTo" : "Ron",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
   ]
}
{
   "_id" : 6,
   "name" : "Dan",
   "reportsTo" : "Andrew",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
   ]
}

The following table shows the traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

Start value: the reportsTo value of the document, { ... "reportsTo" : "Ron" }
Depth 0: { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1: { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2: { "_id" : 1, "name" : "Dev" }

The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.
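
Because the order of the elements in the as array is not guaranteed, printing such a chain reliably needs an explicit sort. The sketch below assumes the stage above had also been given depthField: "level" (the field name is made up here) and uses plain numbers where the server would return NumberLong values; chainOfCommand is a hypothetical helper.

```javascript
// Hypothetical output for Asya if depthField: "level" were added to the stage.
// The array is deliberately shuffled, since element order is not guaranteed.
const asya = {
  _id: 5,
  name: "Asya",
  reportsTo: "Ron",
  reportingHierarchy: [
    { _id: 1, name: "Dev", level: 2 },
    { _id: 3, name: "Ron", reportsTo: "Eliot", level: 0 },
    { _id: 2, name: "Eliot", reportsTo: "Dev", level: 1 },
  ],
};

// Sort a copy of the array by depth, then join the names into a readable chain.
function chainOfCommand(doc, arrayField, depthField) {
  const ordered = [...doc[arrayField]].sort((a, b) => a[depthField] - b[depthField]);
  return [doc.name, ...ordered.map((d) => d.name)].join(" -> ");
}

const chain = chainOfCommand(asya, "reportingHierarchy", "level");
console.log(chain); // prints "Asya -> Ron -> Eliot -> Dev"
```

Sorting by depth yields a single chain only when each depth holds exactly one document, as in this strictly linear reporting line.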

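Across Multiple Collections

Like $lookup, $graphLookup can access another collection in the same database.

For example, create two collections in the same database:

  • The airports collection has the following documents: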
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
  • The travelers collection has the following documents:
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )

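For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.
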
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )

The operation returns the following results:

{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) } ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 4,
        "airport" : "LHR",
        "connects" : [ "PWM" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(0) }
   ]
}

The following table shows the traversal path of the recursive search, up to depth 2, where the starting airport is JFK:

Start value: the nearestAirport value from the travelers collection, { ... "nearestAirport" : "JFK" }
Depth 0: { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1: { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2: { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
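
The depth values in the table can be reproduced with a level-by-level search over the same airport data. The following is a simplified in-memory sketch, not MongoDB's implementation: traverse is a hypothetical helper, and depths are plain numbers rather than NumberLong values.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] },
];

// Hypothetical helper: level-by-level search that records the depth at which
// each document is first reached, mimicking depthField and maxDepth.
function traverse(fromDocs, startValues, opts) {
  const { connectFromField, connectToField, depthField, maxDepth = Infinity } = opts;
  const found = new Map();              // _id -> document annotated with its depth
  let frontier = new Set(startValues);
  for (let depth = 0; depth <= maxDepth && frontier.size > 0; depth++) {
    const next = new Set();
    for (const doc of fromDocs) {
      if (found.has(doc._id) || !frontier.has(doc[connectToField])) continue;
      found.set(doc._id, { ...doc, [depthField]: depth });
      // Array values are followed element by element.
      for (const v of [].concat(doc[connectFromField] ?? [])) next.add(v);
    }
    frontier = next;
  }
  return [...found.values()];
}

const opts = { connectFromField: "connects", connectToField: "airport", depthField: "numConnections" };
// Map airport code -> depth, for easy inspection.
const depths = (docs) => Object.fromEntries(docs.map((d) => [d.airport, d.numConnections]));

const fromJFK = depths(traverse(airports, ["JFK"], { ...opts, maxDepth: 2 }));
const fromBOS = depths(traverse(airports, ["BOS"], { ...opts, maxDepth: 2 }));
const direct = depths(traverse(airports, ["JFK"], { ...opts, maxDepth: 0 }));
```

Starting from JFK with maxDepth 2, the sketch reaches JFK at depth 0, BOS and ORD at depth 1, and PWM at depth 2, while LHR stays out of reach. With maxDepth 0 only the JFK document itself is returned, which is the non-recursive case.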

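With a Query Filter

The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. The aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.

The people collection contains the following documents:
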
{
  "_id" : 1,
  "name" : "Tanya Jordan",
  "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
  "hobbies" : [ "tennis", "unicycling", "golf" ]
}
{
  "_id" : 2,
  "name" : "Carole Hale",
  "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
  "hobbies" : [ "archery", "golf", "woodworking" ]
}
{
  "_id" : 3,
  "name" : "Terry Hawkins",
  "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
  "hobbies" : [ "knitting", "frisbee" ]
}
{
  "_id" : 4,
  "name" : "Joseph Dennis",
  "friends" : [ "Angelo Ward", "Carole Hale" ],
  "hobbies" : [ "tennis", "golf", "topiary" ]
}
{
  "_id" : 5,
  "name" : "Angelo Ward",
  "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
  "hobbies" : [ "travel", "ceramics", "golf" ]
}
{
   "_id" : 6,
   "name" : "Shirley Soto",
   "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
   "hobbies" : [ "frisbee", "set theory" ]
 }

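The following aggregation operation uses three stages:

  • $match matches documents whose name field is the string "Tanya Jordan". It returns one output document.

  • $graphLookup connects the friends field of the output document with the name field of the other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents whose hobbies array contains golf. It returns one output document.

  • $project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents in the golfers array of the input document.
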
db.people.aggregate( [
  { $match: { "name": "Tanya Jordan" } },
  { $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "golfers",
      restrictSearchWithMatch: { "hobbies" : "golf" }
    }
  },
  { $project: {
      "name": 1,
      "friends": 1,
      "connections who play golf": "$golfers.name"
    }
  }
] )

The operation returns the following document:

{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [
      "Shirley Soto",
      "Terry Hawkins",
      "Carole Hale"
   ],
   "connections who play golf" : [
      "Joseph Dennis",
      "Tanya Jordan",
      "Angelo Ward",
      "Carole Hale"
   ]
}
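
The effect of restrictSearchWithMatch can be checked by hand with a small in-memory model in which the filter is applied while traversing, so a document that fails the filter is neither collected nor followed. This is a sketch under simplifying assumptions: restrictedTraverse and matches are hypothetical helpers, and the matcher supports only simple { field: value } conditions.

```javascript
const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

// Simplified matcher: only { field: value } conditions, where an array field
// matches if it contains the value. Real query filters support far more.
const matches = (doc, filter) =>
  Object.entries(filter).every(([field, value]) => [].concat(doc[field] ?? []).includes(value));

// Hypothetical helper: documents failing the filter are skipped entirely,
// so the search never continues through them.
function restrictedTraverse(fromDocs, startValues, connectFromField, connectToField, filter) {
  const found = new Map();
  let frontier = new Set(startValues);
  while (frontier.size > 0) {
    const next = new Set();
    for (const doc of fromDocs) {
      if (found.has(doc._id)) continue;
      if (!frontier.has(doc[connectToField]) || !matches(doc, filter)) continue;
      found.set(doc._id, doc);
      for (const v of [].concat(doc[connectFromField] ?? [])) next.add(v);
    }
    frontier = next; // empty once a round finds nothing new
  }
  return [...found.values()];
}

const tanya = people[0];
const golfers = restrictedTraverse(people, tanya.friends, "friends", "name", { hobbies: "golf" });
const golferNames = golfers.map((p) => p.name).sort();
```

The sketch finds the same four names as the aggregation: Angelo Ward, Carole Hale, Joseph Dennis, and Tanya Jordan herself, who is reached again through Carole Hale's friends list.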