Testing tree (hierarchical) query performance on Oracle, YashanDB (崖山), DM, and GaussDB

The TEST01 data comes from DBA_OBJECTS, duplicated 512 times, for roughly 45 million rows in total. Two indexes are created:

create index idx_test01_n1 on test01(object_id,data_object_id);
create index idx_test01_n2 on test01(data_object_id,object_id);

Test SQL

select *
  from test01 where object_id in(2,3)
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

select count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

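Note: a production environment would not contain this much duplicated data, and object_id and data_object_id do not form a real parent-child hierarchy. Strictly speaking, this test is not rigorous (although it still exposes the problem). Realistic hierarchical data is hard to construct, so please bear with me.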

Oracle19c

SQL1

SQL> select *
  2    from test01 where object_id in(2,3)
  3   start with object_id = 2
  4  connect by nocycle prior object_id = data_object_id;

512 rows selected.

Elapsed: 00:00:01.98

Execution Plan
----------------------------------------------------------
Plan hash value: 3708411942

-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |               |   261K|    51M|   180K  (4)| 00:00:08 |
|*  1 |  FILTER                               |               |       |       |            |          |
|*  2 |   CONNECT BY WITH FILTERING           |               |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| TEST01        |   505 | 49490 |   510   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | IDX_TEST01_N1 |   505 |       |     4   (0)| 00:00:01 |
|*  5 |    HASH JOIN                          |               |   260K|    27M|   173K  (1)| 00:00:07 |
|   6 |     CONNECT BY PUMP                   |               |       |       |            |          |
|*  7 |     TABLE ACCESS FULL                 | TEST01        |  4696K|   438M|   172K  (1)| 00:00:07 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEST01"."OBJECT_ID"=2 OR "TEST01"."OBJECT_ID"=3)
   2 - access("TEST01"."DATA_OBJECT_ID"=PRIOR "TEST01"."OBJECT_ID")
   4 - access("OBJECT_ID"=2)
   5 - access("connect$_by$_pump$_002"."prior object_id "="DATA_OBJECT_ID")
   7 - filter("DATA_OBJECT_ID" IS NOT NULL)

Note
-----
   - this is an adaptive plan


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      19032  consistent gets
          0  physical reads
          0  redo size
       7595  bytes sent via SQL*Net to client
        820  bytes received via SQL*Net from client
         36  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
        512  rows processed

SQL2

SQL> select count(*)
  2    from test01
  3   start with object_id = 2
  4  connect by nocycle prior object_id = data_object_id;

Elapsed: 00:00:01.34

Execution Plan
----------------------------------------------------------
Plan hash value: 3365808593

--------------------------------------------------------------------------------------------
| Id  | Operation                  | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |               |     1 |    26 |  8172  (20)| 00:00:01 |
|   1 |  SORT AGGREGATE            |               |     1 |    26 |            |          |
|*  2 |   CONNECT BY WITH FILTERING|               |       |       |            |          |
|*  3 |    INDEX RANGE SCAN        | IDX_TEST01_N1 |   505 |  3535 |     4   (0)| 00:00:01 |
|   4 |    NESTED LOOPS            |               |   260K|  5088K|  6585   (1)| 00:00:01 |
|   5 |     CONNECT BY PUMP        |               |       |       |            |          |
|*  6 |     INDEX RANGE SCAN       | IDX_TEST01_N2 |   516 |  3612 |    13   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=PRIOR "OBJECT_ID")
   3 - access("OBJECT_ID"=2)
   6 - access("connect$_by$_pump$_002"."prior object_id "="DATA_OBJECT_ID")
       filter("DATA_OBJECT_ID" IS NOT NULL)

Note
-----
   - this is an adaptive plan


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         88  consistent gets
          0  physical reads
          0  redo size
        362  bytes sent via SQL*Net to client
        429  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed

By default, both SQL statements use CONNECT_BY_FILTERING (/*+ CONNECT_BY_FILTERING */).
Now test NO_CONNECT_BY_FILTERING (/*+ NO_CONNECT_BY_FILTERING */) to see how fast it is.
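
As a rough mental model of the difference between the two hints (a simplified Python sketch, not Oracle's actual implementation), filtering can be thought of as probing an index with only the rows of the current level, while no-filtering reads the whole row source once and then walks the hierarchy in memory. The toy ROWS data, the BY_PARENT dictionary standing in for an index on data_object_id, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy table of (object_id, data_object_id) pairs; 2 is its own parent.
ROWS = [(2, 2), (4, 2), (5, 2), (6, 4)] + [(n, 1000 + n) for n in range(100, 200)]

# Stand-in for an index on data_object_id that already exists before the query runs.
BY_PARENT = defaultdict(list)
for oid, parent in ROWS:
    BY_PARENT[parent].append(oid)


def walk(children_of, start_id):
    """Level-by-level expansion; a child already on its own path is dropped (NOCYCLE-like)."""
    frontier, result = [(start_id,)], []
    while frontier:
        result.extend(frontier)
        frontier = [p + (c,) for p in frontier
                    for c in children_of(p[-1]) if c not in p]
    return result


def filtering(start_id):
    """Probe the index once per frontier row and count the entries fetched."""
    fetched = 0

    def children_of(oid):
        nonlocal fetched
        kids = BY_PARENT.get(oid, [])
        fetched += len(kids)
        return kids

    return walk(children_of, start_id), fetched


def no_filtering(start_id):
    """Read every row once, group the rows in memory, then walk that structure."""
    grouped = defaultdict(list)
    for oid, parent in ROWS:  # full scan of the table
        grouped[parent].append(oid)
    return walk(lambda oid: grouped.get(oid, []), start_id), len(ROWS)


f_paths, f_cost = filtering(2)
n_paths, n_cost = no_filtering(2)
print(f_paths == n_paths, f_cost, n_cost)
```

On this toy data both strategies return the same paths, but the filtering variant fetches 4 index entries while the no-filtering variant reads all 104 rows. The real gap depends on how small the reachable part of the hierarchy is compared with the whole table.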

SQL1

SQL> select /*+ NO_CONNECT_BY_FILTERING */ *
  2    from test01 where object_id in(2,3)
  3   start with object_id = 2
  4  connect by nocycle prior object_id = data_object_id;

512 rows selected.

Elapsed: 00:01:14.85

Execution Plan
----------------------------------------------------------
Plan hash value: 1234393137

---------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |        |   261K|    51M|  1166K (86)| 00:00:46 |
|*  1 |  FILTER                                  |        |       |       |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|        |       |       |            |          |
|   3 |    TABLE ACCESS FULL                     | TEST01 |    44M|  4162M|   173K  (1)| 00:00:07 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEST01"."OBJECT_ID"=2 OR "TEST01"."OBJECT_ID"=3)
   2 - access("TEST01"."DATA_OBJECT_ID"=PRIOR "TEST01"."OBJECT_ID")
       filter("OBJECT_ID"=2)


Statistics
----------------------------------------------------------
       5449  recursive calls
    1596691  db block gets
     635906  consistent gets
     676955  physical reads   --- the SQL had already been run repeatedly, the buffer cache is large enough to hold TEST01, and direct path reads are disabled
          0  redo size
       7595  bytes sent via SQL*Net to client
        851  bytes received via SQL*Net from client
         36  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
        512  rows processed

SQL2

SQL> select /*+ NO_CONNECT_BY_FILTERING */ count(*)
  2    from test01
  3   start with object_id = 2
  4  connect by nocycle prior object_id = data_object_id;

Elapsed: 00:00:43.64

Execution Plan
----------------------------------------------------------
Plan hash value: 3439160689

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     1 |    26 |   325K (48)| 00:00:13 |
|   1 |  SORT AGGREGATE               |               |     1 |    26 |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|               |       |       |            |          |
|*  3 |    INDEX RANGE SCAN           | IDX_TEST01_N1 |   505 |  3535 |     4   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL          | TEST01        |    44M|   297M|   172K  (1)| 00:00:07 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=PRIOR "OBJECT_ID")
   3 - access("OBJECT_ID"=2)


Statistics
----------------------------------------------------------
        727  recursive calls
    1660389  db block gets
     635910  consistent gets
      90036  physical reads --- the SQL had already been run repeatedly, the buffer cache is large enough to hold TEST01, and direct path reads are disabled
          0  redo size
        362  bytes sent via SQL*Net to client
        460  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
          1  rows processed

SQL> select /*+ NO_CONNECT_BY_FILTERING FULL(TEST01) */ count(*)
  2    from test01
  3   start with object_id = 2
  4  connect by nocycle prior object_id = data_object_id;

Elapsed: 00:00:44.11

Execution Plan
----------------------------------------------------------
Plan hash value: 1977146487

---------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |        |     1 |    26 |   325K (48)| 00:00:13 |
|   1 |  SORT AGGREGATE                          |        |     1 |    26 |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|        |       |       |            |          |
|   3 |    TABLE ACCESS FULL                     | TEST01 |    44M|   297M|   172K  (1)| 00:00:07 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=PRIOR "OBJECT_ID")
       filter("OBJECT_ID"=2)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (N - Unresolved (1))
---------------------------------------------------------------------------

   1 -  SEL$1
         N -  FULL(TEST01)


Statistics
----------------------------------------------------------
        727  recursive calls
    1660389  db block gets
     635906  consistent gets
      90030  physical reads --- the SQL had already been run repeatedly, the buffer cache is large enough to hold TEST01, and direct path reads are disabled
          0  redo size
        362  bytes sent via SQL*Net to client
        473  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
          1  rows processed

YashanDB (崖山) 23.5.1

SQL1

SQL> select *
  from test01 where object_id in(2,3)
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

Execution Plan                                                   
---------------------------------------------------------------- 
SQL hash value: 2436060758                                      
Optimizer: ADOPT_C                                              
                                                                
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |          |             |          |          |          |          |                                |
|* 1 |  RESULT                        |                      |            |       768|          |  1089546( 0)|          |          |          |          |                                |
|* 2 |   CONNECT BY HASH              |                      |            |  44537344|          |  1086511( 0)|          |          |          |          |                                |
|  3 |    TABLE ACCESS FULL           | TEST01               | SCOTT      |  44537344|          |  1086015( 0)|          |          |          |          |                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
                                                                
Operation Information (identified by operation id):             
---------------------------------------------------             
                                                                
   1 - Predicate : filter("TEST01"."OBJECT_ID" IN BINARY SEARCH : [2, 3])
   2 - Predicate : access(PRIOR "TEST01"."OBJECT_ID" = "TEST01"."DATA_OBJECT_ID")
                   filter("TEST01"."OBJECT_ID" = 2)             
Statistics
----------------------------------------------------------------------------------------------------

18 rows fetched.

Elapsed: 00:00:33.328

SQL2

SQL> select count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

Execution Plan                                                   
---------------------------------------------------------------- 
SQL hash value: 3323709368                                      
Optimizer: ADOPT_C                                              
                                                                
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |          |             |          |          |          |          |                                |
|  1 |  AGGREGATE                     |                      |            |         1|          |     5237( 0)|          |          |          |          |                                |
|* 2 |   CONNECT BY HASH              |                      |            |  44537344|          |     4480( 0)|          |          |          |          |                                |
|  3 |    INDEX FAST FULL SCAN        | IDX_TEST01_N1        | SCOTT      |  44537344|          |     3984( 0)|          |          |          |          |                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
                                                                
Operation Information (identified by operation id):             
---------------------------------------------------             
                                                                
   2 - Predicate : access(PRIOR "TEST01"."OBJECT_ID" = "TEST01"."DATA_OBJECT_ID")
                   filter("TEST01"."OBJECT_ID" = 2)             

Statistics
----------------------------------------------------------------------------------------------------

17 rows fetched.

Elapsed: 00:00:09.036

DM8

SQL1

512 rows got

1   #NSET2: [6153845268876035, 2479468763217, 596] 
2     #PRJT2: [6153845268876035, 2479468763217, 596]; exp_num(15), is_atom(FALSE) 
3       #HASH RIGHT SEMI JOIN2: [6153845268876035, 2479468763217, 596]; key_num(1), MEM_USED(0KB), DISK_USED(0KB) KEY(DMTEMPVIEW_889194340.colname=TEST01.OBJECT_ID) KEY_NULL_EQU(0)
4         #CONST VALUE LIST: [1, 2, 30]; row_num(2), col_num(1)
5         #HIERARCHICAL QUERY: [6153843615896860, 49589375264358, 596]; key_num(1)
6           #BLKUP2: [6541, 1113433, 596]; IDX_TEST01_N1(TEST01)
7             #SSEK2: [6541, 1113433, 596]; scan_type(ASC), IDX_TEST01_N1(TEST01), is_global(0), scan_range[(exp_cast(2),min),(exp_cast(2),max))
8           #CSCN2: [10110, 44537344, 596]; INDEX33555493(TEST01); btr_scan(1)

Statistics
-----------------------------------------------------------------
        0     data pages changed
        0     undo pages changed
        36603084      logical reads
        1094314     physical reads
        0     redo size
        67379     bytes sent to client
        382     bytes received from client
        4     roundtrips to/from client
        0     sorts (memory)
        1     sorts (disk)
        0     rows processed
        32079     io wait time(ms)   --- 32 seconds of I/O wait, even though the SQL had already been run repeatedly
        107915      exec time(ms)


used time: 00:01:34.376. Execute id is 708.

SQL2

SQL> select count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

LINEID     COUNT(*)            
---------- --------------------
1          4456960


1   #NSET2: [3177098600204064, 1, 60] 
2     #PRJT2: [3177098600204064, 1, 60]; exp_num(1), is_atom(FALSE) 
3       #AAGR2: [3177098600204064, 1, 60]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
4         #HIERARCHICAL QUERY: [3177098600204064, 49589375264358, 60]; key_num(1)
5           #SSEK2: [158, 1113433, 60]; scan_type(ASC), IDX_TEST01_N1(TEST01), is_global(0), scan_range[(exp_cast(2),min),(exp_cast(2),max))
6           #SSCN: [5219, 44537344, 60]; IDX_TEST01_N1(TEST01); btr_scan(1); is_global(0)

Statistics
-----------------------------------------------------------------
        0     data pages changed
        0     undo pages changed
        22350276      logical reads
        214093      physical reads
        0     redo size
        138     bytes sent to client
        173     bytes received from client
        1     roundtrips to/from client
        0     sorts (memory)
        1     sorts (disk)
        0     rows processed
        5769      io wait time(ms)
        49413     exec time(ms)


used time: 00:00:49.413. Execute id is 711.

GaussDB

SQL1

oracle=# explain analyze select *
oracle-#   from test01 where object_id in(2,3)
oracle-#  start with object_id = 2
oracle-# connect by nocycle prior object_id = data_object_id;
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on tmp_reuslt  (cost=1597.86..1606.52 rows=4 width=788) (actual time=26.335..38347.584 rows=512 loops=1)
   Filter: (object_id = ANY ('{2,3}'::numeric[]))
   Rows Removed by Filter: 4456448
   CTE tmp_reuslt
     ->  StartWith Operator  (cost=4.52..1597.86 rows=385 width=195) (actual time=26.315..34366.881 rows=4456960 loops=1)
           Start With pseudo atts: array_key_4
           ->  Recursive Union  (cost=4.52..1597.86 rows=385 width=195) (actual time=0.133..25430.129 rows=4719104 loops=1)
                 ->  Bitmap Heap Scan on test01  (cost=4.52..144.22 rows=35 width=195) (actual time=0.132..1.459 rows=512 loops=1)
                       Recheck Cond: (object_id = 2)
                       Heap Blocks: exact=512
                       ->  Bitmap Index Scan on idx_test01_n1  (cost=0.00..4.51 rows=35 width=0) (actual time=0.078..0.078 rows=512 loops=1)
                             Index Cond: (object_id = 2)
                 ->  Nested Loop  (cost=4.52..144.59 rows=35 width=195) (actual time=16530.470..24550.782 rows=4718592 loops=4456960)
                       ->  WorkTable Scan on tmp_reuslt  (cost=0.00..0.02 rows=1 width=32) (actual time=239.643..486.658 rows=4456960 loops=4456960)
                       ->  Bitmap Heap Scan on test01  (cost=4.52..144.22 rows=35 width=195) (actual time=12952.710..20084.627 rows=4718592 loops=4456960)
                             Recheck Cond: (data_object_id = tmp_reuslt.object_id)
                             Heap Blocks: exact=4718592
                             ->  Bitmap Index Scan on idx_test01_n2  (cost=0.00..4.51 rows=35 width=0) (actual time=11181.021..11181.021 rows=4718592 loops=4456960)
                                   Index Cond: (data_object_id = tmp_reuslt.object_id)
 Total runtime: 38366.669 ms
(20 rows)

Time: 38374.623 ms

SQL2

oracle=# explain analyze select count(*)
oracle-#   from test01
oracle-#  start with object_id = 2
oracle-# connect by nocycle prior object_id = data_object_id;
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1606.52..1606.53 rows=1 width=8) (actual time=37241.527..37241.527 rows=1 loops=1)
   CTE tmp_reuslt
     ->  StartWith Operator  (cost=4.52..1597.86 rows=385 width=195) (actual time=24.808..33771.341 rows=4456960 loops=1)
           Start With pseudo atts: array_key_4
           ->  Recursive Union  (cost=4.52..1597.86 rows=385 width=195) (actual time=0.142..25120.971 rows=4719104 loops=1)
                 ->  Bitmap Heap Scan on test01  (cost=4.52..144.22 rows=35 width=195) (actual time=0.141..1.207 rows=512 loops=1)
                       Recheck Cond: (object_id = 2)
                       Heap Blocks: exact=512
                       ->  Bitmap Index Scan on idx_test01_n1  (cost=0.00..4.51 rows=35 width=0) (actual time=0.087..0.087 rows=512 loops=1)
                             Index Cond: (object_id = 2)
                 ->  Nested Loop  (cost=4.52..144.59 rows=35 width=195) (actual time=16282.481..24271.672 rows=4718592 loops=4456960)
                       ->  WorkTable Scan on tmp_reuslt  (cost=0.00..0.02 rows=1 width=32) (actual time=237.553..496.508 rows=4456960 loops=4456960)
                       ->  Bitmap Heap Scan on test01  (cost=4.52..144.22 rows=35 width=195) (actual time=12734.711..19878.902 rows=4718592 loops=4456960)
                             Recheck Cond: (data_object_id = tmp_reuslt.object_id)
                             Heap Blocks: exact=4718592
                             ->  Bitmap Index Scan on idx_test01_n2  (cost=0.00..4.51 rows=35 width=0) (actual time=10978.502..10978.502 rows=4718592 loops=4456960)
                                   Index Cond: (data_object_id = tmp_reuslt.object_id)
   ->  CTE Scan on tmp_reuslt  (cost=0.00..7.70 rows=385 width=0) (actual time=24.814..36609.173 rows=4456960 loops=1)
 Total runtime: 37265.295 ms
(19 rows)

Time: 37266.820 ms
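
The row counts reported in the plans above can be cross-checked with a little arithmetic. The sketch below is an inference from the reported numbers, not something the outputs state directly: it assumes every copy of object_id = 2 joins to every copy of each child row, and that exactly one of the children is object_id = 2 itself, which is the branch NOCYCLE would prune.

```python
COPIES = 512                 # TEST01 holds 512 copies of DBA_OBJECTS (from the setup)
roots = COPIES               # rows with object_id = 2 (512 in every plan)
level2 = 4_718_592           # rows produced by the recursive branch (GaussDB Nested Loop)

# Each root row joins to every copy of every child, so divide out both factors.
distinct_children = level2 // roots // COPIES

total_before_nocycle = roots + level2        # Recursive Union row count
self_reference = roots * COPIES              # assumed: object_id 2 joined back to itself
total_after_nocycle = total_before_nocycle - self_reference

print(distinct_children, total_before_nocycle, total_after_nocycle)
```

Under those assumptions the sketch yields 18 distinct children per root, 4,719,104 rows before cycle pruning (the Recursive Union figure), and 4,456,960 rows afterwards, which matches the COUNT(*) reported by DM8 and the StartWith Operator row count in GaussDB.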

The test results are summarized in the figure below.

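Now let's look at why Oracle is so much faster than the other databases. First, rewrite the SQL as a recursive WITH query (using SQL1 as the example):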

SQL> with cte(owner, object_name, subobject_name, object_id, data_object_id,
  2           object_type, created, last_ddl_time, timestamp, status,
  3           temporary, generated, secondary, namespace, edition_name,
  4           lv, path) as
  5   (select owner, object_name, subobject_name, object_id, data_object_id,
  6           object_type, created, last_ddl_time, timestamp, status,
  7           temporary, generated, secondary, namespace, edition_name,
  8           1, '|' || object_id || '|'
  9      from test01
 10     where object_id = 2
 11    union all
 12    select /*+ use_nl(t,c) */
 13           t.owner, t.object_name, t.subobject_name, t.object_id, t.data_object_id,
 14           t.object_type, t.created, t.last_ddl_time, t.timestamp, t.status,
 15           t.temporary, t.generated, t.secondary, t.namespace, t.edition_name,
 16           c.lv + 1,
 17           c.path || t.object_id || '|'
 18      from test01 t join cte c
 19        on c.object_id = t.data_object_id
 20     where c.path not like '%|' || t.object_id || '|%'
 21    )
 22  select *
 23    from cte
 24   where object_id in (2,3);

512 rows selected.

Elapsed: 00:00:28.09

Execution Plan
----------------------------------------------------------
Plan hash value: 3294323809

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |  3839G|  7758T|    18E  (0)|999:59:59 |
|*  1 |  VIEW                                     |               |  3839G|  7758T|    18E  (0)|999:59:59 |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|               |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED    | TEST01        |   505 | 49490 |   510   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                      | IDX_TEST01_N1 |   505 |       |     4   (0)| 00:00:01 |
|   5 |    NESTED LOOPS                           |               |  3839G|  7423T|    18E  (0)|999:59:59 |
|   6 |     NESTED LOOPS                          |               |  3869G|  7423T|    18E  (0)|999:59:59 |
|   7 |      RECURSIVE WITH PUMP                  |               |       |       |            |          |
|*  8 |      INDEX RANGE SCAN                     | IDX_TEST01_N2 |    26 |       |    13   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID           | TEST01        |    26 |  2548 |    39   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=2 OR "OBJECT_ID"=3)
   4 - access("OBJECT_ID"=2)
   8 - access("C"."OBJECT_ID"="T"."DATA_OBJECT_ID")
       filter("T"."DATA_OBJECT_ID" IS NOT NULL AND "C"."PATH" NOT LIKE
              '%|'||TO_CHAR("T"."OBJECT_ID")||'|%')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------

   5 -  SEL$F1D6E378 / C@SEL$1
         U -  use_nl(t,c)


Statistics
----------------------------------------------------------
        209  recursive calls
   29564091  db block gets
    4541494  consistent gets
      15368  physical reads
          0  redo size
       8227  bytes sent via SQL*Net to client
       1711  bytes received via SQL*Net from client
         36  SQL*Net roundtrips to/from client
          2  sorts (memory)
          1  sorts (disk)
        512  rows processed

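Look closely at the execution plan: UNION ALL (RECURSIVE WITH) BREADTH FIRST shows that the recursive WITH uses a breadth-first algorithm. Next, rewrite the SQL using the more advanced recursive WITH syntax with a SEARCH clause and specify depth first. The execution plan is shown in A-Time format (ALLSTATS LAST):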

with cte (
    owner, object_name, subobject_name, object_id, data_object_id,
    object_type, created, last_ddl_time, timestamp, status,
    temporary, generated, secondary, namespace, edition_name, lvl
) as (
    select
        owner, object_name, subobject_name, object_id, data_object_id,
        object_type, created, last_ddl_time, timestamp, status,
        temporary, generated, secondary, namespace, edition_name,
        1
    from test01
    where object_id = 2
    union all
    select /*+ use_nl(t,c) */
        t.owner, t.object_name, t.subobject_name, t.object_id, t.data_object_id,
        t.object_type, t.created, t.last_ddl_time, t.timestamp, t.status,
        t.temporary, t.generated, t.secondary, t.namespace, t.edition_name,
        c.lvl + 1
    from test01 t
    join cte c on c.object_id = t.data_object_id
)
search depth first by object_id set order1
cycle object_id set is_cycle to 'Y' default 'N'
select * from cte
where object_id in (2,3) and is_cycle = 'N';

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  cna60m0r6uhkf, child number 0
-------------------------------------
with cte (     owner, object_name, subobject_name, object_id,
data_object_id,     object_type, created, last_ddl_time, timestamp,
status,     temporary, generated, secondary, namespace, edition_name,
lvl ) as (     select         owner, object_name, subobject_name,
object_id, data_object_id,         object_type, created, last_ddl_time,
timestamp, status,         temporary, generated, secondary, namespace,
edition_name,         1     from test01     where object_id = 2
union all     select /*+ use_nl(t,c) */         t.owner, t.object_name,
t.subobject_name, t.object_id, t.data_object_id,         t.object_type,
t.created, t.last_ddl_time, t.timestamp, t.status,         t.temporary,
t.generated, t.secondary, t.namespace, t.edition_name,         c.lvl +
1     from test01 t     join cte c on c.object_id = t.data_object_id )
search depth first by object_id set order1 cycle object_id set is_cycle
to 'Y' default 'N' select * from cte where object_id in (2,3) and
is_cycle = 'N'

Plan hash value: 769798160

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |               |      1 |        |    512 |00:00:01.81 |    9792 |       |       |          |
|*  1 |  VIEW                                   |               |      1 |     18E|    512 |00:00:01.81 |    9792 |       |       |          |
|   2 |   UNION ALL (RECURSIVE WITH) DEPTH FIRST|               |      1 |        |   4719K|00:00:01.37 |    9792 |  1612K|   624K| 1432K (0)|
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED  | TEST01        |      1 |    505 |    512 |00:00:00.01 |     516 |       |       |          |
|*  4 |     INDEX RANGE SCAN                    | IDX_TEST01_N1 |      1 |    505 |    512 |00:00:00.01 |       4 |       |       |          |
|   5 |    NESTED LOOPS                         |               |      2 |     18E|   9216 |00:00:00.03 |    9276 |       |       |          |
|   6 |     NESTED LOOPS                        |               |      2 |     18E|   9216 |00:00:00.01 |      60 |       |       |          |
|   7 |      RECURSIVE WITH PUMP                |               |      2 |        |     18 |00:00:00.01 |       0 |       |       |          |
|*  8 |      INDEX RANGE SCAN                   | IDX_TEST01_N2 |     18 |    516 |   9216 |00:00:00.01 |      60 |       |       |          |
|   9 |     TABLE ACCESS BY INDEX ROWID         | TEST01        |   9216 |    516 |   9216 |00:00:00.03 |    9216 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((INTERNAL_FUNCTION("OBJECT_ID") AND "IS_CYCLE"='N'))
   4 - access("OBJECT_ID"=2)
   8 - access("C"."OBJECT_ID"="T"."DATA_OBJECT_ID")
       filter("T"."DATA_OBJECT_ID" IS NOT NULL)


43 rows selected.

Specifying breadth-first search, with the execution plan shown in A-Time format

with cte (
    owner, object_name, subobject_name, object_id, data_object_id,
    object_type, created, last_ddl_time, timestamp, status,
    temporary, generated, secondary, namespace, edition_name, lvl
) as (
    select
        owner, object_name, subobject_name, object_id, data_object_id,
        object_type, created, last_ddl_time, timestamp, status,
        temporary, generated, secondary, namespace, edition_name,
        1
    from test01
    where object_id = 2
    union all
    select /*+ use_nl(t,c) */
        t.owner, t.object_name, t.subobject_name, t.object_id, t.data_object_id,
        t.object_type, t.created, t.last_ddl_time, t.timestamp, t.status,
        t.temporary, t.generated, t.secondary, t.namespace, t.edition_name,
        c.lvl + 1
    from test01 t
    join cte c on c.object_id = t.data_object_id
)
search breadth first by object_id set order1
cycle object_id set is_cycle to 'Y' default 'N'
select * from cte
where object_id in (2,3) and is_cycle = 'N';

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9hbu8p6s226v4, child number 0
-------------------------------------
with cte (     owner, object_name, subobject_name, object_id,
data_object_id,     object_type, created, last_ddl_time, timestamp,
status,     temporary, generated, secondary, namespace, edition_name,
lvl ) as (     select         owner, object_name, subobject_name,
object_id, data_object_id,         object_type, created, last_ddl_time,
timestamp, status,         temporary, generated, secondary, namespace,
edition_name,         1     from test01     where object_id = 2
union all     select /*+ use_nl(t,c) */         t.owner, t.object_name,
t.subobject_name, t.object_id, t.data_object_id,         t.object_type,
t.created, t.last_ddl_time, t.timestamp, t.status,         t.temporary,
t.generated, t.secondary, t.namespace, t.edition_name,         c.lvl +
1     from test01 t     join cte c on c.object_id = t.data_object_id )
search breadth first by object_id set order1 cycle object_id set
is_cycle to 'Y' default 'N' select * from cte where object_id in (2,3)
and is_cycle = 'N'

Plan hash value: 3294323809

------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |      1 |        |    512 |00:00:34.83 |      36M|    202K|    134K|       |       |          |
|*  1 |  VIEW                                     |               |      1 |     18E|    512 |00:00:34.83 |      36M|    202K|    134K|       |       |          |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|               |      1 |        |   4719K|00:00:34.39 |      36M|    202K|    134K|   592M|  7993K|   97M (1)|
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED    | TEST01        |      1 |    505 |    512 |00:00:00.01 |     516 |      0 |      0 |       |       |          |
|*  4 |     INDEX RANGE SCAN                      | IDX_TEST01_N1 |      1 |    505 |    512 |00:00:00.01 |       4 |      0 |      0 |       |       |          |
|   5 |    NESTED LOOPS                           |               |      2 |     18E|   4718K|00:00:10.06 |    4730K|  67352 |      0 |       |       |          |
|   6 |     NESTED LOOPS                          |               |      2 |     18E|   4718K|00:00:03.15 |   11422 |  67352 |      0 |       |       |          |
|   7 |      RECURSIVE WITH PUMP                  |               |      2 |        |   4456K|00:00:00.54 |       0 |  67352 |      0 |       |       |          |
|*  8 |      INDEX RANGE SCAN                     | IDX_TEST01_N2 |   4456K|    516 |   4718K|00:00:01.88 |   11422 |      0 |      0 |       |       |          |
|   9 |     TABLE ACCESS BY INDEX ROWID           | TEST01        |   4718K|    516 |   4718K|00:00:05.91 |    4718K|      0 |      0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((INTERNAL_FUNCTION("OBJECT_ID") AND "IS_CYCLE"='N'))
   4 - access("OBJECT_ID"=2)
   8 - access("C"."OBJECT_ID"="T"."DATA_OBJECT_ID")
       filter("T"."DATA_OBJECT_ID" IS NOT NULL)


43 rows selected.

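Oracle's depth-first plan looped 9,216 times, while its breadth-first plan looped 4456K times (see the Starts column in the two plans above). Oracle's START WITH ... CONNECT BY uses a depth-first algorithm by default, whereas an ordinary recursive WITH uses a breadth-first algorithm. Mystery solved: Oracle runs fast because it defaults to depth-first, which shrinks the amount of data that has to be processed.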
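The Starts, A-Rows and A-Time figures compared here come from Oracle's rowsource statistics. A minimal sketch of how such a plan can be collected in SQL*Plus follows. It assumes the statistics are enabled at session level with `statistics_level`; the original test may have used a different method, such as the `gather_plan_statistics` hint.

```sql
-- SQL*Plus: keep DBMS_OUTPUT off, otherwise display_cursor(null, null, ...)
-- may report on the DBMS_OUTPUT call instead of the query just run.
set serveroutput off

-- Assumption: collect per-operation runtime statistics for this session.
alter session set statistics_level = all;

-- Run the statement under test (the count(*) query from the start of the article).
select count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

-- Show the plan of the last executed statement with Starts, A-Rows, A-Time and Buffers.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Starts on the inner side of a nested loop shows how many times that step was driven, which is the "loop count" being compared.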

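The Chinese domestic databases tested here have not yet implemented the WITH ... SEARCH recursive syntax. PostgreSQL has, so let's see how it performs on PG18.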

PG depth-first

postgres=# explain analyze with recursive cte as (
postgres(#     select 
postgres(#         owner, object_name, subobject_name, object_id, data_object_id,
postgres(#         object_type, created, last_ddl_time, timestamp, status,
postgres(#         temporary, generated, secondary, namespace, edition_name,
postgres(#         1 as lvl
postgres(#     from test01
postgres(#     where object_id = 2
postgres(#     union all
postgres(#     select 
postgres(#         t.owner, t.object_name, t.subobject_name, t.object_id, t.data_object_id,
postgres(#         t.object_type, t.created, t.last_ddl_time, t.timestamp, t.status,
postgres(#         t.temporary, t.generated, t.secondary, t.namespace, t.edition_name,
postgres(#         c.lvl + 1
postgres(#     from test01 t
postgres(#     join cte c on c.object_id = t.data_object_id
postgres(# )
postgres-# search depth first by object_id set ordercol
postgres-# cycle object_id set is_cycle using path
postgres-# select * from cte
postgres-# where object_id in (2, 3) 
postgres-#   and not is_cycle;
                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on cte  (cost=31423106.70..31734434.41 rows=69184 width=849) (actual time=0.089..15461.487 rows=512.00 loops=1)
   Filter: ((NOT is_cycle) AND (object_id = ANY ('{2,3}'::numeric[])))
   Rows Removed by Filter: 4718592
   Storage: Disk  Maximum Storage: 1249896kB
   Buffers: shared hit=22565381, temp read=156224 written=312461
   CTE cte
     ->  Recursive Union  (cost=0.56..31423106.70 rows=13836787 width=256) (actual time=0.085..12831.550 rows=4719104.00 loops=1)
           Storage: Disk  Maximum Storage: 1249921kB
           Buffers: shared hit=22565381, temp read=156224 written=156224
           ->  Index Scan using idx_test01_n1 on test01  (cost=0.56..2085.60 rows=517 width=256) (actual time=0.083..0.611 rows=512.00 loops=1)
                 Index Cond: (object_id = '2'::numeric)
                 Index Searches: 1
                 Buffers: shared hit=517
           ->  Nested Loop  (cost=0.56..3128265.32 rows=1383627 width=256) (actual time=3039.122..5690.770 rows=2359296.00 loops=2)
                 Buffers: shared hit=22564864, temp read=156224 written=1
                 ->  WorkTable Scan on cte c  (cost=0.00..103.40 rows=2585 width=100) (actual time=0.043..483.035 rows=2228480.00 loops=2)
                       Filter: (NOT is_cycle)
                       Rows Removed by Filter: 131072
                       Buffers: temp read=156224 written=1
                 ->  Index Scan using idx_test01_n2 on test01 t  (cost=0.56..1194.07 rows=535 width=187) (actual time=0.001..0.002 rows=1.06 loops=4456960)
                       Index Cond: (data_object_id = c.object_id)
                       Index Searches: 4456960
                       Buffers: shared hit=22564864
 Planning Time: 0.187 ms
 Execution Time: 15698.086 ms
(25 rows)

Time: 15698.654 ms (00:15.699)

PG breadth-first

postgres=# explain analyze with recursive cte as (
postgres(#     select 
postgres(#         owner, object_name, subobject_name, object_id, data_object_id,
postgres(#         object_type, created, last_ddl_time, timestamp, status,
postgres(#         temporary, generated, secondary, namespace, edition_name,
postgres(#         1 as lvl
postgres(#     from test01
postgres(#     where object_id = 2
postgres(#     union all
postgres(#     select 
postgres(#         t.owner, t.object_name, t.subobject_name, t.object_id, t.data_object_id,
postgres(#         t.object_type, t.created, t.last_ddl_time, t.timestamp, t.status,
postgres(#         t.temporary, t.generated, t.secondary, t.namespace, t.edition_name,
postgres(#         c.lvl + 1
postgres(#     from test01 t
postgres(#     join cte c on c.object_id = t.data_object_id
postgres(# )
postgres-# search breadth first by object_id set ordercol
postgres-# cycle object_id set is_cycle using path
postgres-# select * from cte
postgres-# where object_id in (2, 3) 
postgres-#   and not is_cycle;
                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on cte  (cost=31423106.70..31734434.41 rows=69184 width=849) (actual time=0.037..15149.188 rows=512.00 loops=1)
   Filter: ((NOT is_cycle) AND (object_id = ANY ('{2,3}'::numeric[])))
   Rows Removed by Filter: 4718592
   Storage: Disk  Maximum Storage: 1014872kB
   Buffers: shared hit=22565381, temp read=126848 written=253707
   CTE cte
     ->  Recursive Union  (cost=0.56..31423106.70 rows=13836787 width=256) (actual time=0.033..12543.080 rows=4719104.00 loops=1)
           Storage: Disk  Maximum Storage: 1014905kB
           Buffers: shared hit=22565381, temp read=126848 written=126848
           ->  Index Scan using idx_test01_n1 on test01  (cost=0.56..2085.60 rows=517 width=256) (actual time=0.032..0.597 rows=512.00 loops=1)
                 Index Cond: (object_id = '2'::numeric)
                 Index Searches: 1
                 Buffers: shared hit=517
           ->  Nested Loop  (cost=0.56..3128265.32 rows=1383627 width=256) (actual time=3022.640..5546.855 rows=2359296.00 loops=2)
                 Buffers: shared hit=22564864, temp read=126848 written=1
                 ->  WorkTable Scan on cte c  (cost=0.00..103.40 rows=2585 width=100) (actual time=0.042..452.027 rows=2228480.00 loops=2)
                       Filter: (NOT is_cycle)
                       Rows Removed by Filter: 131072
                       Buffers: temp read=126848 written=1
                 ->  Index Scan using idx_test01_n2 on test01 t  (cost=0.56..1194.07 rows=535 width=187) (actual time=0.001..0.002 rows=1.06 loops=4456960)
                       Index Cond: (data_object_id = c.object_id)
                       Index Searches: 4456960
                       Buffers: shared hit=22564864
 Planning Time: 0.212 ms
 Execution Time: 15348.194 ms
(25 rows)

Time: 15348.865 ms (00:15.349)

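PG's depth-first query looped 4,456,960 times, and its breadth-first query also looped 4,456,960 times (loops=4456960 on the idx_test01_n2 index scan in both plans). PG's depth-first search did not reduce the amount of data processed.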
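One likely explanation, offered as an assumption rather than a verified reading of the PG source: the PostgreSQL documentation describes the SEARCH and CYCLE clauses as being internally expanded into hand-written path-tracking queries. If so, SEARCH DEPTH FIRST only adds an ordering column, and the recursion is still evaluated level by level. Below is a rough hand-written equivalent, trimmed to three columns for brevity. The `path` and `is_cycle` column names are illustrative, not what PG generates internally.

```sql
with recursive cte as (
    select object_id, data_object_id, 1 as lvl,
           -- the cast keeps the array type identical in both branches
           array[object_id]::numeric[] as path,
           false as is_cycle
    from test01
    where object_id = 2
    union all
    select t.object_id, t.data_object_id, c.lvl + 1,
           c.path || t.object_id,          -- extend the path with the new node
           t.object_id = any(c.path)       -- node already on the path: a cycle
    from test01 t
    join cte c on c.object_id = t.data_object_id
    where not c.is_cycle                   -- stop expanding rows that closed a cycle
)
select object_id, data_object_id, lvl
from cte
where object_id in (2, 3)
  and not is_cycle
order by path;                             -- depth-first order comes from this sort only
```

Sorting by `path` produces depth-first order after the fact. The join against test01 still runs once for every working-table row, which is consistent with the identical loops counts in the two plans.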

YashanDB (崖山) CONNECT BY also uses depth-first by default, but its performance lags far behind Oracle's and needs further optimization.

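Finally, a few suggestions for Chinese domestic databases regarding tree queries:

1. Implement the WITH ... SEARCH DEPTH/BREADTH FIRST syntax, backed by genuine depth-first and breadth-first algorithms

2. Implement the equivalent of Oracle's /*+ CONNECT_BY_FILTERING */ and /*+ NO_CONNECT_BY_FILTERING */ hints

3. Let START WITH ... CONNECT BY choose between depth-first and breadth-first

4. Allow tree queries to run in parallel

5. Allow tree queries to use batch mode

6. Match or even surpass Oracle's performance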
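For reference, here is a minimal sketch of how the two Oracle hints named in suggestion 2 can be applied to the count(*) test query from this article. Whether the optimizer honors a hint, and which variant is faster, depends on the data and the version, so treat this as an illustration rather than a tuning recommendation.

```sql
-- Ask for the filtering implementation
-- (shown as CONNECT BY WITH FILTERING in the execution plan).
select /*+ connect_by_filtering */ count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;

-- Ask Oracle not to use the filtering implementation,
-- so the two plans and timings can be compared.
select /*+ no_connect_by_filtering */ count(*)
  from test01
 start with object_id = 2
connect by nocycle prior object_id = data_object_id;
```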
