反连接
反连接(Anti Join)是一种特殊的连接类型,与内连接和外连接不同,Oracle数据库里并没有相关的
关键字可以在SQL文本中专门表示反连接,所以这里把它单独拿出来说明。
为了方便说明反连接的含义,我们用"t1.x.anti=t2.y"来表示表T1和表T2做反连接,且T1是驱动表,T2
是被驱动表,反连接条件为t1.x=t2.y,这里"t1.x.anti=t2.y"的含义是只要表T2中有满足条件t1.x=t2.y
的记录存在,则表T1中满足条件t1.x=t2.y的记录就会被丢弃,最后返回的记录就是表T1中那些不满足
条件t1.x=t2.y的记录。
当做子查询展开时,Oracle经常会把那些外部where条件为NOT EXISTS,NOT IN或<>ALL的子查询转换成
对应的反连接。
我们来看如下的范例SQL1、2和3
sql
SQL> select * from t1 where col2 not in (select col2 from t2);
SQL> select * from t1 where col2<>all (select col2 from t2);
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);
现在表T1和t2在各自的连接列COL2上均没有NULL值,在这种情况下范例SQL1、2、3实际上是等价的。
sql
SQL> select * from t1 where col2 not in (select col2 from t2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sql
SQL> select * from t1 where col2<>all (select col2 from t2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sql
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 2706079091
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="T1"."COL2")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
上述3个范例SQL的执行结果是一样的,范例SQL1、2和范例SQL3的执行计划中,ID=1的执行步骤的列Operation的值分别为"HASH JOIN ANTI NA"和"HASH JOIN ANTI",虽然不是完全一样,但它们都有关键字"ANTI",这就说明Oracle在执行上述三个范例SQL时确实是在用反连接,即Oracle在执行时实际上是将他们转换成了如下的等价反连接形式:
sql
select t1.* from t1,t2 where t1.col2 anti=t2.col2;
这里表T1、T2在各自的连接列COL2上没有NULL值,所以此时这三个范例SQL是等价的,但如果连接列COL2上有null值,则它们就不完全等价了。这种null值所带来的影响又细分为两种情况:
1、表T1的连接列COL2上出现了NULL值
sql
SQL> insert into t1 values(4,null);
1 row created.
SQL> commit;
Commit complete.
表T1中的记录如下:
sql
SQL> select * from t1;
COL1 C
---------- -
1 A
2 B
3 C
4
分别执行SQL1、2、3
sql
SQL> select * from t1 where col2 not in (select col2 from t2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sql
SQL> select * from t1 where col2 <>all (select col2 from t2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sql
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);
COL1 C
---------- -
4
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 2706079091
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="T1"."COL2")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
645 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
sql
SQL> delete from t1 where col1=4;
1 row deleted.
SQL> commit;
Commit complete.
2、表T2的连接列COL2上出现了NULL值
sql
SQL> insert into t2 values(null,'E2');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> select * from t1;
COL1 C
---------- -
1 A
2 B
3 C
SQL> select * from t2;
C CO
- --
A A2
B B2
D D2
E2
再次执行上述三个SQL
sql
SQL> select * from t1 where col2 not in (select col2 from t2);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
398 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
sql
SQL> select * from t1 where col2 <>all (select col2 from t2);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="COL2")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
398 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
sql
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);
COL1 C
---------- -
3 C
Execution Plan
----------------------------------------------------------
Plan hash value: 2706079091
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("COL2"="T1"."COL2")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
590 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sql
SQL> delete from t2 where col3='E2';
1 row deleted.
SQL> COMMIT;
Commit complete.
从上述测试中我们可以得出以下结论:
1)表T1、T2在各自的连接列COL2上一但有了NULL值,则范例SQL1、2、3就不完全等价了。
2)NOT IN和<>ALL对NULL值敏感,这意味着NOT IN后面的子查询或者常量集合一旦有NULL值出现,则整个SQL的执行结果就会为NULL,即此时的执行结果将不包含任何记录。
3)NOT EXISTS对NULL值不敏感,这意味着NULL值对NOT EXISTS的执行结果不会有什么影响。正是因为NOT IN和<>ALL对NULL值敏感,所以一旦相关的连接列上出现NULL值,此时Oracle如果还按照通常的反连接的处理逻辑来处理,得到的结果就不对了。
为了解决NOT IN和<>ALL对NULL值敏感的问题,Oracle推出了改良的反连接,这种反连接能够处理NULL值,Oracle称其为Null-Aware Anti Join。上述范例SQL1、2的执行计划中,ID=1的执行步骤的列Operation的值为"HASH JOIN ANTI NA",关键字NA就是Null-Aware的缩写。Oracle就是想告诉我们,这里采用的不是普通的哈希反连接,而是改良后的,能够处理NULL值的哈希反连接。
在Oracle 11g R2中,Oracle是否启用Null-Aware Anti Join受隐含参数_OPTIMIZER_NULL_AWARE_ANTIJOIN控制,其默认值为TRUE,表示启用Null-Aware Anti join。
如果我们把_OPTIMIZER_NULL_AWARE_ANTIJOIN的值修改为FALSE,则Oracle就不能再用Null-Aware Anti Join了,而又因为NOT IN对NULL值敏感,所以Oracle此时也不能用普通的反连接。
设置当前session中的_OPTIMIZER_NULL_AWARE_ANTIJOIN值为FALSE
sql
alter session set "_OPTIMIZER_NULL_AWARE_ANTIJOIN"=false;
修改后执行范例SQL1:
从上述显示内容可以看出,当我们把_OPTIMIZER_NULL_AWARE_ANTIJOIN的值修改为FALSE后,Oracle果然没有走反连接(当然也不能走)。
这里Oracle选择了走FILTER类型的执行计划,FILTER类型的执行计划实际上是一种改良的嵌套循环连接。