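1. A brief explanation of the OR condition
OR is a logical operation. It evaluates as follows:
Condition 1   Condition 2   Condition 1 OR Condition 2
true          true          true
true          false         true
false         true          true
false         false         false
OR returns true as soon as at least one of its conditions is true. In this respect it behaves much like UNION ALL:
Set 1      Set 2      UNION ALL result
has rows   has rows   has rows
has rows   no rows    has rows
no rows    has rows   has rows
no rows    no rows    no rows
As long as either set contains rows, UNION ALL returns rows.
For this reason, replacing OR with UNION ALL is a common rewrite in SQL tuning.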
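The two tables above can be checked mechanically. The following is a minimal Python sketch, not part of the original test case; the helper `union_all` and the placeholder row `"row"` are illustrative names only.

```python
from itertools import product

def union_all(a, b):
    """Concatenate two row lists without removing duplicates, like UNION ALL."""
    return a + b

# OR is true when at least one operand is true.
or_table = {(p, q): (p or q) for p, q in product([True, False], repeat=2)}

# UNION ALL returns rows when at least one input has rows.
rows = {True: ["row"], False: []}
union_table = {
    (p, q): bool(union_all(rows[p], rows[q]))
    for p, q in product([True, False], repeat=2)
}

for key in or_table:
    print(key, or_table[key], union_table[key])
```

Both dictionaries hold the same four entries, which is the similarity the rewrite relies on. It only says when a result is non-empty, not how many rows it contains.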
2. Analysis
OR join analysis: test tables and data
drop table t0705;
create table t0705 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t0705 select level,'A'||level,'B'||level from dual connect by level<=500000;
commit;
drop table t07051;
create table t07051 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t07051 select level,'A'||level,'B'||level from dual connect by level<=500000;
commit;
insert into t07051 select level+500000,'A'||level,'B'||level from dual connect by level<=500000;
commit;
drop table t07052;
create table t07052 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t07052 select level,'A'||level,'B'||level from dual connect by level<=10000;
commit;
insert into t07052 select level+10000,'B'||level,'B'||level from dual connect by level<=900000;
commit;
create index IDX_DM_t07051 on t07051(c1);
create index IDX_DM_t07052 on t07052(c1);
create index IDX_DM_T0705_c1 on t0705(c1);
create index IDX_DM_T0705_c2 on t0705(c2);
dbms_stats.gather_table_stats(USER,'T0705',null,100);
dbms_stats.gather_table_stats(USER,'T07051',null,100);
dbms_stats.gather_table_stats(USER,'T07052',null,100);
Statement 1:
select /*+OPTIMIZER_OR_NBEXP(0) */count(*) from (
select distinct t2.id,t3.c1 from t0705 t1,t07051 t2,t07052 t3 where (t1.c1=t2.c1 or t1.c2=t2.c1)
and (t1.c1=t3.c1 or t1.c2=t3.c1) and t2.id=t3.id
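) t;
Plan:

With this OR condition, the scans of t1, t2, and t3 are performed four times, presumably once for each combination of the two OR predicates (2 x 2 = 4).
Now look at what happens when the OR is evaluated as a whole:
Line 6 of the plan evaluates the OR as a whole, which amounts to joining the three tables first and applying the OR predicate afterwards. In the statement, t2 and t3 are joined by t2.id=t3.id, but the conditions that relate t1 to the other tables are OR predicates that have been merged into a single filter. That leaves t1 with no direct join condition, so the join degenerates into a Cartesian product.
Neither approach looks efficient. As noted earlier, OR and UNION ALL behave similarly, and the name of the OR parameter also hints at a connection with UNION ALL.
The OR conditions mainly involve the columns t1.c1 and t1.c2, so we can first UNION ALL these two join columns of t1 into a single column and then join that column to the other tables.
Rewrite: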
select count(*) from (
select distinct t2.id,t3.c1 from (select c1 from t0705
union all
select c2 from t0705) t1,t07051 t2,t07052 t3 where (t1.c1=t2.c1)
and (t1.c1=t3.c1 ) and t2.id=t3.id
) t;
Plan:

Execution time of the original statement: 2.2 s
After the rewrite: 0.8 s
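Before relying on the timing, it is worth checking that both forms return the same count. The sketch below is an assumption-laden stand-in rather than the original test: it runs the two queries in SQLite through Python's `sqlite3` module on a scaled-down copy of the data (50/100/100 rows), drops the DM-specific hint, and uses a hypothetical helper named `counts`. It also probes one case the test data does not contain: a single t0705 row that matches t07051 through c1 and t07052 through c2.

```python
import sqlite3

OR_JOIN = """
select count(*) from (
  select distinct t2.id, t3.c1
  from t0705 t1, t07051 t2, t07052 t3
  where (t1.c1 = t2.c1 or t1.c2 = t2.c1)
    and (t1.c1 = t3.c1 or t1.c2 = t3.c1)
    and t2.id = t3.id
) t
"""

UNION_ALL_JOIN = """
select count(*) from (
  select distinct t2.id, t3.c1
  from (select c1 from t0705 union all select c2 from t0705) t1,
       t07051 t2, t07052 t3
  where t1.c1 = t2.c1 and t1.c1 = t3.c1 and t2.id = t3.id
) t
"""

def counts(t0705, t07051, t07052):
    """Load the three tables into an in-memory SQLite database and return
    (count from the OR join, count from the UNION ALL rewrite)."""
    con = sqlite3.connect(":memory:")
    for name, rows in (("t0705", t0705), ("t07051", t07051), ("t07052", t07052)):
        con.execute(f"create table {name} (id int primary key, c1 text, c2 text)")
        con.executemany(f"insert into {name} values (?, ?, ?)", rows)
    result = (con.execute(OR_JOIN).fetchone()[0],
              con.execute(UNION_ALL_JOIN).fetchone()[0])
    con.close()
    return result

# Same shape as the article's data, scaled down by a factor of 10000.
scaled = (
    [(i, f"A{i}", f"B{i}") for i in range(1, 51)],
    [(i, f"A{i}", f"B{i}") for i in range(1, 51)]
    + [(i + 50, f"A{i}", f"B{i}") for i in range(1, 51)],
    [(i, f"A{i}", f"B{i}") for i in range(1, 11)]
    + [(i + 10, f"B{i}", f"B{i}") for i in range(1, 91)],
)
print(counts(*scaled))   # (10, 10)

# One t0705 row matches t07051 via c1 and t07052 via c2.
mixed = ([(1, "A1", "B1")], [(1, "A1", "x")], [(1, "B1", "x")])
print(counts(*mixed))    # (1, 0)
```

On data shaped like the test tables the two counts agree. The second data set shows that the rewrite forces t2.c1 and t3.c1 to be equal, so it drops rows whenever the two OR predicates are satisfied by different columns of the same t1 row. The rewrite should therefore be validated against the actual data before it replaces the original.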
Statement 2:
select count(*) from t07051 t1,t07052 t2 where exists (select 1 from t0705 t where (t.c1=t1.c1 or t.c2=t1.c1) and (t.c1=t2.c1 or t.c2=t2.c1))
and t1.id=t2.id;
Plan:

Applying the same rewrite used for statement 1 gives:
select count(*) from t07051 t1,t07052 t2 where exists
(select 1 from (select c1 from t0705 t
union all
select c2 from t0705 t
) t
where (t.c1=t1.c1 ) and (t.c1=t2.c1 ))
and t1.id=t2.id;
Plan:

We can also use UNION ALL to split the OR into separate branches:
select count(*) from t07051 t1,t07052 t2 where
exists (select 1 from t0705 t where (t.c1=t1.c1) and (t.c1=t2.c1)
union all
select 1 from t0705 t where (t.c2=t1.c1) and (t.c2=t2.c1))
and t1.id=t2.id;
Plan:

From the second rewrite we can go one step further and turn the EXISTS (... UNION ALL ...) form into EXISTS ... OR EXISTS ...:
select count(*) from t07051 t1,t07052 t2 where (
exists (select 1 from t0705 t where (t.c1=t1.c1) and (t.c1=t2.c1)
)or exists (
select 1 from t0705 t where (t.c2=t1.c1) and (t.c2=t2.c1)))
and t1.id=t2.id;
Plan:

The rewrites are roughly twice as fast as the original OR join.
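The statement 2 variants can be compared in the same way. This is a sketch under the same assumptions as before: SQLite via Python's `sqlite3`, a scaled-down copy of the test data, and illustrative dictionary keys (`or_in_exists`, `exists_union_all`, `exists_or_exists`) that do not come from the original.

```python
import sqlite3

con = sqlite3.connect(":memory:")
for name in ("t0705", "t07051", "t07052"):
    con.execute(f"create table {name} (id int primary key, c1 text, c2 text)")
con.executemany("insert into t0705 values (?, ?, ?)",
                [(i, f"A{i}", f"B{i}") for i in range(1, 51)])
con.executemany("insert into t07051 values (?, ?, ?)",
                [(i, f"A{i}", f"B{i}") for i in range(1, 51)]
                + [(i + 50, f"A{i}", f"B{i}") for i in range(1, 51)])
con.executemany("insert into t07052 values (?, ?, ?)",
                [(i, f"A{i}", f"B{i}") for i in range(1, 11)]
                + [(i + 10, f"B{i}", f"B{i}") for i in range(1, 91)])

queries = {
    # Original: OR inside a single EXISTS.
    "or_in_exists": """
        select count(*) from t07051 t1, t07052 t2
        where exists (select 1 from t0705 t
                      where (t.c1 = t1.c1 or t.c2 = t1.c1)
                        and (t.c1 = t2.c1 or t.c2 = t2.c1))
          and t1.id = t2.id""",
    # OR split into two branches combined with UNION ALL.
    "exists_union_all": """
        select count(*) from t07051 t1, t07052 t2
        where exists (select 1 from t0705 t where t.c1 = t1.c1 and t.c1 = t2.c1
                      union all
                      select 1 from t0705 t where t.c2 = t1.c1 and t.c2 = t2.c1)
          and t1.id = t2.id""",
    # Two EXISTS predicates combined with OR.
    "exists_or_exists": """
        select count(*) from t07051 t1, t07052 t2
        where (exists (select 1 from t0705 t where t.c1 = t1.c1 and t.c1 = t2.c1)
            or exists (select 1 from t0705 t where t.c2 = t1.c1 and t.c2 = t2.c1))
          and t1.id = t2.id""",
}

results = {name: con.execute(sql).fetchone()[0] for name, sql in queries.items()}
print(results)
```

On this data all three forms return the same count. The last two are equivalent to each other by construction. Like the statement 1 rewrite, they cover only the branches in which both predicates use the same t0705 column, so the comparison with the original depends on the data.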
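3. Summary
A join on OR conditions is not efficient whether the OR is split into branches or evaluated as a whole. In tuning we generally replace the OR with an equivalent UNION ALL. A few points need attention:
(1) The following rewrite is not equivalent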
drop table if exists t1;
create table t1 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t1 values(1,'AA','BB');
insert into t1 values(2,'CC','AA');
commit;
drop table if exists t2;
create table t2 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t2 values(1,'AA','BB');
insert into t2 values(2,'AB','BA');
insert into t2 values(3,'BB','AA');
commit;
drop table if exists t3;
create table t3 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t3 values(1,'AA','BB');
insert into t3 values(2,'AB','BA');
insert into t3 values(3,'BB','AA');
commit;
select t1.c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from t1,t2,t3 where t2.id=t3.id and (t1.c1=t2.c1 or t1.c2=t2.c1);
AA AA AA
AA BB BB
CC AA AA
is not equivalent to
select t1.c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from (select c1 from t1
union all
select c2 from t1
) t1
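,t2,t3 where t2.id=t3.id and t1.c1=t2.c1;
AA AA AA
BB BB BB
AA AA AA
The query has to display t1.c1, but in the rewrite the column named c1 is the UNION ALL of t1.c1 and t1.c2, so it is no longer the real c1. An extra column must be added to carry the real c1 value through.
The corrected rewrite: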
select t1c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from (select c1,c1 as t1c1 from t1
union all
select c2,c1 as t1c1 from t1
) t1
,t2,t3 where t2.id=t3.id and t1.c1=t2.c1;
AA AA AA
AA BB BB
CC AA AA
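The three result sets above can be reproduced with a short script. This is a sketch that assumes SQLite (via Python's `sqlite3`) behaves like the original database for these queries. The derived table is aliased `u` instead of `t1` purely for readability, and the variable names `original`, `naive`, and `fixed` are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
for name in ("t1", "t2", "t3"):
    con.execute(f"create table {name} (id int primary key, c1 text, c2 text)")
con.executemany("insert into t1 values (?, ?, ?)",
                [(1, "AA", "BB"), (2, "CC", "AA")])
for name in ("t2", "t3"):
    con.executemany(f"insert into {name} values (?, ?, ?)",
                    [(1, "AA", "BB"), (2, "AB", "BA"), (3, "BB", "AA")])

def run(sql):
    """Return the result rows sorted, so row order does not affect the comparison."""
    return sorted(con.execute(sql).fetchall())

original = run("""
    select t1.c1, t2.c1, t3.c1 from t1, t2, t3
    where t2.id = t3.id and (t1.c1 = t2.c1 or t1.c2 = t2.c1)""")

# Naive rewrite: the merged column is displayed as if it were t1.c1.
naive = run("""
    select u.c1, t2.c1, t3.c1
    from (select c1 from t1 union all select c2 from t1) u, t2, t3
    where t2.id = t3.id and u.c1 = t2.c1""")

# Corrected rewrite: the real t1.c1 travels along as a separate column t1c1.
fixed = run("""
    select u.t1c1, t2.c1, t3.c1
    from (select c1, c1 as t1c1 from t1
          union all
          select c2, c1 as t1c1 from t1) u, t2, t3
    where t2.id = t3.id and u.c1 = t2.c1""")

print(original)
print(naive)
print(fixed)
```

Only the corrected rewrite returns the same rows as the original OR join. The naive one returns the join key in place of t1.c1.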
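When rewriting OR as UNION ALL, always check what the query is expected to return. Also note that UNION ALL does not remove duplicates, which is a difference from OR: a t1 row whose c1 and c2 are equal matches once under OR but contributes two rows under UNION ALL.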
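The duplicate issue can be shown on a tiny case. The sketch below uses SQLite via Python's `sqlite3` with made-up two-column tables that are not from the original. The last query, which carries the primary key through a deduplicating UNION, is one possible remedy offered as an assumption, not the author's method.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t1 (id int primary key, c1 text, c2 text)")
con.execute("create table t2 (id int primary key, c1 text)")
# Row 1 has c1 = c2, so both UNION ALL branches produce the same value for it.
con.executemany("insert into t1 values (?, ?, ?)",
                [(1, "AA", "AA"), (2, "BB", "CC")])
con.executemany("insert into t2 values (?, ?)", [(1, "AA"), (2, "CC")])

def count(sql):
    return con.execute(sql).fetchone()[0]

or_join = count("""
    select count(*) from t1, t2
    where t1.c1 = t2.c1 or t1.c2 = t2.c1""")

union_all_join = count("""
    select count(*)
    from (select c1 from t1 union all select c2 from t1) u, t2
    where u.c1 = t2.c1""")

# Assumed remedy: keep the primary key and let UNION remove per-row duplicates.
union_with_key = count("""
    select count(*)
    from (select id, c1 from t1 union select id, c2 from t1) u, t2
    where u.c1 = t2.c1""")

print(or_join, union_all_join, union_with_key)   # 2 3 2
```

When the outer query already applies DISTINCT or only tests existence, as in statements 1 and 2, the extra rows do not change the result and plain UNION ALL avoids the deduplication cost.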
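(2) OR has lower precedence than AND. If the OR must be evaluated first, remember to wrap it in parentheses.
(3) The final rewrite should not depend on optimizer parameter settings such as OPTIMIZER_OR_NBEXP.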
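Point (2) is easy to get wrong when adding or removing branches during a rewrite. A minimal illustration, assuming SQLite via Python's `sqlite3` and a made-up table `t(a, b, c)`:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (a int, b int, c int)")
con.executemany("insert into t values (?, ?, ?)",
                [(1, 0, 0), (0, 1, 1), (0, 1, 0)])

# AND binds tighter than OR, so this is read as: a = 1 or (b = 1 and c = 1).
without_parens = con.execute(
    "select count(*) from t where a = 1 or b = 1 and c = 1").fetchone()[0]

# Parentheses force the OR to be evaluated first.
with_parens = con.execute(
    "select count(*) from t where (a = 1 or b = 1) and c = 1").fetchone()[0]

print(without_parens, with_parens)   # 2 1
```

The row (1, 0, 0) passes the first filter through `a = 1` alone but fails the second because `c = 1` applies to the whole parenthesized OR.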