Lessons from rewriting OR joins

1. A brief explanation of OR conditions

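OR is a logical operator. An OR condition evaluates as follows:

Condition 1   Condition 2   OR result
true          true          true
true          false         true
false         true          true
false         false         false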

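In other words, OR returns true as long as at least one condition is true. This is similar to how UNION ALL behaves:

Set 1      Set 2      UNION ALL result
has rows   has rows   has rows
has rows   no rows    has rows
no rows    has rows   has rows
no rows    no rows    no rows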

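As long as one of the two sets has rows, UNION ALL returns rows.

For this reason, UNION ALL is often used as a replacement for OR when rewriting queries for performance.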
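
The analogy between the two tables above can be checked mechanically. The following is a minimal Python sketch, not part of the original article: the helper `union_all` is a hypothetical stand-in that models UNION ALL as list concatenation, and a one-element list stands for "has rows".

```python
from itertools import product

def union_all(s1, s2):
    # Hypothetical model: UNION ALL keeps every row of both inputs,
    # which for Python lists is plain concatenation.
    return s1 + s2

table = []
for has1, has2 in product([True, False], repeat=2):
    s1 = ["row"] if has1 else []
    s2 = ["row"] if has2 else []
    or_result = has1 or has2
    union_has_rows = bool(union_all(s1, s2))
    table.append((has1, has2, or_result, union_has_rows))

for row in table:
    print(row)
```

In every one of the four combinations the OR result and the "UNION ALL has rows" flag agree, which is the property the rewrites below rely on.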

2. Analysis

-- OR join analysis: create the test tables, indexes and statistics
drop table t0705;
create table t0705 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t0705 select level,'A'||level,'B'||level from dual connect by level<=500000;
commit;
drop table t07051;
create table t07051 (id int primary key,c1 varchar2(20),c2 varchar2(20));

insert into t07051 select level,'A'||level,'B'||level from dual connect by level<=500000;
commit;
insert into t07051 select level+500000,'A'||level,'B'||level from dual connect by level<=500000;
commit;

drop table t07052;
create table t07052 (id int primary key,c1 varchar2(20),c2 varchar2(20));

insert into t07052 select level,'A'||level,'B'||level from dual connect by level<=10000;
commit;
insert into t07052 select level+10000,'B'||level,'B'||level from dual connect by level<=900000;
commit;

create index IDX_DM_t07051 on t07051(c1);
create index IDX_DM_t07052 on t07052(c1);
create index IDX_DM_T0705_c1 on t0705(c1);
create index IDX_DM_T0705_c2 on t0705(c2);

dbms_stats.gather_table_stats(USER,'T0705',null,100);
dbms_stats.gather_table_stats(USER,'T07051',null,100);
dbms_stats.gather_table_stats(USER,'T07052',null,100);
Statement 1:
select /*+OPTIMIZER_OR_NBEXP(0) */count(*) from (
select distinct t2.id,t3.c1 from t0705 t1,t07051 t2,t07052 t3 where (t1.c1=t2.c1 or t1.c2=t2.c1)
and (t1.c1=t3.c1 or t1.c2=t3.c1) and t2.id=t3.id
) t

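Plan:

Because of these OR conditions, t1, t2 and t3 are each scanned four times.

Now consider the case where the OR is evaluated as a whole:
Line 6 of the plan evaluates the OR as a whole, which amounts to joining the three tables first and applying the OR afterwards. In the statement, t2 and t3 have the join condition t2.id=t3.id, but the join condition between t1 and t3 is an OR that has been folded into a single filter. That leaves t1 and t3 without a direct join condition, so the plan falls back to a Cartesian product.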

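Neither approach looks efficient. As noted earlier, OR and UNION ALL have something in common, and the OR-related optimizer parameter (OPTIMIZER_OR_NBEXP, used in the hint above) also points to a connection with UNION ALL.

The OR conditions join on columns t1.c1 and t1.c2 of t1, so we can first UNION ALL these two join columns of t1 into a single column and then join that column to the other tables.

Rewrite: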

select count(*) from (
select distinct t2.id,t3.c1 from (select c1 from t0705
union all
select c2 from t0705) t1,t07051 t2,t07052 t3 where (t1.c1=t2.c1)
and (t1.c1=t3.c1 ) and t2.id=t3.id
) t 

Plan:

Execution time of the original statement: 2.2 s

After the rewrite: 0.8 s
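
To check when the rewrite of statement 1 returns the same result, here is a minimal sketch that runs both forms through Python's built-in `sqlite3` module. SQLite is only a stand-in for the article's database, so plans and timings do not carry over. The 20-row data set is a hypothetical miniature of the article's tables, and the row with id 100 is an invented "mixed" case in which t2.c1 matches t1.c1 while t3.c1 matches t1.c2 of the same t1 row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t0705  (id integer primary key, c1 text, c2 text);
create table t07051 (id integer primary key, c1 text, c2 text);
create table t07052 (id integer primary key, c1 text, c2 text);
""")
# Hypothetical miniature data shaped like the article's: c1 = 'A<n>', c2 = 'B<n>'.
base = [(n, f"A{n}", f"B{n}") for n in range(1, 21)]
conn.executemany("insert into t0705 values (?,?,?)", base)
conn.executemany("insert into t07051 values (?,?,?)", base)
# t07052: ids 1..5 carry 'A<n>' values, ids 6..20 carry 'B<n>' values.
conn.executemany(
    "insert into t07052 values (?,?,?)",
    [(n, f"A{n}", f"B{n}") for n in range(1, 6)]
    + [(n + 5, f"B{n}", f"B{n}") for n in range(1, 16)],
)

OR_JOIN = """
select count(*) from (
  select distinct t2.id, t3.c1
  from t0705 t1, t07051 t2, t07052 t3
  where (t1.c1 = t2.c1 or t1.c2 = t2.c1)
    and (t1.c1 = t3.c1 or t1.c2 = t3.c1)
    and t2.id = t3.id) t
"""
UNION_JOIN = """
select count(*) from (
  select distinct t2.id, t3.c1
  from (select c1 from t0705 union all select c2 from t0705) t1,
       t07051 t2, t07052 t3
  where t1.c1 = t2.c1 and t1.c1 = t3.c1 and t2.id = t3.id) t
"""

def count(sql):
    return conn.execute(sql).fetchone()[0]

before = (count(OR_JOIN), count(UNION_JOIN))

# Invented mixed case: for t0705 row 1, t2.c1 = 'A1' (its c1) and t3.c1 = 'B1' (its c2).
conn.execute("insert into t07051 values (100, 'A1', 'x')")
conn.execute("insert into t07052 values (100, 'B1', 'x')")
after = (count(OR_JOIN), count(UNION_JOIN))

print(before, after)
```

On the miniature data both forms agree. Once the mixed row is added, the OR form counts it and the single-column UNION ALL form does not, because the rewrite forces t2.c1 and t3.c1 to equal the same value. The rewrite is therefore safe only when such mixed matches cannot occur or are not wanted.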

Statement 2:

select count(*) from t07051 t1,t07052 t2 where exists (select 1 from t0705 t where (t.c1=t1.c1 or t.c2=t1.c1) and (t.c1=t2.c1 or t.c2=t2.c1))
and t1.id=t2.id;

Plan:

Applying the same rewrite as for statement 1 gives:

select count(*) from t07051 t1,t07052 t2 where exists 
(select 1 from (select c1 from t0705 t 
union all
select c2 from t0705 t
) t 
where (t.c1=t1.c1 ) and (t.c1=t2.c1 ))
 and t1.id=t2.id;

Plan:

We can also use UNION ALL to split the OR into separate branches:

select count(*) from t07051 t1,t07052 t2 where
exists (select 1 from t0705 t where (t.c1=t1.c1) and (t.c1=t2.c1)
union all
select 1 from t0705 t where (t.c2=t1.c1) and (t.c2=t2.c1))
and t1.id=t2.id;

Plan:

Building on this second rewrite, the EXISTS (... UNION ALL ...) form can in turn be rewritten as EXISTS ... OR EXISTS ...:

select count(*) from t07051 t1,t07052 t2 where (
exists (select 1 from t0705 t where (t.c1=t1.c1) and (t.c1=t2.c1)
)or exists (
select 1 from t0705 t where (t.c2=t1.c1) and (t.c2=t2.c1)))
and t1.id=t2.id;

Plan:

The rewritten statements are roughly twice as fast as the original OR join.
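
The four forms of statement 2 can be compared side by side. The sketch below again uses Python's `sqlite3` as a stand-in engine with a hypothetical 20-row miniature of the article's tables. It assumes data in which no t0705 row matches t1.c1 through one column and t2.c1 through the other, since the three rewrites do not cover that combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t0705  (id integer primary key, c1 text, c2 text);
create table t07051 (id integer primary key, c1 text, c2 text);
create table t07052 (id integer primary key, c1 text, c2 text);
""")
# Hypothetical miniature data: c1 = 'A<n>', c2 = 'B<n>'.
base = [(n, f"A{n}", f"B{n}") for n in range(1, 21)]
conn.executemany("insert into t0705 values (?,?,?)", base)
conn.executemany("insert into t07051 values (?,?,?)", base)
conn.executemany(
    "insert into t07052 values (?,?,?)",
    [(n, f"A{n}", f"B{n}") for n in range(1, 6)]
    + [(n + 5, f"B{n}", f"B{n}") for n in range(1, 16)],
)

HEAD = "select count(*) from t07051 t1, t07052 t2 where "
TAIL = " and t1.id = t2.id"
variants = {
    "or join": """exists (select 1 from t0705 t
        where (t.c1 = t1.c1 or t.c2 = t1.c1)
          and (t.c1 = t2.c1 or t.c2 = t2.c1))""",
    "union all column": """exists (select 1 from
        (select c1 from t0705 union all select c2 from t0705) t
        where t.c1 = t1.c1 and t.c1 = t2.c1)""",
    "union all branches": """exists (
        select 1 from t0705 t where t.c1 = t1.c1 and t.c1 = t2.c1
        union all
        select 1 from t0705 t where t.c2 = t1.c1 and t.c2 = t2.c1)""",
    "exists or exists": """(
        exists (select 1 from t0705 t where t.c1 = t1.c1 and t.c1 = t2.c1)
        or exists (select 1 from t0705 t where t.c2 = t1.c1 and t.c2 = t2.c1))""",
}
# Run every variant and collect its row count.
counts = {name: conn.execute(HEAD + cond + TAIL).fetchone()[0]
          for name, cond in variants.items()}
print(counts)
```

All four variants return the same count on this data, which is a useful sanity check to repeat on real data before swapping one form for another.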

3. Summary

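Whether an OR join is expanded into branches or evaluated as a whole, it is not efficient, so in tuning we usually replace the OR with a UNION ALL form. Keep in mind that the single-column rewrites above require both join predicates to match the same t0705 value, so they drop mixed matches (for example t1.c1=t2.c1 together with t1.c2=t3.c1 in statement 1); they return the same counts here only because the test data contains no such rows. A few more points to note: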

(1) The following case is not equivalent

create table t1 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t1 values(1,'AA','BB');
insert into t1 values(2,'CC','AA');

commit;
drop table if exists t2;
create table t2 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t2 values(1,'AA','BB');
insert into t2 values(2,'AB','BA');
insert into t2 values(3,'BB','AA');
commit;
drop table if exists t3;
create table t3 (id int primary key,c1 varchar2(20),c2 varchar2(20));
insert into t3 values(1,'AA','BB');
insert into t3 values(2,'AB','BA');
insert into t3 values(3,'BB','AA');
commit;

select t1.c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from t1,t2,t3 where t2.id=t3.id and (t1.c1=t2.c1 or t1.c2=t2.c1)  
AA	AA	AA
AA	BB	BB
CC	AA	AA
is not equivalent to
select t1.c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from (select c1 from t1
union all
select c2  from t1
) t1
,t2,t3 where t2.id=t3.id and t1.c1=t2.c1
AA	AA	AA
BB	BB	BB
AA	AA	AA

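The query here has to display t1.c1, but in the rewritten query c1 is the UNION ALL of t1.c1 and t1.c2, so it is no longer the real c1. An extra column is needed to carry the real c1 value through.

So the rewrite becomes: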

select t1c1 as t1c1,t2.c1 as t2c1,t3.c1 as t3c1 from (select c1,c1 as t1c1 from t1
union all
select c2,c1 as t1c1  from t1
) t1
,t2,t3 where t2.id=t3.id and t1.c1=t2.c1
AA	AA	AA
AA	BB	BB
CC	AA	AA
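
The three result sets above can be reproduced with a short script. This is a sketch using Python's `sqlite3` as a stand-in engine; the table contents are the article's, while the alias `u` for the UNION ALL subquery and the `order by`-free sorting in Python are choices made here to keep the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1 (id integer primary key, c1 text, c2 text);
insert into t1 values (1,'AA','BB'), (2,'CC','AA');
create table t2 (id integer primary key, c1 text, c2 text);
insert into t2 values (1,'AA','BB'), (2,'AB','BA'), (3,'BB','AA');
create table t3 (id integer primary key, c1 text, c2 text);
insert into t3 values (1,'AA','BB'), (2,'AB','BA'), (3,'BB','AA');
""")

def rows(sql):
    # Sort in Python so the comparison does not depend on row order.
    return sorted(conn.execute(sql).fetchall())

or_rows = rows("""
select t1.c1, t2.c1, t3.c1 from t1, t2, t3
where t2.id = t3.id and (t1.c1 = t2.c1 or t1.c2 = t2.c1)""")

# Naive rewrite: the displayed column is the merged c1/c2 value.
naive_rows = rows("""
select u.c1, t2.c1, t3.c1
from (select c1 from t1 union all select c2 from t1) u, t2, t3
where t2.id = t3.id and u.c1 = t2.c1""")

# Fixed rewrite: carry the real t1.c1 along as an extra column.
fixed_rows = rows("""
select u.t1c1, t2.c1, t3.c1
from (select c1, c1 as t1c1 from t1
      union all
      select c2, c1 as t1c1 from t1) u, t2, t3
where t2.id = t3.id and u.c1 = t2.c1""")

print(or_rows)
print(naive_rows)
print(fixed_rows)
```

The naive rewrite returns the same number of rows but different values in the first column; the version with the extra `t1c1` column matches the OR query.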

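When rewriting OR as UNION ALL, pay attention to what the result actually needs. Also, UNION ALL does not remove duplicates, which differs from OR: a row that satisfies both branches of the OR appears once in the OR query but twice after the UNION ALL rewrite.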
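
The duplicate issue is easy to see on a single row whose two columns hold the same value. The following sketch uses Python's `sqlite3` as a stand-in engine; the two tiny tables are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1 (id integer primary key, c1 text, c2 text);
create table t2 (id integer primary key, c1 text);
insert into t1 values (1, 'AA', 'AA');  -- both columns match t2.c1
insert into t2 values (1, 'AA');
""")

def scalar(sql):
    return conn.execute(sql).fetchone()[0]

# OR join: the single t1 row matches once.
or_count = scalar(
    "select count(*) from t1, t2 where t1.c1 = t2.c1 or t1.c2 = t2.c1")

# UNION ALL rewrite: the same t1 row contributes one row per column.
union_all_count = scalar("""
select count(*)
from (select c1 from t1 union all select c2 from t1) u, t2
where u.c1 = t2.c1""")

# Deduplicating on the outer key hides the extra row, as the DISTINCT
# in statement 1 and the EXISTS in statement 2 do.
dedup_count = scalar("""
select count(*) from (
  select distinct t2.id
  from (select c1 from t1 union all select c2 from t1) u, t2
  where u.c1 = t2.c1) d""")

print(or_count, union_all_count, dedup_count)
```

Whether the extra row matters depends on the query: under DISTINCT or EXISTS it is harmless, while a plain count or sum over the join would change.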

(2) OR has lower precedence than AND. If the OR must be evaluated first, remember to wrap it in parentheses.
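
A small sketch of the precedence rule, run through Python's `sqlite3` as a stand-in engine; the three-row table `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id integer primary key, c1 text, c2 text);
insert into t values (1, 'A', 'X'), (2, 'Y', 'B'), (3, 'A', 'B');
""")

def ids(where):
    # Return the matching ids in a fixed order.
    sql = f"select id from t where {where} order by id"
    return [r[0] for r in conn.execute(sql)]

# AND binds tighter, so this reads as: c1 = 'A' or (c2 = 'B' and id = 2)
no_parens = ids("c1 = 'A' or c2 = 'B' and id = 2")

# Parentheses force the OR to be evaluated first.
with_parens = ids("(c1 = 'A' or c2 = 'B') and id = 2")

print(no_parens, with_parens)
```

Without parentheses the filter on id applies only to the second branch, so rows 1 and 3 slip through; with parentheses only row 2 remains.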

(3) The end goal of a rewrite is a statement whose performance does not depend on optimizer parameter settings.
