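SQL in the DBA's Hands - Rewriting

Background

Our operations team needs a monthly report that summarizes transaction activity. Until now they produced the figures by hand, and they would like the report to be emailed automatically at the start of each month to reduce their workload. They supplied the SQL, and we configured a scheduled send job on the mail server.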

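Tables (table and column names have been anonymized)

trans_profits
Gross-profit table: records only the daily gross-profit figures
trans_offline_order
Offline order table: records offline orders
trans_online_order
Online order table: records online orders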

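The SQL "Makeover"

Original SQL

Drawback: hard to read, with subqueries nested inside subqueries
What the query does: it combines the transaction count and transaction amount from the offline and online order tables, then joins that result to the gross-profit table on the month derived from the transaction time. It displays the transaction count (交易笔数), transaction amount (交易金额), gross-profit amount (毛利金额) and month (月).
-- Note: the online and offline order tables hold raw data, while the gross-profit table holds already-aggregated data, so the gross-profit table does not need a count(*) to contribute to the transaction count.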

select d.month as 月,
       round(s.count / 10000, 2) || '万' as 交易笔数, -- '万' = ten thousand (display unit)
       round(s.amt / 10000, 2) || '万' as 交易金额,
       round(d.profits_amt / 10000, 2) || '万' as 毛利金额
  from (SELECT to_char(trans_time, 'yyyyMM') as month,
               sum(profits_amt) as profits_amt
          FROM trans_profits -- gross-profit table
         where trans_time >= to_date('20240101', 'yyyyMMdd')
           -- the upper bound is exclusive, so 2024-12-31 itself is not included
           and trans_time < to_date('20241231', 'yyyyMMdd')
         group by to_char(trans_time, 'yyyyMM')) d
  left join (select month,
                    sum(count) as count,
                    sum(amt) as amt
               from (SELECT to_char(trans_time, 'yyyyMM') as month,
                            count(1) as count,
                            sum(trans_amt) as amt
                       FROM trans_offline_order -- offline order table
                      where trans_cd = '00'
                        and trans_time >= to_TIMESTAMP('20240101', 'yyyyMMdd')
                        and trans_time < to_TIMESTAMP('20241231', 'yyyyMMdd')
                      group by to_char(trans_time, 'yyyyMM')
                     union all
                     SELECT to_char(trans_time, 'yyyyMM') as month,
                            count(1) as count,
                            sum(trans_amt) AS amt
                       FROM trans_online_order -- online order table
                      WHERE trans_type IN ('01', '02')
                        and trans_cd = '00'
                        and trans_time >= to_TIMESTAMP('20240101', 'yyyyMMdd')
                        and trans_time < to_TIMESTAMP('20241231', 'yyyyMMdd')
                      group by to_char(trans_time, 'yyyyMM')) t
              group by month) s
    on d.month = s.month
 order by 1;
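The query filters each table with a half-open time window (>= start, < end). A small sketch of how the choice of the exclusive upper bound affects the last day of the year is shown here. It uses Python's built-in sqlite3 module as a stand-in for the production database, stores timestamps as ISO text, and uses made-up sample rows; all of these are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans_profits (trans_time TEXT, profits_amt REAL)")
conn.executemany(
    "INSERT INTO trans_profits VALUES (?, ?)",
    [
        ("2024-12-30 10:00:00", 100.0),
        ("2024-12-31 09:30:00", 50.0),   # falls on the last day of the year
        ("2025-01-01 00:00:00", 999.0),  # belongs to the following year
    ],
)


def total_before(upper_bound):
    """Sum profits_amt for rows in the half-open window [2024-01-01, upper_bound)."""
    row = conn.execute(
        "SELECT SUM(profits_amt) FROM trans_profits "
        "WHERE trans_time >= '2024-01-01' AND trans_time < ?",
        (upper_bound,),
    ).fetchone()
    return row[0]


ends_dec_31 = total_before("2024-12-31")  # same shape as the bound in the query above
ends_jan_01 = total_before("2025-01-01")  # first instant of the following year

print(ends_dec_31, ends_jan_01)
```

With the upper bound at 31 December, the row from that day is left out; moving the bound to 1 January of the next year picks it up without letting any next-year rows in. Whether the report should cover 31 December is a question for the requirement owner.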
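
Rewritten ("Makeover") SQL

Advantage: the query is concise and easy to follow
What the query does: it merges the rows of the online, offline and gross-profit tables with UNION ALL. To compute the transaction count, the online and offline branches add a synthetic column ct with the value 1 as a marker; the gross-profit branch does not contribute to the count, so its ct is 0. In the same way, each branch fills the amount column it does not own with 0. Summing ct in the final aggregation then gives the transaction count.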

SELECT
       substr(t.trans_time, 0, 6) 月,
       round(sum(ct) / 10000, 2) || '万' as 交易笔数, -- '万' = ten thousand (display unit)
       round(sum(trans_amt) / 10000, 2) || '万' as 交易金额,
       round(sum(profits_amt) / 10000, 2) || '万' as 毛利金额
  FROM (
        SELECT to_char(trans_time, 'yyyymmdd') trans_time,
               1 ct, -- each order row counts as one transaction
               trans_amt,
               0 profits_amt
          FROM trans_offline_order -- offline order table
         where trans_cd = '00'
           and trans_time >= to_TIMESTAMP('20240101', 'yyyyMMdd')
           -- the upper bound is exclusive, so 2024-12-31 itself is not included
           and trans_time < to_TIMESTAMP('20241231', 'yyyyMMdd')
        union all
        SELECT to_char(trans_time, 'yyyymmdd') trans_time,
               1 ct,
               trans_amt,
               0 profits_amt
          FROM trans_online_order -- online order table
         WHERE trans_type IN ('01', '02')
           and trans_cd = '00'
           and trans_time >= to_TIMESTAMP('20240101', 'yyyyMMdd')
           and trans_time < to_TIMESTAMP('20241231', 'yyyyMMdd')
        union all
        SELECT to_char(trans_time, 'yyyymmdd') trans_time,
               0 ct, -- gross-profit rows do not add to the transaction count
               0 trans_amt,
               profits_amt
          FROM trans_profits -- gross-profit table
         where trans_time >= to_date('20240101', 'yyyyMMdd')
           and trans_time < to_date('20241231', 'yyyyMMdd')
       ) t
 GROUP BY substr(t.trans_time, 0, 6)
 ORDER BY 1 ;
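One way to gain confidence that the marker-column form returns the same figures as the join form is to run both shapes on a small data set. The sketch below does that with Python's built-in sqlite3 module. It is a simplified stand-in, not the production queries: the sample rows are invented, dates are stored as yyyymmdd text, substr replaces to_char, the date filter and the rounding are left out, and the alias cnt replaces count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the real database (assumption)
conn.executescript("""
CREATE TABLE trans_offline_order (trans_time TEXT, trans_cd TEXT, trans_amt REAL);
CREATE TABLE trans_online_order  (trans_time TEXT, trans_cd TEXT, trans_type TEXT, trans_amt REAL);
CREATE TABLE trans_profits       (trans_time TEXT, profits_amt REAL);

-- Hypothetical sample rows; the '99' and '03' rows must be filtered out.
INSERT INTO trans_offline_order VALUES
  ('20240105', '00', 100), ('20240120', '00', 200),
  ('20240210', '00', 300), ('20240211', '99', 999);
INSERT INTO trans_online_order VALUES
  ('20240107', '00', '01', 10), ('20240215', '00', '02', 20),
  ('20240216', '00', '03', 999);
INSERT INTO trans_profits VALUES ('20240131', 30), ('20240229', 40);
""")

# Shape of the original query: aggregate each source, then LEFT JOIN on month.
JOIN_FORM = """
SELECT d.month, s.cnt, s.amt, d.profits_amt
FROM (SELECT substr(trans_time, 1, 6) AS month, SUM(profits_amt) AS profits_amt
      FROM trans_profits
      GROUP BY substr(trans_time, 1, 6)) d
LEFT JOIN (SELECT month, SUM(cnt) AS cnt, SUM(amt) AS amt
           FROM (SELECT substr(trans_time, 1, 6) AS month,
                        COUNT(1) AS cnt, SUM(trans_amt) AS amt
                 FROM trans_offline_order
                 WHERE trans_cd = '00'
                 GROUP BY substr(trans_time, 1, 6)
                 UNION ALL
                 SELECT substr(trans_time, 1, 6), COUNT(1), SUM(trans_amt)
                 FROM trans_online_order
                 WHERE trans_type IN ('01', '02') AND trans_cd = '00'
                 GROUP BY substr(trans_time, 1, 6)) t
           GROUP BY month) s ON d.month = s.month
ORDER BY 1
"""

# Shape of the rewritten query: UNION ALL with marker columns, one GROUP BY.
UNION_FORM = """
SELECT substr(trans_time, 1, 6) AS month,
       SUM(ct), SUM(trans_amt), SUM(profits_amt)
FROM (SELECT trans_time, 1 AS ct, trans_amt, 0 AS profits_amt
      FROM trans_offline_order
      WHERE trans_cd = '00'
      UNION ALL
      SELECT trans_time, 1, trans_amt, 0
      FROM trans_online_order
      WHERE trans_type IN ('01', '02') AND trans_cd = '00'
      UNION ALL
      SELECT trans_time, 0, 0, profits_amt
      FROM trans_profits) t
GROUP BY substr(trans_time, 1, 6)
ORDER BY 1
"""

join_rows = conn.execute(JOIN_FORM).fetchall()
union_rows = conn.execute(UNION_FORM).fetchall()
print(join_rows == union_rows)
for row in union_rows:
    print(row)
```

The two forms agree here because every month appears in all three tables. If a month had orders but no row in trans_profits, the LEFT JOIN form would drop that month while the UNION ALL form would keep it with a gross profit of 0, so that edge case is worth checking against real data before switching.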

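Execution Plan Comparison

The Statistics section shows the same resource consumption for both queries.
The Rows, Bytes, Cost (%CPU) and Time columns of the plan are clearly better for the rewritten SQL; the plan of the original SQL even incurs TempSpc usage.
The rewritten SQL ran a little over 10 ms slower, which has little practical impact.
-- Note (open question): the execution plan suggests the rewritten SQL is the better one, so why did it run slower? One likely factor is that Rows, Bytes, Cost and Time in the plan are optimizer estimates rather than measurements, and a gap of about 10 ms is small enough to come from run-to-run variation such as caching, so repeated runs would be needed to confirm a real difference.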

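Summary

For other departments, SQL is mainly a means of meeting a requirement. In the DBA's hands, it should also be made more readable and better performing, without changing the result the requirement asks for.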