WITH AS (CTE) 优化改写原理
WITH AS 即公用表表达式(CTE, Common Table Expression) ,核心原理是:将子查询结果集临时存储在内存或临时段中,可被主查询多次引用。
一、核心工作原理
sql
-- 原始写法:子查询执行2次
SELECT * FROM (SELECT * FROM orders WHERE status='ACTIVE') a
UNION ALL
SELECT * FROM (SELECT * FROM orders WHERE status='ACTIVE') b;
-- CTE写法:只执行1次,结果复用
WITH active_orders AS (
SELECT * FROM orders WHERE status='ACTIVE'
)
SELECT * FROM active_orders
UNION ALL
SELECT * FROM active_orders;
执行流程:
-
物化阶段:执行CTE中的查询,将结果集存入临时工作区
-
引用阶段:主查询中每次引用CTE,直接读取临时结果集
-
清理阶段:主查询执行完毕后,释放临时空间
二、关键优化场景
场景1:多表重复扫描
sql
-- ❌ 低效:同一张表扫描3次
SELECT
(SELECT COUNT(*) FROM sales WHERE year=2024) AS total,
(SELECT AVG(amount) FROM sales WHERE year=2024) AS avg_amount,
(SELECT SUM(amount) FROM sales WHERE year=2024) AS sum_amount;
-- ✅ 高效:只扫描1次
WITH sales_2024 AS (
SELECT amount FROM sales WHERE year=2024
)
SELECT
(SELECT COUNT(*) FROM sales_2024) AS total,
(SELECT AVG(amount) FROM sales_2024) AS avg_amount,
(SELECT SUM(amount) FROM sales_2024) AS sum_amount;
场景2:递归查询(CTE独有功能)
sql
-- 组织架构树查询
WITH RECURSIVE org_tree AS (
-- 锚点:根节点
SELECT id, name, parent_id, 1 AS level
FROM organization WHERE parent_id IS NULL
UNION ALL
-- 递归:向下遍历
SELECT o.id, o.name, o.parent_id, t.level + 1
FROM organization o
INNER JOIN org_tree t ON o.parent_id = t.id
)
SELECT * FROM org_tree;
场景3:复杂分页优化
sql
-- 先CTE过滤和排序,再分页
WITH ranked_data AS (
SELECT t.*, ROW_NUMBER() OVER (ORDER BY create_time DESC) AS rn
FROM huge_table t
WHERE status = 'ACTIVE'
)
SELECT * FROM ranked_data WHERE rn BETWEEN 1 AND 100;
三、优化原理深度解析
| 优化机制 | 说明 | 适用场景 |
|---|---|---|
| 结果物化 | 将中间结果存入临时表,避免重复计算 | 子查询被多次引用 |
| 查询合并 | 优化器可将CTE与外层查询合并执行计划 | 单次引用的简单CTE |
| 递归迭代 | 支持树形/图形数据的遍历查询 | 组织架构、BOM物料清单 |
| 逻辑封装 | 将复杂逻辑拆解为多个语义清晰的步骤 | 长SQL可维护性改造 |
四、Oracle优化器特殊行为
sql
-- Oracle中可控制物化行为
WITH /*+ MATERIALIZE */ cache_data AS ( -- 强制物化
SELECT * FROM large_table WHERE condition
)
SELECT * FROM cache_data;
WITH /*+ INLINE */ cache_data AS ( -- 强制不物化,直接合并
SELECT * FROM large_table WHERE condition
)
SELECT * FROM cache_data;
Oracle优化器决策规则:
-
默认情况下,CTE只被引用一次 → 可能不物化(直接合并)
-
CTE被引用多次 → 自动物化
-
CTE结果集很大 → 物化到临时段
-
CTE结果集很小 → 物化到内存(PGA)
五、针对您原SQL的CTE改写示例
sql
WITH base_data AS (
-- 核心关联数据:一次性获取所有基础字段
SELECT
a.FLOW_NO, a.FLOW_TITLE, a.SERVICE_TYPE, a.END_TIME,
a.SEND_DEPT, a.SEND_MAN, a.SEND_TIME, a.FLOW_ID,
b.DISMANTLE_TYPE, b.TASK_DESC, b.START_TIME, b.REPLY_TIME,
b.DEAL_TIME, b.SERVICE_TYPE_ID, b.SPECIAL_TYPE_ID,
b.APP_ORDER_NUMBER, b.SERVICE_COMPANY, b.BELONG_UNIT,
b.PROJECT_NAME, b.SPECIAL_TYPE, b.ISAUTO,
b.JC_KH_SCORE, b.JC_KHSCORE_FEE, b.PLANTIME, b.FINISHTIME,
b.DEGREE, b.PRICE, b.TOTAL, b.MONEY, b.BUDGET_DEPT, b.DUTY_DEPT,
b.ISJK, b.SERVICE_COMPANY_ID
FROM t_BPM_FORM_INFO a
INNER JOIN T_BNS_SERVICE_BYNUM_jc b
ON a.FLOW_ID = b.FLOW_ID
AND a.FLOW_NO = b.APP_ORDER_NUMBER
WHERE a.SEND_TIME >= TO_DATE('2021/01/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss')
AND a.APP_TYPE = 'netMaintain'
AND a.CURRENT_STATE <> '4'
AND b.SPECIAL_TYPE_ID <> '009'
AND b.SERVICE_COMPANY_ID <> '2521'
AND b.IS_ZH IS NULL
),
-- 代维回单时间子查询(只执行1次)
deal_time_info AS (
SELECT
DISORDERNUMBER,
TO_CHAR(MAX(CREATIONTIME), 'yyyy-mm-dd hh24:mi:ss') AS max_creation_time
FROM authdb.ufp_general_info_table
WHERE NODECODE = 'clz'
GROUP BY DISORDERNUMBER
),
-- 现场完成时间子查询
field_time_info AS (
SELECT FLOW_ID, TO_CHAR(ETIME, 'yyyy-mm-dd hh24:mi:ss') AS etime_str
FROM df_task
),
-- 当前环节子查询
current_step_info AS (
SELECT JOBCODE, ACTDEFNAME
FROM authdb.ufp_pending_worke
WHERE (JOBCODE, ACTDEFNAME) IN (
SELECT JOBCODE, MIN(ACTDEFNAME)
FROM authdb.ufp_pending_worke
GROUP BY JOBCODE
)
)
-- 主查询:直接JOIN CTE结果
SELECT
b.FLOW_NO AS 工单编号,
b.FLOW_TITLE AS 工单主题,
CASE WHEN b.SERVICE_TYPE IN ('室分退网', '基站退网') THEN b.DISMANTLE_TYPE ELSE '' END AS 整站拆除或整逻辑站拆除,
b.TASK_DESC AS 派单描述,
TO_DATE(b.START_TIME, 'yyyy/MM/dd hh24:mi:ss') AS 派单时间,
TO_DATE(b.REPLY_TIME, 'yyyy/MM/dd hh24:mi:ss') AS 处理时限,
COALESCE(b.DEAL_TIME, d.max_creation_time) AS 代维回单时间,
CASE
WHEN b.SERVICE_TYPE_ID = '14' THEN f.etime_str
WHEN b.SPECIAL_TYPE_ID = '002' AND b.SERVICE_TYPE_ID = '13' THEN b.DEAL_TIME
ELSE '--'
END AS 现场完成时间,
-- ... 其他字段
COALESCE(c.ACTDEFNAME, '已归档') AS 当前环节,
CASE
WHEN (b.DEAL_TIME IS NULL AND b.REPLY_TIME < TO_CHAR(SYSDATE, 'yyyy-MM-dd hh24:mi:ss') AND b.ISJK IS NULL)
OR (b.DEAL_TIME IS NOT NULL AND b.REPLY_TIME < b.DEAL_TIME)
THEN '是' ELSE '否'
END AS 是否超时
FROM base_data b
LEFT JOIN deal_time_info d ON b.APP_ORDER_NUMBER = d.DISORDERNUMBER
LEFT JOIN field_time_info f ON b.APP_ORDER_NUMBER = f.FLOW_ID
LEFT JOIN current_step_info c ON b.FLOW_NO = c.JOBCODE
ORDER BY b.FLOW_NO;
六、性能对比总结
| 维度 | 原SQL(标量子查询) | CTE改写后 |
|---|---|---|
ufp_general_info_table 扫描次数 |
每行1次 | 1次(全表+聚合) |
df_task 扫描次数 |
每行1次 | 1次(全表) |
ufp_pending_worke 扫描次数 |
每行1次 | 1次(全表) |
| 适用数据量 | < 1000行 | > 10000行 |
| 可读性 | 较低 | 高(逻辑分层清晰) |
核心结论: CTE通过空间换时间 的方式,将多次重复的子查询扫描转化为单次物化+多次读取,适合子查询被频繁引用且结果集可复用的场景。