SELECT
b.COMPANY_ID,
a.TRD_DT,
d.NET_PROF,
d.OP_TINC,
d.END_DT,
ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT ORDER BY d.END_DT DESC) AS rn
FROM STK_HK_EXCH_QUOT a
JOIN BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN COO_HK_INC_STMT d ON b.COMPANY_ID = d.CORP_ID
WHERE d.END_DT <= a.TRD_DT
AND d.DATA_FLAG_CD = 102
AND d.NET_PROF IS NOT NULL AND d.OP_TINC IS NOT NULL ;
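This is Oracle SQL. Please add join hints to optimize the joins between the large and small tables. Tables a and b are in the millions of rows (a has 8M+, b has 2M+), and table d is in the hundreds of thousands.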
For your data volumes (a: 8M+ rows, b: 2M+ rows, d: 100K+ rows), here are several optimization options:
Option 1: LEADING and USE_HASH hints (recommended)
SELECT /*+ LEADING(a b d) USE_HASH(b d) */
b.COMPANY_ID,
a.TRD_DT,
d.NET_PROF,
d.OP_TINC,
d.END_DT,
ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT ORDER BY d.END_DT DESC) AS rn
FROM STK_HK_EXCH_QUOT a
JOIN BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN COO_HK_INC_STMT d ON b.COMPANY_ID = d.CORP_ID
WHERE d.END_DT <= a.TRD_DT
AND d.DATA_FLAG_CD = 102
AND d.NET_PROF IS NOT NULL
AND d.OP_TINC IS NOT NULL;
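The query computes rn but does not filter on it. If the goal is to keep only the most recent income statement per company and trading day (an assumption; the question does not say so), a minimal sketch is to wrap the hinted query in an inline view and keep rn = 1:

```sql
-- Sketch only: assumes the caller wants the latest statement per (COMPANY_ID, TRD_DT).
-- USE_HASH lists the tables to be hash-joined; with a leading, those are b and d.
SELECT COMPANY_ID, TRD_DT, NET_PROF, OP_TINC, END_DT
FROM (
    SELECT /*+ LEADING(a b d) USE_HASH(b d) */
           b.COMPANY_ID,
           a.TRD_DT,
           d.NET_PROF,
           d.OP_TINC,
           d.END_DT,
           ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT
                              ORDER BY d.END_DT DESC) AS rn
    FROM   STK_HK_EXCH_QUOT a
    JOIN   BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
    JOIN   COO_HK_INC_STMT d    ON b.COMPANY_ID = d.CORP_ID
    WHERE  d.END_DT <= a.TRD_DT
    AND    d.DATA_FLAG_CD = 102
    AND    d.NET_PROF IS NOT NULL
    AND    d.OP_TINC IS NOT NULL
)
WHERE rn = 1;  -- one row per partition; ties on END_DT are broken arbitrarily
```

The hints do not reduce the number of rows that reach the window sort, so the filter only shrinks what is returned to the caller.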
Option 2: finer-grained join control
SELECT /*+ LEADING(a) USE_HASH(b) USE_HASH(d) SWAP_JOIN_INPUTS(d) */
b.COMPANY_ID,
a.TRD_DT,
d.NET_PROF,
d.OP_TINC,
d.END_DT,
ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT ORDER BY d.END_DT DESC) AS rn
FROM STK_HK_EXCH_QUOT a
JOIN BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN COO_HK_INC_STMT d ON b.COMPANY_ID = d.CORP_ID
WHERE d.END_DT <= a.TRD_DT
AND d.DATA_FLAG_CD = 102
AND d.NET_PROF IS NOT NULL
AND d.OP_TINC IS NOT NULL;
Option 3: parallel full scans (if hardware resources allow; also usable when the tables are partitioned)
SELECT /*+ LEADING(a) FULL(a) FULL(b) FULL(d) USE_HASH(b d) PARALLEL(4) */
b.COMPANY_ID,
a.TRD_DT,
d.NET_PROF,
d.OP_TINC,
d.END_DT,
ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT ORDER BY d.END_DT DESC) AS rn
FROM STK_HK_EXCH_QUOT a
JOIN BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN COO_HK_INC_STMT d ON b.COMPANY_ID = d.CORP_ID
WHERE d.END_DT <= a.TRD_DT
AND d.DATA_FLAG_CD = 102
AND d.NET_PROF IS NOT NULL
AND d.OP_TINC IS NOT NULL;
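Key optimization points:
- LEADING hint: forces the join order.
  - The suggested order starts from a, the largest table, then joins b, and finally d.
- USE_HASH hint: requests hash joins for the high-volume joins.
  - Hash joins suit large equality joins. Here the equality keys are a.SECU_ID = b.SECURITY_ID and b.COMPANY_ID = d.CORP_ID; the range condition d.END_DT <= a.TRD_DT is applied as a filter on the joined rows.
- SWAP_JOIN_INPUTS(d): makes d the build table of its hash join.
  - d is the smallest table (100K+ rows), so it is the natural build side for the hash table.
- Index suggestions (if performance is still not good enough):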
-- Index for table a
CREATE INDEX IDX_STK_HK_EXCH_QUOT_SECU_ID ON STK_HK_EXCH_QUOT(SECU_ID, TRD_DT);
-- Index for table b
CREATE INDEX IDX_BASE_SEC_INFO_SEC_ID ON BASE_SECURITY_INFO(SECURITY_ID, COMPANY_ID);
-- Index for table d (Oracle's CREATE INDEX has no WHERE clause, so the NOT NULL filters stay in the query)
CREATE INDEX IDX_COO_HK_INC_STMT_CORP ON COO_HK_INC_STMT(CORP_ID, END_DT, DATA_FLAG_CD);
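To check that hints such as LEADING, USE_HASH and SWAP_JOIN_INPUTS are actually honored, one approach is to generate the plan and read the join order and build side from it. The sketch below uses the Option 2 statement and the standard DBMS_XPLAN.DISPLAY function; it assumes the session has access to a plan table, which is the default in most installations.

```sql
-- Store the plan for the hinted statement without executing it.
EXPLAIN PLAN FOR
SELECT /*+ LEADING(a) USE_HASH(b) USE_HASH(d) SWAP_JOIN_INPUTS(d) */
       b.COMPANY_ID,
       a.TRD_DT,
       d.NET_PROF,
       d.OP_TINC,
       d.END_DT,
       ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT
                          ORDER BY d.END_DT DESC) AS rn
FROM   STK_HK_EXCH_QUOT a
JOIN   BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN   COO_HK_INC_STMT d    ON b.COMPANY_ID = d.CORP_ID
WHERE  d.END_DT <= a.TRD_DT
AND    d.DATA_FLAG_CD = 102
AND    d.NET_PROF IS NOT NULL
AND    d.OP_TINC IS NOT NULL;

-- Show the stored plan. Under each HASH JOIN operation, the first child
-- row source is the build side and the second child is the probe side.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If COO_HK_INC_STMT does not appear as the first child of its HASH JOIN, the SWAP_JOIN_INPUTS hint was probably not applied, and the hint text or table aliases are worth rechecking.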
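Other suggestions:
- Check the execution plan first: use EXPLAIN PLAN FOR to compare the plans produced by the different hints.
- Consider processing in steps: if the result set is still very large, compute it in stages.
- Gather statistics regularly: make sure the optimizer statistics are accurate.
- Use parallel query: if hardware resources allow, consider adding a PARALLEL hint.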
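Two of the suggestions above, gathering statistics and processing in steps, might look like the following sketch. It assumes the three tables live in the current schema, and TMP_HK_INC_STMT is a hypothetical staging table name, not something from the original query.

```sql
-- Refresh optimizer statistics on the three tables (assumes they are in the current schema).
BEGIN
    DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'STK_HK_EXCH_QUOT');
    DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'BASE_SECURITY_INFO');
    DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'COO_HK_INC_STMT');
END;
/

-- Step 1: stage only the income-statement rows and columns the query needs.
-- TMP_HK_INC_STMT is a hypothetical name chosen for this example.
CREATE TABLE TMP_HK_INC_STMT AS
SELECT CORP_ID, END_DT, NET_PROF, OP_TINC
FROM   COO_HK_INC_STMT
WHERE  DATA_FLAG_CD = 102
AND    NET_PROF IS NOT NULL
AND    OP_TINC IS NOT NULL;

-- Step 2: run the join against the smaller staged table.
SELECT /*+ LEADING(a b d) USE_HASH(b d) */
       b.COMPANY_ID,
       a.TRD_DT,
       d.NET_PROF,
       d.OP_TINC,
       d.END_DT,
       ROW_NUMBER() OVER (PARTITION BY b.COMPANY_ID, a.TRD_DT
                          ORDER BY d.END_DT DESC) AS rn
FROM   STK_HK_EXCH_QUOT a
JOIN   BASE_SECURITY_INFO b ON a.SECU_ID = b.SECURITY_ID
JOIN   TMP_HK_INC_STMT d    ON b.COMPANY_ID = d.CORP_ID
WHERE  d.END_DT <= a.TRD_DT;
```

Staging helps most when the filters on d remove a large share of its rows; whether that is the case here depends on the data, so it is worth comparing plans and timings before adopting it.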
I suggest testing Option 1 first, checking its execution plan, and then adjusting based on what you observe.