背景
- 有三张表ltd1 、ltd0051和、ltd0011
- ltd1作为主表,左关联 ltd0051和ltd0011
- 如果ltd0051有两条重复数据、td0011有两条重复数据,左关联之后就会得到4条,同时ltd0051和ltd0011这两条数据都是正确,基于主键我们需要将两个相同主键的数据合并成一条
ltd0051和ltd0011 中是存在重复数据的,ltd0051重复数据的判定规则:planid和batchno,MatCode相同的判定为重复数据
ltd0011重复数据的判定规则:planid和batchno,MatCode相同的判定为重复数据
第一步:实现需求
使用了两个子查询,对重复数据进行加和,得到两个新表
sql
SELECT
ltd1.plan_id_out,
ltd1.batchno_out,
ltd1.lot_id_out,
ltd1.material_code_out,
ltd1.material_name_out,
ltd1.equip_id_out,
ltd1.pro_date_out,
ltd1.shift_id_out,
ltd1.weight_out,
ltd1.state_out,
CASE WHEN t11.MATCODE IS NOT NULL THEN t11.MATCODE ELSE t51.MATCODE END AS material_code_in,
CASE WHEN t11.MATNAME IS NOT NULL THEN t11.MATNAME ELSE t51.MATNAME END AS material_name_in,
CASE WHEN t11.SWeight IS NOT NULL THEN t11.SWeight ELSE t51.ActWT END AS weight_in,
t11.lotid AS lot_id_in,
ltin.orderno AS plan_id_in,
ltin.JDAT AS pro_date_in,
ltin.JSHT AS shift_id_in,
ltin.JMCH AS equip_id_in
FROM
(
SELECT
ORDERNO AS plan_id_out,
LOTID AS lot_id_out,
JDAT AS pro_date_out,
CAST(JSHT AS INT) AS shift_id_out,
JMCH AS equip_id_out,
ITNBR AS material_code_out,
ITDSC AS material_name_out,
'' AS unit_cost_out,
jwt AS weight_out,
'' AS cost_out,
CAST(VALUE AS INT) AS batchno_out,
CASE WHEN STATE IN (4, 5) THEN '空走' ELSE '非空走' END AS state_out
FROM
PLMES.dbo.LTD0001
CROSS APPLY STRING_SPLIT(CNUMNEW, ',')
WHERE
DIV <> 'XL'
) LTD1
LEFT JOIN (
SELECT
planid,
batchno,
MATCODE,
SUM(ActWT) AS ActWT,
MAX(MatName) AS MatName
FROM
ltd0051
GROUP BY
planid, batchno, MATCODE
) t51 ON LTD1.plan_id_out = t51.planid AND ltd1.batchno_out = t51.batchno
LEFT JOIN (
SELECT
planid,
batchno,
MATCODE,
SUM(CAST(SWeight AS DECIMAL(20, 10))) AS SWeight,
MAX(MatName) AS MatName,
lotid
FROM
ltd0011
GROUP BY
planid, batchno, MATCODE, lotid
) t11 ON LTD1.plan_id_out = t11.planid AND ltd1.batchno_out = t11.batchno AND t51.MATCODE = t11.MATCODE
LEFT JOIN ltd0001 ltin ON t11.LOTID = ltin.lotid
WHERE
LTD1.pro_date_out = '2024-05-20'
AND LTD1.shift_id_out = 1
优化查询效率
- 查询出来了,但是很慢
- 索引优化:确保在 ltd0051 和 ltd0011 表的 planid、batchno 和 MATCODE 列上有适当的索引。但是这不是我们自己的表,无法实现。
- 减少数据量:在子查询中添加过滤条件,减少需要处理的数据量。但是,我们是根据主表ltd1作为筛选条件的,无法对子表进行条件查询
- CTE (Common Table Expressions):使用 WITH 语句创建两个 CTE (t51_agg 和 t11_agg) 来存储聚合后的数据。
sql
-- 取 产出数据
WITH LTD1 AS (
SELECT
ORDERNO AS plan_id_out,
LOTID AS lot_id_out,
JDAT AS pro_date_out,
CAST ( JSHT AS INT ) AS shift_id_out,
JMCH AS equip_id_out,
ITNBR AS material_code_out,
ITDSC AS material_name_out,
'' AS unit_cost_out,
jwt AS weight_out,
'' AS cost_out,
CAST ( VALUE AS INT ) AS batchno_out,
CASE
WHEN STATE IN ( 4, 5 ) THEN
'空走' ELSE '非空走'
END AS state_out
FROM
PLMES.dbo.LTD0001 CROSS APPLY STRING_SPLIT ( CNUMNEW, ',' )
WHERE
DIV <> 'XL'
AND jdat = '2024-05-20'
AND jsht = 1
),
-- 关联得到 称重数据
t51_agg AS (
SELECT
planid,
batchno,
MATCODE,
SUM ( ActWT ) AS ActWT,
MAX ( MatName ) AS MatName
FROM
ltd0051
GROUP BY
planid,
batchno,
MATCODE
),
-- 关联得到 追溯数据
t11_agg AS (
SELECT
planid,
batchno,
MATCODE,
SUM ( CAST ( SWeight AS DECIMAL ( 20, 10 ) ) ) AS SWeight,
MAX ( MatName ) AS MatName,
lotid
FROM
ltd0011
GROUP BY
planid,
batchno,
MATCODE,
lotid
) SELECT
ltd1.plan_id_out,
ltd1.batchno_out,
ltd1.lot_id_out,
ltd1.material_code_out,
ltd1.material_name_out,
ltd1.equip_id_out,
ltd1.pro_date_out,
ltd1.shift_id_out,
ltd1.weight_out,
ltd1.state_out,
CASE
WHEN t11.MATCODE IS NOT NULL THEN
t11.MATCODE ELSE t51.MATCODE
END AS material_code_in,
CASE
WHEN t11.MATNAME IS NOT NULL THEN
t11.MATNAME ELSE t51.MATNAME
END AS material_name_in,
CASE
WHEN t11.SWeight IS NOT NULL THEN
t11.SWeight ELSE t51.ActWT
END AS weight_in,
t11.lotid AS lot_id_in,
ltin.orderno AS plan_id_in,
ltin.JDAT AS pro_date_in,
ltin.JSHT AS shift_id_in,
ltin.JMCH AS equip_id_in
FROM
LTD1
LEFT JOIN t51_agg t51 ON LTD1.plan_id_out = t51.planid
AND ltd1.batchno_out = t51.batchno
LEFT JOIN t11_agg t11 ON LTD1.plan_id_out = t11.planid
AND ltd1.batchno_out = t11.batchno
AND t51.MATCODE = t11.MATCODE
LEFT JOIN ltd0001 ltin ON t11.LOTID = ltin.lotid;