记录一次 postgresql 优化案例( volatility 自定义函数无法并行查询 )

同事最近做个金融适配项目,找我看条SQL,告知ORACLE跑1分钟,PG要跑30分钟(其实并没有这么夸张)😎, 废话不说,贴慢SQL。

慢SQL(关键信息已经加密)

复制代码
explain analyze
SELECT 
    c_qxxxxaode,
    '2023-03-22 00:00:00' AS d_cdate,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60) THEN 1 ELSE 0 END) AS bt5ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN 1 ELSE 0 END) AS bt3to5ycusts,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN 1 ELSE 0 END) AS bt3ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12) THEN 1 ELSE 0 END) AS bt1yto3ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6) THEN 1 ELSE 0 END) AS bt6to12mcusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3) THEN 1 ELSE 0 END) AS bt3to6mcusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3) THEN 1 ELSE 0 END) AS btlose3mcusts,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60) THEN f_qwwvddvvzz ELSE 0 END) AS bt5yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN f_qwwvddvvzz ELSE 0 END) AS bt3to5yshares,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN f_qwwvddvvzz ELSE 0 END) AS bt3yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12) THEN f_qwwvddvvzz ELSE 0 END) AS bt1to3yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6) THEN f_qwwvddvvzz ELSE 0 END) AS bt6to12mshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3) THEN f_qwwvddvvzz ELSE 0 END) AS bt3to6mshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3) THEN f_qwwvddvvzz ELSE 0 END) AS btlose3mshares,
    round(AVG(months_between('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
    round(AVG(nvl(f_qwwvddvvzz, 0)), 2) AS avgshares,
    COUNT(a_daccoxxz) AS custsum,
    SUM(f_qwwvddvvzz) AS sharessum
FROM 
    stcdlbxxxxx l
WHERE 
    nvl(l.f_qwwvddvvzz, 0) > 0 AND 
    l.f_qdqdqdq <= '2023-03-22 00:00:00' AND 
    l.a_daccoxxz <> '996000000000' AND 
    c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
GROUP BY 
    c_qxxxxaode;

执行计划:

复制代码
HashAggregate  (cost=1043326.49..1043332.91 rows=151 width=424) (actual time=381246.429..381246.640 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Hash Semi Join  (cost=8.78..936347.95 rows=301348 width=38) (actual time=0.057..30237.230 rows=30056793 loops=1)
        Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
        ->  Seq Scan on stcdlbxxxxx l  (cost=0.00..906618.70 rows=10044941 width=38) (actual time=0.008..25908.814 rows=30157190 loops=1)
"              Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
              Rows Removed by Filter: 4842810
        ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.046..0.047 rows=150 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 14kB
              ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.021..0.037 rows=150 loops=1)
                    Index Cond: (c_raisetype = '1'::text)
                    Heap Fetches: 0
Planning Time: 0.512 ms
Execution Time: 381246.699 ms
复制代码
select count(1) from stcdlbxxxxx;
  count   
----------
 35000000
(1 row)

stcdlbxxxxx 表数据量 3500W,数据量不算大,这条SQL主要含义应该是对 fact 表做统计,一堆聚合函数,fact 表是数仓概念,如果不明白可以去百度。

这条SQL主要慢在HashAggregate这个节点上,Hash Semi Join 花了 30237.230毫秒(30秒),然后到上面 Group Key: l.c_qxxxxaode 分个组以后,HashAggregate 直接飙到 381246.640毫秒(6.3分钟)。

其实这条SQL最理想的状态是走 HashAggregate + parallel 的计划,但是优化器并没有这样做,我怀疑可能是SQL写法导致优化器没有走并行,没多想,直接改写了一版的SQL让同事去试试。****

改写版本SQL:(PG独占的语法,聚合函数新增 FILTER 属性,代替 case when 写法)

复制代码
explain analyze
SELECT  c_qxxxxaode,
        '2023-03-22 00:00:00' AS d_cdate,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1yto3ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mcusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mcusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mcusts,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1to3yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mshares,
        ROUND(AVG(months_between('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
        ROUND(AVG(NVL(f_qwwvddvvzz, 0)), 2) AS avgshares,
        COUNT(a_daccoxxz) AS custsum,
        SUM(f_qwwvddvvzz) AS sharessum
FROM stcdlbxxxxx l
WHERE NVL(l.f_qwwvddvvzz, 0) > 0
  AND l.f_qdqdqdq <= '2023-03-22 00:00:00'
  AND l.a_daccoxxz <> '996000000000'
  AND c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
GROUP BY c_qxxxxaode;

改写后执行计划:

复制代码
HashAggregate  (cost=1043326.49..1043332.91 rows=151 width=424) (actual time=380246.621..380246.849 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Hash Semi Join  (cost=8.78..936347.95 rows=301348 width=38) (actual time=0.055..29983.463 rows=30056793 loops=1)
        Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
        ->  Seq Scan on stcdlbxxxxx l  (cost=0.00..906618.70 rows=10044941 width=38) (actual time=0.008..25415.490 rows=30157190 loops=1)
"              Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
              Rows Removed by Filter: 4842810
        ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.043..0.044 rows=150 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 14kB
              ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.018..0.035 rows=150 loops=1)
                    Index Cond: (c_raisetype = '1'::text)
                    Heap Fetches: 0
Planning Time: 0.533 ms
Execution Time: 380246.909 ms

当时我看到这个改写后执行计划就无语了😶,和原SQL的计划一毛一样,直接把我整懵逼,怀疑是否现场环境没开并行参数,但是被同事告知是已经设置了并行参数。

复制代码
qtbg=> show max_parallel_workers_per_gather 
qtbg-> ;
 max_parallel_workers_per_gather 
---------------------------------
 16
(1 row)

了解到情况以后开始研究SQL,发现使用了两个自定义函数,add_months,months_between(我们研发写的函数、为了兼容ORACLE),然后看了下这两个函数的信息。

复制代码
qtbg=> \df+ add_months
                                                                                                                       List of functions
 Schema |    Name    | Result data type |        Argument data types        | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |                                   Source code                                    | Description 
--------+------------+------------------+-----------------------------------+------+------------+----------+--------+----------+-------------------+----------+----------------------------------------------------------------------------------+-------------
 sys    | add_months | pg_catalog.date  | pg_catalog.date, boolean          | func | immutable  | safe     | system | invoker  |                   | c        | add_months_bool                                                                  | 
 sys    | add_months | pg_catalog.date  | pg_catalog.date, integer          | func | immutable  | safe     | system | invoker  |                   | c        | add_months                                                                       | 
 sys    | add_months | date             | timestamp with time zone, integer | func | immutable  | safe     | system | invoker  |                   | sql      |  select (add_months($1::pg_catalog.date, $2)  + $1::pg_catalog.time)::sys.date;  | 
(3 rows)

qtbg=> \df+ months_between
                                                                                                                                                           List of functions
 Schema |      Name      | Result data type |                Argument data types                 | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |                                                             Source code                                                             | Descri
ption 
--------+----------------+------------------+----------------------------------------------------+------+------------+----------+--------+----------+-------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------+-------
------
 sys    | months_between | double precision | date, date                                         | func | volatile   | unsafe   | system | invoker  |                   | plsql    |                                                                                                                                    +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   begin                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    case when                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (last_day($1) = $1 and last_day($2) = $2)                                                                                     +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     or                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1) = extract(day from $2))                                                                                 +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    then                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float);                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    else                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float +                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1)::int - extract(day from $2)::int)/31::float +                                                           +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $1)::int * 3600 + extract(minutes from $1)::int * 60 + extract(seconds from $1)::int)/(3600*24*31)::float -+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $2)::int * 3600 + extract(minutes from $2)::int * 60 + extract(seconds from $2)::int)/(3600*24*31)::float);+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    end case;                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   end;                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |                                                                                                                                     | 
 sys    | months_between | double precision | timestamp with time zone, timestamp with time zone | func | volatile   | unsafe   | system | invoker  |                   | plsql    |                                                                                                                                    +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   begin                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    case when                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (last_day($1) = $1 and last_day($2) = $2)                                                                                     +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     or                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1) = extract(day from $2))                                                                                 +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    then                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float);                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    else                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float +                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1)::int - extract(day from $2)::int)/31::float +                                                           +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $1)::int * 3600 + extract(minutes from $1)::int * 60 + extract(seconds from $1)::int)/(3600*24*31)::float -+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $2)::int * 3600 + extract(minutes from $2)::int * 60 + extract(seconds from $2)::int)/(3600*24*31)::float);+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    end case;                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   end;                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |                                                                                                                                     | 
(2 rows)

当时就给我整崩溃了,研发真的太坑啦😂😂😂。

months_between 函数居然还是 volatile(不稳定)状态,我一直以为研发写好的ORACLE 兼容函数都是 immutable 或者 是 stable 状态,所以刚开始也没往这方面想。

PG的自定义函数有三种状态,volatile、immutable 、stable 。

volatile 是不可以走并行的,这三种状态具体代表的是什么含义,如果不清楚的同学可以下去百度,这里就不废话了,我也怕解释不清楚。😁

和客户沟通以后,我写了一个 months_between1 函数代替原来的 months_between函数,返回结果小数点的精度可能和ORACLE有点区别,但是客户表示可以接受。

months_between1 函数:

复制代码
CREATE OR REPLACE FUNCTION months_between1(date1 DATE, date2 DATE) 
RETURNS FLOAT AS $$
DECLARE
    years_diff INT;
    months_diff INT;
    days_diff FLOAT;
BEGIN
    years_diff := EXTRACT(YEAR FROM date1) - EXTRACT(YEAR FROM date2);
    months_diff := EXTRACT(MONTH FROM date1) - EXTRACT(MONTH FROM date2);
    days_diff := EXTRACT(DAY FROM date1) - EXTRACT(DAY FROM date2);
    days_diff := days_diff / 30.0;  
    RETURN (years_diff * 12) + months_diff + days_diff;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

months_between1 设置可以并行:

复制代码
ALTER FUNCTION months_between1(date, date) PARALLEL SAFE;

months_between1 函数信息:

复制代码
\df+ months_between1                                           
                                                                                                                List of functions
 Schema |      Name       | Result data type |  Argument data types   | Type | Volatility | Parallel | Owner | Security | Access privileges | Language |                                Source code                                | Description 
--------+-----------------+------------------+------------------------+------+------------+----------+-------+----------+-------------------+----------+---------------------------------------------------------------------------+-------------
 scott  | months_between1 | double precision | date1 date, date2 date | func | immutable  | safe     | qtbg  | invoker  |                   | plpgsql  |                                                                          +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | DECLARE                                                                  +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     years_diff INT;                                                      +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     months_diff INT;                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff FLOAT;                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | BEGIN                                                                    +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     years_diff := EXTRACT(YEAR FROM date1) - EXTRACT(YEAR FROM date2);   +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     months_diff := EXTRACT(MONTH FROM date1) - EXTRACT(MONTH FROM date2);+| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff := EXTRACT(DAY FROM date1) - EXTRACT(DAY FROM date2);      +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff := days_diff / 30.0;                                       +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     RETURN (years_diff * 12) + months_diff + days_diff;                  +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | END;                                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |                                                                           | 
(1 row)

最后执行SQL测试效率:

复制代码
 1 explain analyze
 2 SELECT  c_qxxxxaode,
 3         '2023-03-22 00:00:00' AS d_cdate,
 4         COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5ycusts,
 5         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5ycusts,
 6         COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3ycusts,
 7         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1yto3ycusts,
 8         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mcusts,
 9         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mcusts,
10         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mcusts,
11         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5yshares,
12         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5yshares,
13         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3yshares,
14         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1to3yshares,
15         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mshares,
16         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mshares,
17         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mshares,
18         ROUND(AVG(months_between1('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
19         ROUND(AVG(NVL(f_qwwvddvvzz, 0)), 2) AS avgshares,
20         COUNT(a_daccoxxz) AS custsum,
21         SUM(f_qwwvddvvzz) AS sharessum
22 FROM stcdlbxxxxx l
23 WHERE NVL(l.f_qwwvddvvzz, 0) > 0
24   AND l.f_qdqdqdq <= '2023-03-22 00:00:00'
25   AND l.a_daccoxxz <> '996000000000'
26   AND c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
27 GROUP BY c_qxxxxaode;

执行计划:

复制代码
Finalize GroupAggregate  (cost=406085.34..424225.64 rows=151 width=424) (actual time=9533.754..16818.165 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Gather Merge  (cost=406085.34..424155.80 rows=906 width=392) (actual time=9484.778..16812.958 rows=1050 loops=1)
        Workers Planned: 6
        Workers Launched: 6
        ->  Partial GroupAggregate  (cost=405085.25..423045.59 rows=151 width=392) (actual time=9383.774..16482.619 rows=150 loops=7)
              Group Key: l.c_qxxxxaode
              ->  Sort  (cost=405085.25..405210.81 rows=50225 width=38) (actual time=9358.407..10328.135 rows=4293828 loops=7)
                    Sort Key: l.c_qxxxxaode
                    Sort Method: quicksort  Memory: 461463kB
                    Worker 0:  Sort Method: quicksort  Memory: 452754kB
                    Worker 1:  Sort Method: quicksort  Memory: 452822kB
                    Worker 2:  Sort Method: quicksort  Memory: 451891kB
                    Worker 3:  Sort Method: quicksort  Memory: 448327kB
                    Worker 4:  Sort Method: quicksort  Memory: 454387kB
                    Worker 5:  Sort Method: quicksort  Memory: 453623kB
                    ->  Hash Semi Join  (cost=8.78..401163.65 rows=50225 width=38) (actual time=0.219..4369.485 rows=4293828 loops=7)
                          Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
                          ->  Parallel Seq Scan on stcdlbxxxxx l  (cost=0.00..396201.45 rows=1674157 width=38) (actual time=0.017..3948.943 rows=4308170 loops=7)
"                                Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
                                Rows Removed by Filter: 691830
                          ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.164..0.173 rows=150 loops=7)
                                Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.072..0.151 rows=150 loops=7)
                                      Index Cond: (c_raisetype = '1'::text)
                                      Heap Fetches: 0
Planning Time: 0.588 ms
Execution Time: 16851.196 ms

最后可以看到原来 6分钟的SQL,现在已经 16.8秒就能跑出结果,研发真的太坑了。🤣🤣🤣🤣

其实这条SQL完全不用改成 (聚合函数新增 FILTER 属性)这种写法,原来的SQL只要把 months_between 函数替换成 months_between1,一样也可以走并行。

但是为了更好看,还是把这种新的写法提交给了客户。😁😁 😁 😁 😁

相关推荐
别来无恙1496 小时前
如何用 SQL 找到最受欢迎的用户?
数据库·sql·mysql
Cyanto6 小时前
Mybatis简单架构之静态SQL语句
sql·架构·mybatis
JKIT沐枫7 小时前
VB.NET与SQL连接问题解决方案
数据库·sql·.net
Mike117.11 小时前
Oracle查看SQL执行计划的方法
数据库·sql·oracle
Coding小公仔12 小时前
GROUP BY SQL
数据库·sql
苏坡爱豆的笑容都没你的甜13 小时前
SQL Driver
数据库·sql·microsoft
caihuayuan414 小时前
人工智能与教育科技:2025年个性化学习的新模式
java·大数据·sql·spring·课程设计
Elastic 中国社区官方博客17 小时前
我们是如何为 ES|QL 重建自动补全功能的
大数据·数据库·sql·elasticsearch·搜索引擎·全文检索
ArabySide19 小时前
【SQL server】 SQL子查询:与连接的区别、类型划分、相关与非相关子查询对比
数据库·sql·sqlserver
在未来等你1 天前
高级SQL技巧:PostgreSQL高级特性实战指南
sql·postgresql·性能优化·窗口函数·递归查询