记录一次 postgresql 优化案例( volatility 自定义函数无法并行查询 )

同事最近做个金融适配项目,找我看条SQL,告知ORACLE跑1分钟,PG要跑30分钟(其实并没有这么夸张)😎, 废话不说,贴慢SQL。

慢SQL(关键信息已经加密)

复制代码
explain analyze
SELECT 
    c_qxxxxaode,
    '2023-03-22 00:00:00' AS d_cdate,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60) THEN 1 ELSE 0 END) AS bt5ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN 1 ELSE 0 END) AS bt3to5ycusts,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN 1 ELSE 0 END) AS bt3ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12) THEN 1 ELSE 0 END) AS bt1yto3ycusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6) THEN 1 ELSE 0 END) AS bt6to12mcusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3) THEN 1 ELSE 0 END) AS bt3to6mcusts,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3) THEN 1 ELSE 0 END) AS btlose3mcusts,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60) THEN f_qwwvddvvzz ELSE 0 END) AS bt5yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN f_qwwvddvvzz ELSE 0 END) AS bt3to5yshares,
    SUM(CASE WHEN l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36) THEN f_qwwvddvvzz ELSE 0 END) AS bt3yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12) THEN f_qwwvddvvzz ELSE 0 END) AS bt1to3yshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6) THEN f_qwwvddvvzz ELSE 0 END) AS bt6to12mshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3) THEN f_qwwvddvvzz ELSE 0 END) AS bt3to6mshares,
    SUM(CASE WHEN l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3) THEN f_qwwvddvvzz ELSE 0 END) AS btlose3mshares,
    round(AVG(months_between('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
    round(AVG(nvl(f_qwwvddvvzz, 0)), 2) AS avgshares,
    COUNT(a_daccoxxz) AS custsum,
    SUM(f_qwwvddvvzz) AS sharessum
FROM 
    stcdlbxxxxx l
WHERE 
    nvl(l.f_qwwvddvvzz, 0) > 0 AND 
    l.f_qdqdqdq <= '2023-03-22 00:00:00' AND 
    l.a_daccoxxz <> '996000000000' AND 
    c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
GROUP BY 
    c_qxxxxaode;

执行计划:

复制代码
HashAggregate  (cost=1043326.49..1043332.91 rows=151 width=424) (actual time=381246.429..381246.640 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Hash Semi Join  (cost=8.78..936347.95 rows=301348 width=38) (actual time=0.057..30237.230 rows=30056793 loops=1)
        Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
        ->  Seq Scan on stcdlbxxxxx l  (cost=0.00..906618.70 rows=10044941 width=38) (actual time=0.008..25908.814 rows=30157190 loops=1)
"              Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
              Rows Removed by Filter: 4842810
        ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.046..0.047 rows=150 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 14kB
              ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.021..0.037 rows=150 loops=1)
                    Index Cond: (c_raisetype = '1'::text)
                    Heap Fetches: 0
Planning Time: 0.512 ms
Execution Time: 381246.699 ms
复制代码
select count(1) from stcdlbxxxxx;
  count   
----------
 35000000
(1 row)

stcdlbxxxxx 表数据量 3500W,数据量不算大,这条SQL主要含义应该是对 fact 表做统计,一堆聚合函数,fact 表是数仓概念,如果不明白可以去百度。

这条SQL主要慢在HashAggregate这个节点上,Hash Semi Join 花了 30237.230毫秒(30秒),然后到上面 Group Key: l.c_qxxxxaode 分个组以后,HashAggregate 直接飙到 381246.640毫秒(6.3分钟)。

其实这条SQL最理想的状态是走 HashAggregate + parallel 的计划,但是优化器并没有这样做,我怀疑可能是SQL写法导致优化器没有走并行,没多想,直接改写了一版的SQL让同事去试试。****

改写版本SQL:(PG独占的语法,聚合函数新增 FILTER 属性,代替 case when 写法)

复制代码
explain analyze
SELECT  c_qxxxxaode,
        '2023-03-22 00:00:00' AS d_cdate,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1yto3ycusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mcusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mcusts,
        COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mcusts,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1to3yshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mshares,
        SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mshares,
        ROUND(AVG(months_between('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
        ROUND(AVG(NVL(f_qwwvddvvzz, 0)), 2) AS avgshares,
        COUNT(a_daccoxxz) AS custsum,
        SUM(f_qwwvddvvzz) AS sharessum
FROM stcdlbxxxxx l
WHERE NVL(l.f_qwwvddvvzz, 0) > 0
  AND l.f_qdqdqdq <= '2023-03-22 00:00:00'
  AND l.a_daccoxxz <> '996000000000'
  AND c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
GROUP BY c_qxxxxaode;

改写后执行计划:

复制代码
HashAggregate  (cost=1043326.49..1043332.91 rows=151 width=424) (actual time=380246.621..380246.849 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Hash Semi Join  (cost=8.78..936347.95 rows=301348 width=38) (actual time=0.055..29983.463 rows=30056793 loops=1)
        Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
        ->  Seq Scan on stcdlbxxxxx l  (cost=0.00..906618.70 rows=10044941 width=38) (actual time=0.008..25415.490 rows=30157190 loops=1)
"              Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
              Rows Removed by Filter: 4842810
        ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.043..0.044 rows=150 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 14kB
              ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.018..0.035 rows=150 loops=1)
                    Index Cond: (c_raisetype = '1'::text)
                    Heap Fetches: 0
Planning Time: 0.533 ms
Execution Time: 380246.909 ms

当时我看到这个改写后执行计划就无语了😶,和原SQL的计划一毛一样,直接把我整懵逼,怀疑是否现场环境没开并行参数,但是被同事告知是已经设置了并行参数。

复制代码
qtbg=> show max_parallel_workers_per_gather 
qtbg-> ;
 max_parallel_workers_per_gather 
---------------------------------
 16
(1 row)

了解到情况以后开始研究SQL,发现使用了两个自定义函数,add_months,months_between(我们研发写的函数、为了兼容ORACLE),然后看了下这两个函数的信息。

复制代码
qtbg=> \df+ add_months
                                                                                                                       List of functions
 Schema |    Name    | Result data type |        Argument data types        | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |                                   Source code                                    | Description 
--------+------------+------------------+-----------------------------------+------+------------+----------+--------+----------+-------------------+----------+----------------------------------------------------------------------------------+-------------
 sys    | add_months | pg_catalog.date  | pg_catalog.date, boolean          | func | immutable  | safe     | system | invoker  |                   | c        | add_months_bool                                                                  | 
 sys    | add_months | pg_catalog.date  | pg_catalog.date, integer          | func | immutable  | safe     | system | invoker  |                   | c        | add_months                                                                       | 
 sys    | add_months | date             | timestamp with time zone, integer | func | immutable  | safe     | system | invoker  |                   | sql      |  select (add_months($1::pg_catalog.date, $2)  + $1::pg_catalog.time)::sys.date;  | 
(3 rows)

qtbg=> \df+ months_between
                                                                                                                                                           List of functions
 Schema |      Name      | Result data type |                Argument data types                 | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |                                                             Source code                                                             | Descri
ption 
--------+----------------+------------------+----------------------------------------------------+------+------------+----------+--------+----------+-------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------+-------
------
 sys    | months_between | double precision | date, date                                         | func | volatile   | unsafe   | system | invoker  |                   | plsql    |                                                                                                                                    +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   begin                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    case when                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (last_day($1) = $1 and last_day($2) = $2)                                                                                     +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     or                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1) = extract(day from $2))                                                                                 +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    then                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float);                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    else                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float +                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1)::int - extract(day from $2)::int)/31::float +                                                           +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $1)::int * 3600 + extract(minutes from $1)::int * 60 + extract(seconds from $1)::int)/(3600*24*31)::float -+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $2)::int * 3600 + extract(minutes from $2)::int * 60 + extract(seconds from $2)::int)/(3600*24*31)::float);+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    end case;                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   end;                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |                                                                                                                                     | 
 sys    | months_between | double precision | timestamp with time zone, timestamp with time zone | func | volatile   | unsafe   | system | invoker  |                   | plsql    |                                                                                                                                    +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   begin                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    case when                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (last_day($1) = $1 and last_day($2) = $2)                                                                                     +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     or                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1) = extract(day from $2))                                                                                 +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    then                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float);                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    else                                                                                                                            +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |     return (select (extract(years from $1)::int * 12 - extract(years from $2)::int * 12)::float +                                  +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(month from $1)::int - extract(month from $2)::int)::float +                                                          +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(day from $1)::int - extract(day from $2)::int)/31::float +                                                           +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $1)::int * 3600 + extract(minutes from $1)::int * 60 + extract(seconds from $1)::int)/(3600*24*31)::float -+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |      (extract(hour from $2)::int * 3600 + extract(minutes from $2)::int * 60 + extract(seconds from $2)::int)/(3600*24*31)::float);+| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |    end case;                                                                                                                       +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |   end;                                                                                                                             +| 
        |                |                  |                                                    |      |            |          |        |          |                   |          |                                                                                                                                     | 
(2 rows)

当时就给我整崩溃了,研发真的太坑啦😂😂😂。

months_between 函数居然还是 volatile(不稳定)状态,我一直以为研发写好的ORACLE 兼容函数都是 immutable 或者 是 stable 状态,所以刚开始也没往这方面想。

PG的自定义函数有三种状态,volatile、immutable 、stable 。

volatile 是不可以走并行的,这三种状态具体代表的是什么含义,如果不清楚的同学可以下去百度,这里就不废话了,我也怕解释不清楚。😁

和客户沟通以后,我写了一个 months_between1 函数代替原来的 months_between函数,返回结果小数点的精度可能和ORACLE有点区别,但是客户表示可以接受。

months_between1 函数:

复制代码
CREATE OR REPLACE FUNCTION months_between1(date1 DATE, date2 DATE) 
RETURNS FLOAT AS $$
DECLARE
    years_diff INT;
    months_diff INT;
    days_diff FLOAT;
BEGIN
    years_diff := EXTRACT(YEAR FROM date1) - EXTRACT(YEAR FROM date2);
    months_diff := EXTRACT(MONTH FROM date1) - EXTRACT(MONTH FROM date2);
    days_diff := EXTRACT(DAY FROM date1) - EXTRACT(DAY FROM date2);
    days_diff := days_diff / 30.0;  
    RETURN (years_diff * 12) + months_diff + days_diff;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

months_between1 设置可以并行:

复制代码
ALTER FUNCTION months_between1(date, date) PARALLEL SAFE;

months_between1 函数信息:

复制代码
\df+ months_between1                                           
                                                                                                                List of functions
 Schema |      Name       | Result data type |  Argument data types   | Type | Volatility | Parallel | Owner | Security | Access privileges | Language |                                Source code                                | Description 
--------+-----------------+------------------+------------------------+------+------------+----------+-------+----------+-------------------+----------+---------------------------------------------------------------------------+-------------
 scott  | months_between1 | double precision | date1 date, date2 date | func | immutable  | safe     | qtbg  | invoker  |                   | plpgsql  |                                                                          +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | DECLARE                                                                  +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     years_diff INT;                                                      +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     months_diff INT;                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff FLOAT;                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | BEGIN                                                                    +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     years_diff := EXTRACT(YEAR FROM date1) - EXTRACT(YEAR FROM date2);   +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     months_diff := EXTRACT(MONTH FROM date1) - EXTRACT(MONTH FROM date2);+| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff := EXTRACT(DAY FROM date1) - EXTRACT(DAY FROM date2);      +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     days_diff := days_diff / 30.0;                                       +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |     RETURN (years_diff * 12) + months_diff + days_diff;                  +| 
        |                 |                  |                        |      |            |          |       |          |                   |          | END;                                                                     +| 
        |                 |                  |                        |      |            |          |       |          |                   |          |                                                                           | 
(1 row)

最后执行SQL测试效率:

复制代码
 1 explain analyze
 2 SELECT  c_qxxxxaode,
 3         '2023-03-22 00:00:00' AS d_cdate,
 4         COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5ycusts,
 5         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5ycusts,
 6         COUNT(*) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3ycusts,
 7         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1yto3ycusts,
 8         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mcusts,
 9         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mcusts,
10         COUNT(*) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mcusts,
11         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -60)) AS bt5yshares,
12         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -60) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3to5yshares,
13         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -36)) AS bt3yshares,
14         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -36) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -12)) AS bt1to3yshares,
15         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -12) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -6)) AS bt6to12mshares,
16         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -6) AND l.f_qdqdqdq <= add_months('2023-03-22 00:00:00', -3)) AS bt3to6mshares,
17         SUM(f_qwwvddvvzz) FILTER (WHERE l.f_qdqdqdq > add_months('2023-03-22 00:00:00', -3)) AS btlose3mshares,
18         ROUND(AVG(months_between1('2023-03-22 00:00:00', l.f_qdqdqdq)), 2) AS avgmonth,
19         ROUND(AVG(NVL(f_qwwvddvvzz, 0)), 2) AS avgshares,
20         COUNT(a_daccoxxz) AS custsum,
21         SUM(f_qwwvddvvzz) AS sharessum
22 FROM stcdlbxxxxx l
23 WHERE NVL(l.f_qwwvddvvzz, 0) > 0
24   AND l.f_qdqdqdq <= '2023-03-22 00:00:00'
25   AND l.a_daccoxxz <> '996000000000'
26   AND c_qxxxxaode IN (SELECT c_qxxxxaode FROM  xdsfffs WHERE c_raisetype = '1')
27 GROUP BY c_qxxxxaode;

执行计划:

复制代码
Finalize GroupAggregate  (cost=406085.34..424225.64 rows=151 width=424) (actual time=9533.754..16818.165 rows=150 loops=1)
  Group Key: l.c_qxxxxaode
  ->  Gather Merge  (cost=406085.34..424155.80 rows=906 width=392) (actual time=9484.778..16812.958 rows=1050 loops=1)
        Workers Planned: 6
        Workers Launched: 6
        ->  Partial GroupAggregate  (cost=405085.25..423045.59 rows=151 width=392) (actual time=9383.774..16482.619 rows=150 loops=7)
              Group Key: l.c_qxxxxaode
              ->  Sort  (cost=405085.25..405210.81 rows=50225 width=38) (actual time=9358.407..10328.135 rows=4293828 loops=7)
                    Sort Key: l.c_qxxxxaode
                    Sort Method: quicksort  Memory: 461463kB
                    Worker 0:  Sort Method: quicksort  Memory: 452754kB
                    Worker 1:  Sort Method: quicksort  Memory: 452822kB
                    Worker 2:  Sort Method: quicksort  Memory: 451891kB
                    Worker 3:  Sort Method: quicksort  Memory: 448327kB
                    Worker 4:  Sort Method: quicksort  Memory: 454387kB
                    Worker 5:  Sort Method: quicksort  Memory: 453623kB
                    ->  Hash Semi Join  (cost=8.78..401163.65 rows=50225 width=38) (actual time=0.219..4369.485 rows=4293828 loops=7)
                          Hash Cond: (l.c_qxxxxaode = tfundinfo.c_qxxxxaode)
                          ->  Parallel Seq Scan on stcdlbxxxxx l  (cost=0.00..396201.45 rows=1674157 width=38) (actual time=0.017..3948.943 rows=4308170 loops=7)
"                                Filter: ((NVL(f_qwwvddvvzz, '0'::numeric) > '0'::numeric) AND (f_qdqdqdq <= '2023-03-22 00:00:00'::timestamp without time zone) AND (a_daccoxxz <> '996000000000'::text))"
                                Rows Removed by Filter: 691830
                          ->  Hash  (cost=6.91..6.91 rows=150 width=8) (actual time=0.164..0.173 rows=150 loops=7)
                                Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                ->  Index Only Scan using idx_tfundinfo_fundcode on  xdsfffs  (cost=0.28..6.91 rows=150 width=8) (actual time=0.072..0.151 rows=150 loops=7)
                                      Index Cond: (c_raisetype = '1'::text)
                                      Heap Fetches: 0
Planning Time: 0.588 ms
Execution Time: 16851.196 ms

最后可以看到原来 6分钟的SQL,现在已经 16.8秒就能跑出结果,研发真的太坑了。🤣🤣🤣🤣

其实这条SQL完全不用改成 (聚合函数新增 FILTER 属性)这种写法,原来的SQL只要把 months_between 函数替换成 months_between1,一样也可以走并行。

但是为了更好看,还是把这种新的写法提交给了客户。😁😁 😁 😁 😁

相关推荐
Goona_6 小时前
拒绝SQL恐惧:用Python+pyqt打造任意Excel数据库查询系统
数据库·python·sql·excel·pyqt
rufeii9 小时前
[极客大挑战 2019]FinalSQL--布尔盲注
sql
技术卷14 小时前
详解力扣高频SQL50题之1084. 销售分析 III【简单】
sql·leetcode·oracle
NPE~16 小时前
基于MySQL实现基础图数据库
数据库·sql·mysql·教程·图数据库·图结构
技术卷16 小时前
详解力扣高频SQL50题之550. 游戏玩法分析 IV【中等】
sql·mysql·leetcode·oracle
样子201817 小时前
Sql注入 之sqlmap使用教程
数据库·sql
技术卷1 天前
详解力扣高频 SQL 50 题之584. 寻找用户推荐人【入门】
sql·leetcode·oracle
ALLSectorSorft1 天前
教务管理系统学排课教务系统模块设计
数据库·sql·oracle
笑衬人心。1 天前
后端项目中大量 SQL 执行的性能优化
sql·spring·性能优化
····懂···2 天前
关于PGCE专家技术认证解决方案
数据库·postgresql