【GaussDB】数据静止状态下同一个SQL或同一个存储过程执行第6次报错的问题排查

【GaussDB】数据静止状态下同一个SQL或同一个存储过程执行第6次报错的问题排查

背景

最近有客户遇到了一个问题,原始反馈是

  • 客户:"我在GaussDB里调用这个sp报错,相同sp在ORACLE里调用不报错"
  • 我:"报什么错?"
  • 客户:"invalid input syntax for type numeric: "xxx""

然后我连接上这个报错的GaussDB数据库,执行客户提供的调用sp的例子,却没有报错。

数据库版本:506.0.0.SPC0100

定位过程

  • 我到客户工位上,已经开好的连接上执行,的确报错
  • 我还是在客户工位上,客户端新开了一个连接,执行不报错
  • 检查sp,里面没有使用到任何动态的东西,就是一些局部变量的赋值和判断,连表都没有查
  • 难道是sp局部变量有cache没清理?但按sp逻辑来看,就算没清理,也不应该会报这个错。
  • 然后客户提供一个重要信息:"前面执行也有没报错的时候"
  • 我意识到,难道是同一个会话循环调用n次后报错?
  • 然后我写了个匿名块循环执行,第6次报错了
  • 新开一个窗口再测,同样是第6次就报错了。

为了快速精确定位报错的SQL,我在mogeaver里单步调试这个循环6次调用存储过程的匿名块,前5次step over,第6次step into,跟踪到了一个select case when into语句执行后跳到了exception,其中的else 分支里有to_number(字符串变量),这个字符串变量是可能不能转换成数字的,但所有不能转换成数字的情况,会被前面的case捕获掉,因此真实业务情况下,这个to_number一定不会报错的。

复现用例

下面是简单的复现用例:

sql 复制代码
declare
v1 varchar2(20);
v2 number;
begin
    v1:='a';
    for i in 1..6 loop
        select case when v1='a' then 1 else to_number(v1) end from dual into v2;
    raise notice '%',v2;
    end loop;
exception when others then 
    raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
end;
sql 复制代码
gaussdb=# declare
gaussdb-# v1 varchar2(20);
gaussdb-# v2 number;
gaussdb-# begin
gaussdb$#     v1:='a';
gaussdb$#     for i in 1..6 loop
gaussdb$#         select case when v1='a' then 1 else to_number(v1) end from dual into v2;
gaussdb$#     raise notice '%',v2;
gaussdb$#     end loop;
gaussdb$# exception when others then
gaussdb$#     raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
gaussdb$# end;
gaussdb$# /
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  v1:a,sqlerrm:invalid input syntax for type numeric: "a"
ANONYMOUS BLOCK EXECUTE
gaussdb=#

原因分析

在PG/OG系数据库中,6是一个必须要记住的数字,同一个SQL从cplan变到gplan的次数正是6次。

加上强制gplan的hint,第一次就会报错

sql 复制代码
gaussdb=# declare
gaussdb-# v1 varchar2(20);
gaussdb-# v2 number;
gaussdb-# begin
gaussdb$#     v1:='a';
gaussdb$#     for i in 1..6 loop
gaussdb$#         select /*+use_gplan*/ case when v1='a' then 1 else to_number(v1) end from sys_dummy into v2;
gaussdb$#     raise notice '%',v2;
gaussdb$#     end loop;
gaussdb$# exception when others then
gaussdb$#     raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
gaussdb$# end;
gaussdb$# /
NOTICE:  v1:a,sqlerrm:invalid input syntax for type numeric: "a"
ANONYMOUS BLOCK EXECUTE
gaussdb=#

加上强制cplan的hint,则一直都不报错

sql 复制代码
gaussdb=# declare
gaussdb-# v1 varchar2(20);
gaussdb-# v2 number;
gaussdb-# begin
gaussdb$#     v1:='a';
gaussdb$#     for i in 1..6 loop
gaussdb$#         select /*+use_cplan*/ case when v1='a' then 1 else to_number(v1) end from sys_dummy into v2;
gaussdb$#     raise notice '%',v2;
gaussdb$#     end loop;
gaussdb$# exception when others then
gaussdb$#     raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
gaussdb$# end;
gaussdb$# /
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  1
NOTICE:  1
ANONYMOUS BLOCK EXECUTE
gaussdb=#

对于同一个SQL,前5次会根据实际的参数值去生成计划执行(cplan),到第6次(gplan)就不管参数值是什么了,因此对于case when所有的分支都会去判断,此时对于实际不会执行的分支to_number(v1)也执行了计算,因此产生了报错。

使用不报错的参数,跟踪第6次的执行计划和前5次的区别

sql 复制代码
gaussdb=# set auto_explain_level=notice;
SET
gaussdb=# set enable_auto_explain=on;
SET
gaussdb=#
gaussdb=# declare
gaussdb-# v1 varchar2(20);
gaussdb-# v2 number;
gaussdb-# begin
gaussdb$#     v1:='1';
gaussdb$#     for i in 1..6 loop
gaussdb$#         select case when v1='1' then 1 else to_number(v1) end from sys_dummy into v2;
gaussdb$#     raise notice '%',v2;
gaussdb$#     end loop;
gaussdb$# exception when others then
gaussdb$#     raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
gaussdb$# end;
gaussdb$# /
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.02 rows=1 width=0)
  Output: 1::numeric, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.02 rows=1 width=0)
  Output: 1::numeric, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.02 rows=1 width=0)
  Output: 1::numeric, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.02 rows=1 width=0)
  Output: 1::numeric, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.02 rows=1 width=0)
  Output: 1::numeric, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text: select case when v1='1' then 1 else to_number(v1) end from sys_dummy
Name: primary
Subquery Scan on sys_dummy  (cost=0.00..0.03 rows=1 width=0)
  Output: CASE WHEN (($10 <param value: 1 type: varchar>)::text = '1'::text) THEN 1::numeric ELSE numeric_in(textout(($10 <param value: 1 type: varchar>)::text), 0::oid, (-1)) END, (Expression Flatten Optimized)
  ->  Result  (cost=0.00..0.01 rows=1 width=0)
        Output: NULL::text, (Expression Flatten Optimized)


CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:
----------------------------NestLevel:0----------------------------
duration: 0.000 s

CONTEXT:  SQL statement "select case when v1='1' then 1 else to_number(v1) end from sys_dummy"
PL/pgSQL function inline_code_block line 7 at SQL statement
NOTICE:  1
ANONYMOUS BLOCK EXECUTE
gaussdb=#

从前5次的执行计划中可以发现,output已经把结果值算出来了,投影列是1::numeric,但第6次的投影列则是带参数的case when表达式CASE WHEN (($10 <param value: 1 type: varchar>)::text = '1'::text) THEN 1::numeric ELSE numeric_in(textout(($10 <param value: 1 type: varchar>)::text), 0::oid, (-1)) END

不过这个用例在openGauss及基于openGauss的商业发行版、还有原生的postgresql上,都不会报错,只有华为的GaussDB会报这个错。

看下报错堆栈

c 复制代码
(gdb) bt
#0  errstart (elevel=20, filename=0x5648320cb206 "numeric_util.cpp", lineno=175, funcname=0x5648320cb270 <adt_set_var_from_str(char const*, char const*, AdtNumericVar*)::__func__> "adt_set_var_from_str", domain=0x564831dc0004 "plpgsql-9.2") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:238
#1  0x000056482dae8ac8 in adt_set_var_from_str (str=0x7fdeea6be2a0 "a", cp=0x7fdeea6be2a0 "a", dest=0x7fdfda7c3e90) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/numeric_util.cpp:171
#2  0x000056482dac2a31 in numeric_in (fcinfo=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/numeric.cpp:343
#3  0x000056482dd18bf6 in exec_eval_func_expr_strict (op=0x7fdfd402ad90) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_interp_external.cpp:676
#4  0x000056482dd0fe56 in exec_interp_expr (state=0x7fdfd402a470, econtext=<optimized out>, isnull=<optimized out>, is_done=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_interp.cpp:279
#5  0x000056482dd8d6bd in exec_eval_switch_exprcxt (is_done=0x0, is_null=0x7fdfda7c407c, econtext=<optimized out>, expression=0x7fdfd402a470) at /usr1/GaussDBKernel/server/opengauss/src/include/executor/executor.h:427
#6  optbase_evaluate_expr (expr=<optimized out>, result_type=1700, result_typmod=-1, result_collation=0, can_ignore=<optimized out>, max_length=0, decimals=0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:5654
#7  0x000056482dd90391 in optbase_evaluate_function (context=0x7fdfda7c4340, func_tuple=0x7fdfcdab30d8, args=0x7fdeea6be398, decimals=0, max_length=0, input_collid=0, result_collid=0, result_typmod=-1, result_type=1700, funcid=1701) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:5069
#8  optbase_simplify_function (funcid=1701, result_type=1700, result_typmod=-1, result_collid=0, input_collid=0, max_length=0, decimals=0, args_p=0x7fdfda7c42b8, process_args=true, allow_non_const=true, context=0x7fdfda7c4340) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:4235
#9  0x000056482dd91127 in optbase_eval_const_expressions_funcexpr (node=0x7fdfd2067bd8, context=0x7fdfda7c4340) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:2795
#10 0x000056482dd91265 in optbase_eval_funcexpr_mutator (node=<optimized out>, context=0x7fdfda7c4340) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:2421
#11 0x000056482dd912d2 in optbase_eval_plan_funcexpr (root=0x0, node=<optimized out>, estate=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:2412
#12 0x000056482dd0e24b in exec_init_expr_rec (node=0x7fdfd2fd79f0, state=0x7fdfd2066b10, resv=0x7fdfd2066c38, resnull=0x7fdfd2066c31, parent_node=0x7fdfd2fd7530) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_internal.cpp:342
#13 0x000056482dd0d29b in exec_init_expr_rec (node=<optimized out>, state=0x7fdfd2066b10, resv=<optimized out>, resnull=<optimized out>, parent_node=0x0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_internal.cpp:817
#14 0x000056482dd09c22 in exec_build_projinfo_byflt (target_list=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=0x7fdfd2066050, input_desc=0x7fdfd3f28610) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_external.cpp:373
#15 0x000056482dd64efe in exec_assign_projinfo (planstate=0x7fdfd2066050, input_desc=0x7fdfd3f28610) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_utils.cpp:940
#16 0x000056482eb73041 in exec_assign_scanprojinfo (node=0x7fdfd2066050) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/executor/rowengine/framework/executor_framework_execscan.cpp:217
#17 0x000056482eb558db in SubqueryScanOperator::init (this=0x7fdfd2066050, node=0x7fdfd2fd7050, estate=0x7fdfd3f52050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/executor/rowengine/executor_rowopr_subqueryscan.cpp:74
#18 0x000056482eb55954 in exec_init_subqueryscan (node=0x7fdfd2fd7050, estate=0x7fdfd3f52050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/executor/rowengine/executor_rowopr_subqueryscan.cpp:87
#19 0x000056482dd2d9df in ExecInitNodeByType (node=0x7fdfd2fd7050, estate=0x7fdfd3f52050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:3913
#20 0x000056482dd2e658 in ExecInitNode (node=0x7fdfd2fd7050, estate=0x7fdfd3f52050, e_flags=16, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:4153
#21 0x000056482dd2f2bc in exec_init_plan (queryDesc=0x7fdfd3f50050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:1840
#22 0x000056482dd2fa92 in exec_standard_executor_start (queryDesc=0x7fdfd3f50050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:572
#23 0x00007fe144da177a in gs_audit_executor_start_hook (queryDesc=0x7fdfd3f50050, eflags=0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/security_plugin/security_policy_plugin.cpp:1849
#24 0x000056482e4fe1a3 in sqlcmd_explain_executor_start (query_desc=0x7fdfd3f50050, eflags=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/auto_explain.cpp:80
#25 0x000056482dd2ffec in exec_executor_start (queryDesc=0x7fdfd3f50050, eflags=0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:328
#26 0x000056482ebb03d7 in spi_pquery (from_lock=false, tcount=2, fire_triggers=<optimized out>, query_desc=0x7fdfd3f50050, cplan=0x7fdeea4fb518) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/spi/spi.cpp:2662
#27 spi_execute_plan0 (plan=0x7fdfd3f42450, param_list=0x7fdfd3fad1a0, snapshot=0x0, crosscheck_snapshot=0x0, read_only=false, fire_triggers=true, tcount=2, from_lock=false, dfx_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/spi/spi.cpp:2210
#28 0x000056482ebb0a53 in spi_execute_plan (plan=0x7fdfd3f42450, param_list=0x7fdfd3fad1a0, snapshot=0x0, crosscheck_snapshot=0x0, read_only=false, fire_triggers=true, tcount=2, from_lock=false, dfx_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/spi/spi.cpp:2535
#29 0x000056482ebb207a in SPI_execute_plan_with_paramlist (plan=0x7fdfd3f42450, params=0x7fdfd3fad1a0, read_only=<optimized out>, tcount=2) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/spi/spi.cpp:502
#30 0x000056482e070f22 in gsplsql_exec_stmt_execsql (estate=0x7fdfd3f5e5d0, stmt=0x7fdfd3ee22d0, dfx_state=0x7fe140ab0470, is_open_satetement=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_sql.cpp:972
#31 0x000056482e058144 in gsplsql_exec_stmt (estate=0x7fdfd3f5e5d0, stmt=0x7fdfd3ee22d0, dfx_state=0x7fe140ab0470) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:712
#32 0x000056482e058682 in gsplsql_exec_stmts (estate=0x7fdfd3f5e5d0, stmts=0x7fdfd3ee24a0, dfx_state=0x7fdfd3faf490) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:856
#33 0x000056482e067c0e in gsplsql_exec_stmt_fori (estate=0x7fdfd3f5e5d0, stmt=0x7fdfd3ee1d90, dfx_state=0x7fdfd3faf490) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_loop.cpp:492
#34 0x000056482e05830a in gsplsql_exec_stmt (estate=0x7fdfd3f5e5d0, stmt=0x7fdfd3ee1d90, dfx_state=0x7fdfd3faf490) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:681
#35 0x000056482e058682 in gsplsql_exec_stmts (estate=0x7fdfd3f5e5d0, stmts=0x7fdfd3ee16b8, dfx_state=0x7fdfd3faea00) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:856
#36 0x000056482e056ec4 in gsplsql_exec_stmt_block (estate=0x7fdfd3f5e5d0, block=0x7fdfd3ee3420, dfx_state=0x7fdfd3faea00) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:295
#37 0x000056482e058452 in gsplsql_exec_stmt (estate=0x7fdfd3f5e5d0, stmt=0x7fdfd3ee3420, dfx_state=0x7fdfd3faea00) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:648
#38 0x000056482e058682 in gsplsql_exec_stmts (estate=0x7fdfd3f5e5d0, stmts=0x7fdfd3ee1328, dfx_state=0x7fdfd3fad5c0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:856
#39 0x000056482e058fb0 in gsplsql_exec_stmts_save_cursor (estate=0x7fdfd3f5e5d0, stmts=0x7fdfd3ee1328, dfx_state=0x7fdfd3fad5c0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_cursor.cpp:66
#40 0x000056482e057b36 in gsplsql_exec_stmt_block (estate=0x7fdfd3f5e5d0, block=0x7fdfd3ee3828, dfx_state=0x7fdfd3fad5c0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_exec/pl_exec_stmt_block.cpp:428
#41 0x0000564830230ec5 in gsplsql_exec_function (func=0x7fdfd3f5e170, fcinfo=0x7fdfda7c6230, dynexec_anonymous_block=false, func_runtime_state=0x0, isDstoreAutonomous=false) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/plpgsql/src/pl_exec.cpp:698
#42 0x0000564830242bea in plpgsql_inline_handler (fcinfo=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/plpgsql/src/pl_handler.cpp:1124
#43 0x000056482de839c1 in OidFunctionCall2Coll (function_id=<optimized out>, collation=0, arg1=140599276921936, arg2=0, is_null=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/fmgr/fmgr.cpp:2355
#44 0x000056482e635b39 in sqlcmd_execute_do_stmt (stmt=0x7fdfd1f28ec0, atomic=<optimized out>, is_pkg_anonymous_block=97) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/functioncmds.cpp:4242
#45 0x000056482ea0f27e in sqlcmd_standard_process_utility (parse_tree=0x7fdfd1f28ec0, query_string=0x7fdfd3f34ab8 "declare\nv1 varchar2(20);\nv2 number;\nbegin\n    v1:='a';\n    for i in 1..6 loop\n        select case when v1='a' then 1 else to_number(v1) end from dual into v2;\n    raise notice '%',v2;\n    end loop;\nexception when others then \n    raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;\nend;", params=0x0, is_top_level=true, dest=0x7fdfd3f34fd8, sent_to_remote=<optimized out>, completion_tag=0x7fdfda7ca260 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:7283
#46 0x00007fe144da3759 in gsaudit_ProcessUtility_hook (parsetree=0x7fdfd1f28ec0, queryString=0x7fdfd3f34ab8 "declare\nv1 varchar2(20);\nv2 number;\nbegin\n    v1:='a';\n    for i in 1..6 loop\n        select case when v1='a' then 1 else to_number(v1) end from dual into v2;\n    raise notice '%',v2;\n    end loop;\nexception when others then \n    raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;\nend;", params=0x0, isTopLevel=<optimized out>, dest=0x7fdfd3f34fd8, sentToRemote=<optimized out>, completionTag=0x7fdfda7ca260 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/security_plugin/security_policy_plugin.cpp:856
#47 0x000056482ef7ff52 in audit_process_utility (parsetree=0x7fdfd1f28ec0, query_string=0x7fdfd3f34ab8 "declare\nv1 varchar2(20);\nv2 number;\nbegin\n    v1:='a';\n    for i in 1..6 loop\n        select case when v1='a' then 1 else to_number(v1) end from dual into v2;\n    raise notice '%',v2;\n    end loop;\nexception when others then \n    raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;\nend;", params=<optimized out>, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7fdfda7ca260 "", is_ctas=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/audit/security_auditfuncs.cpp:1512
#48 0x000056482ea1b71d in sqlcmd_process_utility (parse_tree=0x7fdfd1f28ec0, query_string=0x7fdfd3f34ab8 "declare\nv1 varchar2(20);\nv2 number;\nbegin\n    v1:='a';\n    for i in 1..6 loop\n        select case when v1='a' then 1 else to_number(v1) end from dual into v2;\n    raise notice '%',v2;\n    end loop;\nexception when others then \n    raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;\nend;", params=0x0, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7fdfda7ca260 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:1974
#49 0x000056482e9fc83f in PortalRunUtility (portal=0x7fdfcfa7a050, utilityStmt=0x7fdfd1f28ec0, isTopLevel=true, dest=0x7fdfd3f34fd8, completionTag=0x7fdfda7ca260 "") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2140
#50 0x000056482e9fe0be in PortalRunMulti (portal=0x7fdfcfa7a050, isTopLevel=true, dest=0x7fdfd3f34fd8, altdest=0x7fdfd3f34fd8, completionTag=0x7fdfda7ca260 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2326
#51 0x000056482ea022dc in PortalRun (portal=0x7fdfcfa7a050, count=9223372036854775807, isTopLevel=true, dest=0x7fdfd3f34fd8, altdest=0x7fdfd3f34fd8, completionTag=0x7fdfda7ca260 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:1501
#52 0x000056482e9e8276 in exec_simple_query (query_string=<optimized out>, msg=0x7fdfda7ca530, messageType=QUERY_MESSAGE) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:3513
#53 0x000056482e9f4e38 in gs_process_command (firstchar=<optimized out>, input_message=0x7fdfda7ca530, send_ready_for_query=0x7fdfda7ca526) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:11743
#54 0x000056482e9fa9c0 in PostgresMain (argc=<optimized out>, argv=0x7fdfd59f5b20, dbname=<optimized out>, username=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:11313
#55 0x000056482e97e2df in backend_run (port=0x7fdfda7ca890) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:12482
#56 0x000056482e9bd1b0 in gauss_db_worker_thread_main<(knl_thread_role)2> (arg=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:19086
#57 0x000056482e97e39a in internal_thread_func (args=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:20196
#58 0x00007fe14516ff1b in ?? () from /usr/lib64/libpthread.so.0
#59 0x00007fe1450a7320 in clone () from /usr/lib64/libc.so.6

f 21里找到了出错的sql

c 复制代码
(gdb) f 21
#21 0x000056482dd2f2bc in exec_init_plan (queryDesc=0x7fdfd3f50050, eflags=16) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp:1840
1840    /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_main.cpp: No such file or directory.
(gdb) p *queryDesc
$1 = {operation = CMD_SELECT, plannedstmt = 0x7fdfd405e050, utilitystmt = 0x0, sourceText = 0x7fdfd2fc3778 "select case when v1='a' then 1 else to_number(v1) end from dual", snapshot = 0x7fdfd3f3ca90, crosscheck_snapshot = 0x0, dest = 0x7fe140767020, params = 0x7fdfd3fad1a0, instrument_options = 0, pstmt_idx = 0, tupDesc = 0x0, estate = 0x7fdfd3f52050, planstate = 0x0, aj_states = 0x0, totaltime = 0x0, executed = false, opteval = false, plan_collecting = false, is_parameterized = false, support_operator_reuse = false, psrc_version = -1, stmt_name = '\000' <repeats 63 times>, bii_state = 0x0}

#19 0x000056482dd2d9df in ExecInitNodeByType起,往里层的栈就在openGauss源码里搜不到了,这意味着GaussDB可能做了特殊的优化

f 12 里跟到了T_FuncExpr节点,它的上一个节点是T_CaseExpr

c 复制代码
(gdb) f 12
#12 0x000056482dd0e24b in exec_init_expr_rec (node=0x7fdfd2fd79f0, state=0x7fdfd2066b10, resv=0x7fdfd2066c38, resnull=0x7fdfd2066c31, parent_node=0x7fdfd2fd7530) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_internal.cpp:342
342     /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/exec_expr_internal.cpp: No such file or directory.
(gdb) p *node
$3 = {type = T_FuncExpr, selec = 0}
(gdb) p *parent_node
$4 = {type = T_CaseExpr, selec = 0}

f 9 里optbase_eval_const_expressions_funcexpr 这里有个关键字const,猜测可能是常量表达式,这里需要提到一个概念,叫常量折叠,即如果一个函数是immutable/stable的,如果入参是固定的常量,不是来自于表,那么就只用计算一次,将这个函数提前算出来,得到一个新的常量,避免每行都要去进行重复的函数计算。虽然复现用例里是个plsql的变量,但是当这个查询sql执行时,它肯定是不会发生变化的,可以视为这个查询sql的常量。

c 复制代码
#8  optbase_simplify_function (funcid=1701, result_type=1700, result_typmod=-1, result_collid=0, input_collid=0, max_length=0, decimals=0, args_p=0x7fdfda7c42b8, process_args=true, allow_non_const=true, context=0x7fdfda7c4340) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/utils/opt_clauses.cpp:4235

可以看到这里调用了optbase_simplify_function (简单函数?),funcid=1701是numeric_in,即to_number(text)里调用的函数

sql 复制代码
gaussdb=# select prosrc from pg_proc where proname='to_number' and proargtypes::text='25';
                               prosrc
--------------------------------------------------------------------
  select pg_catalog.numeric_in(pg_catalog.textout($1), 0::Oid, -1)
(1 row)

gaussdb=# select proname from pg_proc where oid=1701;
  proname
------------
 numeric_in
(1 row)

到这里,大概能猜到GaussDB做了什么优化了,即在plsql中,对于单条sql语句里出现的plsql变量,也可以视为常量,使用常量折叠的优化逻辑。

把测试用例稍微改改,to_number里传入字面量,放openGauss里执行,触发openGauss的常量折叠逻辑,看是否会出现报错

sql 复制代码
openGauss=# declare
openGauss-# v1 varchar2(20);
openGauss-# v2 number;
openGauss-# begin
openGauss$#     v1:='a';
openGauss$#     for i in 1..6 loop
openGauss$#         select case when v1='a' then 1 else to_number('a') end from sys_dummy into v2;
openGauss$#     raise notice '%',v2;
openGauss$#     end loop;
openGauss$# exception when others then
openGauss$#     raise notice 'v1:%,sqlerrm:%',v1,sqlerrm;
openGauss$# end;
openGauss$# /
NOTICE:
QueryPlan

----------------------------NestLevel:0----------------------------
Query Text:  select pg_catalog.numeric_in(pg_catalog.textout($1), 0::Oid, -1)
Name: dn_6001
Result  (cost=0.00..0.01 rows=1 width=0)
  Output: numeric_in(textout($1), 0::oid, (-1))
param1 value: a type: text


CONTEXT:  SQL function "to_number" statement 1
referenced column: to_number
SQL statement "select case when v1='a' then 1 else to_number('a') end from sys_dummy"
PL/pgSQL function inline_code_block line 6 at SQL statement
NOTICE:  v1:a,sqlerrm:invalid input syntax for type numeric: "a"
ANONYMOUS BLOCK EXECUTE

果不其然,v1='a'的值不确定,没有先进行计算,但因为to_number('a')一定是个明确的值,就先进行了常量折叠,触发了报错。

根因总结

GaussDB相较于openGauss,新增支持了对于含有plsql变量的表达式也能支持常量折叠的优化逻辑,以此优化了存储过程内包含函数的sql执行性能,但是由于常量折叠发生在实际执行甚至执行计划生成之前,导致case when 语句里不会执行到的分支逻辑也会提前进行折叠,从而报错。

这不算BUG,因为如果不结合实际业务、不结合上下文,单看SQL,有时候的确无法判断case when会走哪个分支,这种SQL其实算是不规范SQL了。只是由于GaussDB引入了这个性能优化的特性,可能会导致之前不报错的sql在多执行几次后也会报错了。

规避措施

目前观测到在 505.2.0.SPC0100、506.0.0.SPC0100 版本里均存在这个现象,在503.1.0.SPC1700版本里无此现象。

两种处理方法:

1.去掉 from dual ,让其走单个表达式执行,而非可能有多行结果的完整SQL执行。不过如果是select into 两个变量,就变成了单行执行,此时去掉from dual依然还会报错。

2.加上hint /*+use_cplan*/ ,使其每次都强制走cplan

上面两种推荐用1,即去掉from dual 的方式,这样性能最好,但如果的确是要查询表或者into多个变量的情况,则只能使用加hint的方式了。

但是,最理想的代码,还是不要出现这种类型转换,无论是表里的数据作为参数还是使用变量作为参数,只要值可能为不能转换成数字的字符串,就不要尝试去对其转换成数字!

相关推荐
九章-1 天前
一库平替,融合致胜:国产数据库的“统型”范式革命
数据库·融合数据库
2401_838472511 天前
使用Scikit-learn构建你的第一个机器学习模型
jvm·数据库·python
u0109272711 天前
使用Python进行网络设备自动配置
jvm·数据库·python
wengqidaifeng1 天前
数据结构---顺序表的奥秘(下)
c语言·数据结构·数据库
what丶k1 天前
SpringBoot3 配置文件使用全解析:从基础到实战,解锁灵活配置新姿势
java·数据库·spring boot·spring·spring cloud
Code blocks1 天前
kingbase数据库集成Postgis扩展
数据库·后端
天下·第二1 天前
达梦数据库适配
android·数据库·adb
Dxy12393102161 天前
MySQL INSERT ... ON DUPLICATE KEY UPDATE 批量更新详解
数据库·mysql
定偶1 天前
MySQL知识点
android·数据结构·数据库·mysql
醒过来摸鱼1 天前
Redis 服务器线程与事件循环解析
服务器·数据库·redis