The whole table is only 147 MB, yet the delete saturated one CPU, buffer gets exceeded 1 G, and enabling parallel execution did not help.

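Today a colleague from the development team asked why the data in one table could not be deleted. I took a look: the table holds fewer than 100,000 rows and has foreign keys on it, and the delete had been hanging for 5 minutes. Short on time, I disabled the foreign keys on the table, after which the delete finished instantly.
Now let's reproduce the problem.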
sys@ANBOB>select count(*) from bjhr.doctor_exam_member;
COUNT(*)
92102
sys@ANBOB>delete bjhr.doctor_exam_member;
-- hangs
-- still not finished after 10 minutes; check the foreign keys on the table
bjhr@ANBOB>
SELECT /*+RULE*/
D.CONSTRAINT_NAME pk_name,-- d.table_name,
D.TABLE_NAME || '.' || D.COLUMN_NAME pk_column,
A.CONSTRAINT_TYPE,
B.CONSTRAINT_NAME fk_name,
B.TABLE_NAME || '.' || B.COLUMN_NAME fk_column
FROM user_constraints a
JOIN user_cons_columns b
ON a.constraint_name = b.constraint_name AND a.owner = b.owner
JOIN user_constraints c
ON A.R_CONSTRAINT_NAME = C.CONSTRAINT_NAME AND A.R_OWNER = c.owner
JOIN user_cons_columns d
ON c.constraint_name = d.constraint_name AND c.owner = d.owner
WHERE D.table_name = 'DOCTOR_EXAM_MEMBER'
bjhr@ANBOB>/
PK_NAME              PK_COLUMN                                C FK_NAME                        FK_COLUMN
-------------------- ---------------------------------------- - ------------------------------ -------------------------------------------------------
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R FK_RESULT_N_REFERENCE_DOCTOR   RESULT_NOTIFICATION_RECORD.DOCTOR_EXAM_MEMBER_ID
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R RESULT_RE_DOCTOR_MEMBER        DOCTOR_EXAM_RESULT.DOCTOR_EXAM_MEMBER_ID
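-- Foreign keys exist. The child tables had already been emptied beforehand; otherwise the delete would fail with ORA-02292 (child record found)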
delete RESULT_NOTIFICATION_RECORD;
delete DOCTOR_EXAM_RESULT;
commit;
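A check that fits at this point, sketched under the assumption that the session can read DBA_IND_COLUMNS: list any index whose leading column is the foreign key column of the two child tables shown above. The owner and object names are taken from the constraint output; if the query returns no rows, neither foreign key column is indexed.

```sql
-- Sketch: look for indexes that lead with the foreign key column
-- on the two child tables referencing DOCTOR_EXAM_MEMBER.
SELECT table_name,
       index_name,
       column_name,
       column_position
  FROM dba_ind_columns
 WHERE table_owner = 'BJHR'
   AND table_name IN ('RESULT_NOTIFICATION_RECORD', 'DOCTOR_EXAM_RESULT')
   AND column_name = 'DOCTOR_EXAM_MEMBER_ID'
   AND column_position = 1;
```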
-- Start the analysis from a new session
sys@ANBOB>select xidsqn,xidusn,object_id,session_id,locked_mode from v$locked_object;
XIDSQN XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
-------------------- -------------------- -------------------- -------------------- --------------------
2102 203 1639631 2290 3
2102 203 1639572 2290 3
sys@ANBOB>select object_name,object_type from dba_objects where object_id in(1639631,1639572);
OBJECT_NAME OBJECT_TYPE
------------------------------ -------------------
DOCTOR_EXAM_MEMBER TABLE
DOCTOR_EXAM_RESULT TABLE
sys@ANBOB>select event,p1,p2,p1text,p2text,seconds_in_wait,state from v$session_wait where sid=2290;
EVENT P1 P2 P1TEXT P2TEXT SECONDS_IN_WAIT STATE
------------------------------ ----------- ----- -------------------- -------------------- -------------------- -------------------
latch: shared pool 1611704464 307 address number 213 WAITED SHORT TIME
--trace hanganalyze and systemstate
alter session set events 'immediate trace name systemstate level 266';
alter session set events 'immediate trace name hanganalyze level 3';
--hanganalyze trace
===============================================================================
Chains most likely to have caused the hang:
[a] Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
[b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
===============================================================================
Sessions in an involuntary wait or not in a wait:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (ANBOB.ANBOB)
os id: 27158
process id: 94, oracle@dev-db (TNS V1-V3)
session id: 2290
session serial #: 7981
}
is not in a wait:
{
last wait: 11 min 0 sec ago
blocking: 0 sessions
wait history:
1. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 183 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #1 and #2: 1.586255 sec
2. event: 'latch: shared pool'
time waited: 0.000032 sec
wait id: 182 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #2 and #3: 0.133830 sec
3. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 181 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
}
Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
-- nothing suspicious found in the systemstate dump
[oracle@dev-db ~]$ awk -f ass109.awk /oracle/diag/rdbms/ANBOB/ANBOB/trace/ANBOB_ora_23020.trc
-- Strange: why would the waits be on latch: shared pool? That event should be related to SQL parsing and the shared pool. I then cancelled the delete and reran it under a 10046 trace to see what was really going on.
sys@ANBOB>oradebug setmypid;
Statement processed.
sys@ANBOB>oradebug event 10046 trace name context forever,level 12
Statement processed.
sys@ANBOB>delete bjhr.doctor_exam_member;
92102 rows deleted.
sys@ANBOB>oradebug tracefile_name
/u01/app/oracle/diag/rdbms/anbob/anbob/trace/anbob_ora_7784.trc
sys@ANBOB>oradebug event 10046 trace name context off;
Statement processed.
-- Formatted the trace and finally found the answer.

```
delete bjhr.doctor_exam_member

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     47.30      48.39        201        222     657611       92102
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     47.31      48.39        201        222     657611       92102

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file scattered read                         26        0.00          0.00
  db file sequential read                        24        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************
-- check deferred objects
select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg,
  maxext_stg, maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg,
  maxtra_stg, optimal_stg, maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg,
  cmpflag_stg, cmplvl_stg
from deferred_stg$ where obj# =:1
********************************************************************************
select /*+ all_rows */ count(1)
from "BJHR"."RESULT_NOTIFICATION_RECORD" where "DOCTOR_EXAM_MEMBER_ID" = :1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  92102     11.31      11.34          0          0          0           0
Fetch    92102      0.63       0.64          0          0          0       92102
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   184205     11.95      11.99          0          0          0       92102

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=0 pr=0 pw=0 time=43 us)
         0          0          0   TABLE ACCESS FULL RESULT_NOTIFICATION_RECORD (cr=0 pr=0 pw=0 time=12 us cost=3 size=5 card=1)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: shared pool                              2        0.00          0.00
********************************************************************************
select /*+ all_rows */ count(1)
from "BJHR"."DOCTOR_EXAM_RESULT" where "DOCTOR_EXAM_MEMBER_ID" = :1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  92102      6.97       7.11          0          0          0           0
Fetch    92102   1012.96    1016.14          0  566243096      92102       92102
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   184205   1019.93    1023.25          0  566243096      92102       92102

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=6148 pr=0 pw=0 time=30196 us)
         0          0          0   TABLE ACCESS FULL DOCTOR_EXAM_RESULT (cr=6148 pr=0 pw=0 time=30184 us cost=1647 size=5 card=1)
********************************************************************************
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       39      0.00       0.01          0          0          0           0
Execute 184248     18.29      18.46          0          0          0           0
Fetch   184276   1013.60    1016.79          0  566243218      92102      184238
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   368563   1031.90    1035.27          0  566243218      92102      184238

1129 elapsed seconds in trace file.
```

TIP: When rows are deleted from doctor_exam_member, Oracle checks every referencing (child) table: for each row deleted from doctor_exam_member it queries the child tables to see whether a matching row exists. The foreign key columns of the child tables happen to have no index, so every deleted row triggers an FTS (full table scan) of the child table. The trace confirms it: 92,102 executions × 6,148 consistent gets per scan = 566,243,096 gets, exactly the query count reported for the Fetch calls on DOCTOR_EXAM_RESULT. This is also why latch: CBC (cache buffers chains, hot block) waits occasionally showed up when querying v$session_event.

You may wonder why the lookups take so long when the child tables have already been emptied with delete. Here is a small test.

sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT;
COUNT(*)
--------------------
0
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
BYTES BLOCKS
-------------------- --------------------
50331648 6144
sys@ORA10GR2>set autot trace stat
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6040 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result enable row movement;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.DOCTOR_EXAM_RESULT shrink space;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result disable row movement;
Table altered.
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
BYTES BLOCKS
-------------------- --------------------
196608 24
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

TIP: A full table scan reads every data block in the table segment that has ever been formatted, even when the table no longer contains any rows. Deleting the rows does not release those blocks; shrink space does.

Summary: Create an index on the foreign key columns of child tables. To empty a child table, use truncate, or, when truncate is not possible because of foreign keys, delete followed by shrink space. Alternatively, disable the foreign key constraints on the child tables before deleting from the parent table.
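
The first recommendation in the summary can be sketched as DDL. This is a minimal sketch, not the statements actually run in the original case: the table, column, and constraint names come from the constraint query output earlier in the post, while the index names `idx_rnr_member_id` and `idx_der_member_id` are hypothetical.

```sql
-- Sketch: index the foreign key column on each child table so that the
-- recursive lookup per deleted parent row can use an index instead of
-- a full table scan. Index names are hypothetical.
CREATE INDEX bjhr.idx_rnr_member_id
    ON bjhr.result_notification_record (doctor_exam_member_id);

CREATE INDEX bjhr.idx_der_member_id
    ON bjhr.doctor_exam_result (doctor_exam_member_id);

-- Alternative from the summary: disable the foreign keys before the
-- delete on the parent table and re-enable them afterwards.
ALTER TABLE bjhr.result_notification_record
    DISABLE CONSTRAINT fk_result_n_reference_doctor;
ALTER TABLE bjhr.doctor_exam_result
    DISABLE CONSTRAINT result_re_doctor_member;

DELETE bjhr.doctor_exam_member;
COMMIT;

ALTER TABLE bjhr.result_notification_record
    ENABLE CONSTRAINT fk_result_n_reference_doctor;
ALTER TABLE bjhr.doctor_exam_result
    ENABLE CONSTRAINT result_re_doctor_member;
```

The two approaches are alternatives: with the indexes in place the constraints can stay enabled, which is usually the safer choice on a table that is still in use.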