整个表147M,执行时一个CPU耗尽, buffer gets 超过1个G, 启用并行也没有用
今天开发的同事问有个表上的数据为什么删不掉?我看了一下,也就不到100000条数据,表上有外键,等了5分钟hang在那里,时间原因,我对表上的外键禁用后,瞬间删除。
现在来还原这个问题。
sys@ANBOB>select count(*) from bjhr.doctor_exam_member;
COUNT(*)
92102
sys@ANBOB>delete bjhr.doctor_exam_member;
--hang
--等待10分钟都未执行完,检查表的外键信息
bjhr@ANBOB>
SELECT /*+RULE*/
D.CONSTRAINT_NAME pk_name,-- d.table_name,
D.TABLE_NAME || '.' || D.COLUMN_NAME pk_column,
A.CONSTRAINT_TYPE,
B.CONSTRAINT_NAME fk_name,
B.TABLE_NAME || '.' || B.COLUMN_NAME fk_column
FROM user_constraints a
JOIN user_cons_columns b
ON a.constraint_name = b.constraint_name AND a.owner = b.owner
JOIN user_constraints c
ON A.R_CONSTRAINT_NAME = C.CONSTRAINT_NAME AND A.R_OWNER = c.owner
JOIN user_cons_columns d
ON c.constraint_name = d.constraint_name AND c.owner = d.owner
WHERE D.table_name = 'DOCTOR_EXAM_MEMBER'
bjhr@ANBOB>/
PK_NAME PK_COLUMN C FK_NAME FK_COLUMN
-------------------- ---------------------------------------- - ------------------------------ -------------------------------------------------------
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R FK_RESULT_N_REFERENCE_DOCTOR RESULT_NOTIFICATION_RECORD.DOCTOR_EXAM_MEMBER_ID
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R RESULT_RE_DOCTOR_MEMBER DOCTOR_EXAM_RESULT.DOCTOR_EXAM_MEMBER_ID
--有外键,之前已对子表进行过删除,否则会报错ORA-02266
delete RESULT_NOTIFICATION_RECORD;
delete DOCTOR_EXAM_RESULT;
commit;
--下面开始分析,创建新的session
sys@ANBOB>select xidsqn,xidusn,object_id,session_id,locked_mode from v$locked_object;
XIDSQN XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
-------------------- -------------------- -------------------- -------------------- --------------------
2102 203 1639631 2290 3
2102 203 1639572 2290 3
sys@ANBOB>select object_name,object_type from dba_objects where object_id in(1639631,1639572);
OBJECT_NAME OBJECT_TYPE
------------------------------ -------------------
DOCTOR_EXAM_MEMBER TABLE
DOCTOR_EXAM_RESULT TABLE
sys@ANBOB>select event,p1,p2,p1text,p2text,seconds_in_wait,state from v$session_wait where sid=2290;
EVENT P1 P2 P1TEXT P2TEXT SECONDS_IN_WAIT STATE
------------------------------ ----------- ----- -------------------- -------------------- -------------------- -------------------
latch: shared pool 1611704464 307 address number 213 WAITED SHORT TIME
--trace hanganalyze and systemstate
alter session set events 'immediate trace name systemstate level 266';
alter session set events 'immediate trace name hanganalyze level 3';
--hanganalyze trace
===============================================================================
Chains most likely to have caused the hang:
[a] Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
[b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
===============================================================================
Sessions in an involuntary wait or not in a wait:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (ANBOB.ANBOB)
os id: 27158
process id: 94, oracle@dev-db (TNS V1-V3)
session id: 2290
session serial #: 7981
}
is not in a wait:
{
last wait: 11 min 0 sec ago
blocking: 0 sessions
wait history:
1. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 183 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #1 and #2: 1.586255 sec
2. event: 'latch: shared pool'
time waited: 0.000032 sec
wait id: 182 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #2 and #3: 0.133830 sec
3. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 181 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
}
Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
--对systemstate 没发现可疑信息
[oracle@dev-db ~]$ awk -f ass109.awk /oracle/diag/rdbms/ANBOB/ANBOB/trace/ANBOB_ora_23020.trc
--- 奇怪为什么会发生在latch:shared pool上? 应该是sql解析和shared pool相关的事件,随后结束delete,做10046 观察究竟
sys@ANBOB>oradebug setmypid;
Statement processed.
sys@ANBOB>oradebug event 10046 trace name context forever,level 12
Statement processed.
sys@ANBOB>delete bjhr.doctor_exam_member;
92102 rows deleted.
sys@ANBOB>oradebug tracefile_name
/u01/app/oracle/diag/rdbms/anbob/anbob/trace/anbob_ora_7784.trc
sys@ANBOB>oradebug event 10046 trace name context off;
Statement processed.
--- 格式化trace,终于发现了答案.
delete bjhr.doctor_exam_member
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 47.30 48.39 201 222 657611 92102
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 47.31 48.39 201 222 657611 92102
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
Disk file operations I/O 2 0.00 0.00
db file scattered read 26 0.00 0.00
db file sequential read 24 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00
********************************************************************************
-- check deferred objects
select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg,
maxext_stg, maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg,
maxtra_stg, optimal_stg, maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg,
cmpflag_stg, cmplvl_stg
from
deferred_stg$ where obj# =:1
********************************************************************************
select /*+ all_rows */ count(1)
from
"BJHR"."RESULT_NOTIFICATION_RECORD" where "DOCTOR_EXAM_MEMBER_ID" = :1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 92102 11.31 11.34 0 0 0 0
Fetch 92102 0.63 0.64 0 0 0 92102
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 184205 11.95 11.99 0 0 0 92102
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=0 pr=0 pw=0 time=43 us)
0 0 0 TABLE ACCESS FULL RESULT_NOTIFICATION_RECORD (cr=0 pr=0 pw=0 time=12 us cost=3 size=5 card=1)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
latch: shared pool 2 0.00 0.00
********************************************************************************
select /*+ all_rows */ count(1)
from
"BJHR"."DOCTOR_EXAM_RESULT" where "DOCTOR_EXAM_MEMBER_ID" = :1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 92102 6.97 7.11 0 0 0 0
Fetch 92102 1012.96 1016.14 0 566243096 92102 92102
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 184205 1019.93 1023.25 0 566243096 92102 92102
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=6148 pr=0 pw=0 time=30196 us)
0 0 0 TABLE ACCESS FULL DOCTOR_EXAM_RESULT (cr=6148 pr=0 pw=0 time=30184 us cost=1647 size=5 card=1)
********************************************************************************
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 39 0.00 0.01 0 0 0 0
Execute 184248 18.29 18.46 0 0 0 0
Fetch 184276 1013.60 1016.79 0 566243218 92102 184238
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 368563 1031.90 1035.27 0 566243218 92102 184238
1129 elapsed seconds in trace file.
TIP:
在删除doctor_exam_member表时,检查了他的所有参照表(子表),然后对doctor_exam_member表的每次记录都要去参照表查询是否存在,此时刚好参考表的外键列上并无索引,导致每一行记录都会导致FTS(full table scan),这也是查询v$session_event时偶尔出现latch: CBC (hot block)的原因。
你可能疑问子表数据都delete了为什么还查询这么久?我做个小测试
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT;
COUNT(*)
0
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
BYTES BLOCKS
50331648 6144
sys@ORA10GR2>set autot trace stat
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
0 recursive calls
0 db block gets
6040 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result enable row movement;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.DOCTOR_EXAM_RESULT shrink space;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result disable row movement;
Table altered.
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
BYTES BLOCKS
196608 24
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
TIP:
FTS查询会遍历表segment 已格式化过所有data block.
Summary:
在建有外键约束的子表列上需要创建索引,对子表全表删除时可以采用truncate 或delete(有外键不能truncate时)后对表进行shrink space操作,或删除父表前对子表的外键约束做Disable.