我们的文章会在微信公众号IT民工的龙马人生和博客网站( www.htz.pw )同步更新 ,欢迎关注收藏,也欢迎大家转载,但是请在文章开始地方标注文章出处,谢谢!
由于博客中有大量代码,通过页面浏览效果更佳。
1 环境介绍与现象描述
数据库多块读很慢,单块读、写都很正常,多块读的平均响应时间是单块都的10倍。
本系统数据文件存放在VXFS上面,数据库是从9i通过DG方式升级到11G,9i使用的是raw.
2 分析与测试过程
2.1 AWR中收集单块读与多块读的平均响应时间
下面是从AWR中统计每个快照的单块读与多块读的响应时间
SQL>
>SQL> col EVENT_NAME for a30
SQL> col total_waits for 99999999999999
SQL> col total_time_s for 999999999.999
SQL> col avg_time_ms for 999999999.999
SQL> BREAK ON inst SKIP 1
SQL> select instance_number inst,
2 to_char(time, 'YYYY-MM-DD HH24:MI:SS') time,
3 event_name,
4 sum(delta_total_waits) total_waits,
5 round(sum(delta_time_waited / 1000000), 3) total_time_s,
6 round(sum(delta_time_waited) /
7 decode(sum(delta_total_waits), 0, null, sum(delta_total_waits)) / 1000,
8 3) avg_time_ms
9 from (select hse.instance_number,
10 hse.snap_id,
11 trunc(sysdate - 30 + 1) +
12 trunc((cast(hs.begin_interval_time as date) -
13 (trunc(sysdate - 10 + 1))) * 24 /
14 (1)) * (1) / 24 time,
15 EVENT_NAME,
16 (lead(TOTAL_WAITS, 1)
17 over(partition by hse.instance_number,
18 hs.STARTUP_TIME,
19 EVENT_NAME order by hse.snap_id)) - TOTAL_WAITS delta_total_waits,
20 (lead(TIME_WAITED_MICRO, 1)
21 over(partition by hse.instance_number,
22 hs.STARTUP_TIME,
23 EVENT_NAME order by hse.snap_id)) - TIME_WAITED_MICRO delta_time_waited
24 from DBA_HIST_SYSTEM_EVENT hse, DBA_HIST_SNAPSHOT hs
25 where hse.snap_id = hs.snap_id
26 and hs.begin_interval_time >= trunc(sysdate) - 30 + 1
27 and hse.EVENT_NAME in ('db file sequential read','db file scattered read')) a
28 group by instance_number, time, event_name
29 order by 1, 2, 3;
INST TIME EVENT_NAME TOTAL_WAITS TOTAL_TIME_S AVG_TIME_MS
---------- ------------------- ------------------------------ --------------- -------------- --------------
1 2016-01-21 22:00:00 db file scattered read 116532 8003.375 68.680
2016-01-21 22:00:00 db file sequential read 451345 2617.070 5.798
2016-01-21 23:00:00 db file scattered read 509375 60848.652 119.457
2016-01-21 23:00:00 db file sequential read 1032694 9415.338 9.117
2016-01-22 00:00:00 db file scattered read 511267 55300.596 108.164
2016-01-22 00:00:00 db file sequential read 665804 5953.620 8.942
2016-01-22 01:00:00 db file scattered read 212010 13539.024 63.860
2016-01-22 01:00:00 db file sequential read 368793 3063.742 8.307
2016-01-22 02:00:00 db file scattered read 161036 19300.935 119.855
2016-01-22 02:00:00 db file sequential read 144915 1313.259 9.062
2016-01-22 03:00:00 db file scattered read 646679 84395.856 130.507
2016-01-22 03:00:00 db file sequential read 335820 3152.971 9.389
2016-01-22 04:00:00 db file scattered read 617742 67478.919 109.235
2016-01-22 04:00:00 db file sequential read 288044 2563.102 8.898
2016-01-22 05:00:00 db file scattered read 493513 59254.417 120.067
2016-01-22 05:00:00 db file sequential read 991351 9743.101 9.828
2016-01-22 06:00:00 db file scattered read 467135 47366.993 101.399
2016-01-22 06:00:00 db file sequential read 1511804 13600.312 8.996
2016-01-22 07:00:00 db file scattered read 305532 32673.273 106.939
2016-01-22 07:00:00 db file sequential read 1451034 13513.703 9.313
2016-01-22 08:00:00 db file scattered read 469331 33309.734 70.973
2016-01-22 08:00:00 db file sequential read 1586451 12038.511 7.588
2016-01-22 09:00:00 db file scattered read 409206 26532.131 64.838
2016-01-22 09:00:00 db file sequential read 1876402 11685.365 6.228
2016-01-22 10:00:00 db file scattered read 218379 15000.749 68.691
2016-01-22 10:00:00 db file sequential read 1012738 7195.984 7.105
2016-01-22 11:00:00 db file scattered read 280045 27214.605 97.179
2016-01-22 11:00:00 db file sequential read 775662 5751.906 7.415
2016-01-22 12:00:00 db file scattered read 271739 21270.681 78.276
2016-01-22 12:00:00 db file sequential read 1013600 6862.144 6.770
2016-01-22 13:00:00 db file scattered read 91427 8502.492 92.998
2016-01-22 13:00:00 db file sequential read 994440 6992.188 7.031
2016-01-22 14:00:00 db file scattered read 143148 6822.649 47.662
2016-01-22 14:00:00 db file sequential read 897827 4978.166 5.545
2016-01-22 15:00:00 db file scattered read 80698 4790.735 59.366
2016-01-22 15:00:00 db file sequential read 507671 2940.978 5.793
.................
2.2 手动采集多块读与物理读
以为是AWR收集的数据不准确,所以下面自己写个脚本来收集
--1,创建表
plain_text
drop table system.htz_system_event ;
drop sequence system.htz_system_event_seq;
create sequence system.htz_system_event_seq;
create table system.htz_system_event(start_time date,id number,name varchar2(64),TOTAL_WAITS number,TOTAL_TIMEOUTS number,TIME_WAITED number,AVERAGE_WAIT number,TIME_WAITED_MICRO number,TOTAL_WAITS_FG number,TOTAL_TIMEOUTS_FG number,TIME_WAITED_FG number,AVERAGE_WAIT_FG number,TIME_WAITED_MICRO_FG number) tablespace ODS_MODEL_SETT;
--2,自动作业插入数据,每10分钟插入一次
VARIABLE
BEGIN
DBMS_JOB.SUBMIT (
:jobno,
'DECLARE
v_sequence NUMBER;
BEGIN
SELECT system.htz_system_event_seq.nextval INTO v_sequence FROM DUAL;
insert into system.htz_system_event select sysdate,v_sequence,event,TOTAL_WAITS,TOTAL_TIMEOUTS,TIME_WAITED,AVERAGE_WAIT,TIME_WAITED_MICRO,TOTAL_WAITS_FG,TOTAL_TIMEOUTS_FG,TIME_WAITED_FG,AVERAGE_WAIT_FG,TIME_WAITED_MICRO_FG from v$system_event where wait_class in (''User I/O'',''System I/O'');
END;',
SYSDATE,
'SYSDATE + 10/1440');
COMMIT;
END;
/
PL/SQL procedure successfully completed.
PRINT jobno
SQL> PRINT jobno
JOBNO
--------------------
703
SQL> set echo off
SQL> set lines 200 pages 3000 heading on
SQL> col name for a30
SQL> col start_time for a11
SQL> col TOTAL_WAITS for 9999999999
SQL> col lag_total_waits for 9999999999
SQL> col TIME_WAITED_MICRO for 9999999999
SQL> col lag_TIME_WAITED_MICRO for 9999999999
SQL> col id for 999999
SQL> col AVERAGE_WAIT for 9999.99
SQL> col AVG_WAIT for 9999.99
SQL> SELECT TO_CHAR (start_time, 'dd hh24:mi:ss') start_time,
2 id,
3 NAME,
4 TRUNC (a.TOTAL_WAITS / 10000) TOTAL_WAITS,
5 TRUNC (LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id) / 1000)
6 lag_total_waits,
7 TRUNC (a.TIME_WAITED_MICRO / 1000000) TIME_WAITED_MICRO,
8 TRUNC (
9 LAG (TIME_WAITED_MICRO) OVER (PARTITION BY name ORDER BY id)
10 / 1000000)
11 lag_TIME_WAITED_MICRO,
12 a.AVERAGE_WAIT,
13 ROUND (
14 DECODE (
15 total_waits
16 - LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id),
17 0, NULL,
18 ( ( TIME_WAITED_MICRO
19 - LAG (TIME_WAITED_MICRO)
20 OVER (PARTITION BY name ORDER BY id))
21 / ( total_waits
22 - LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id))
23 / 1000)),
24 2)
25 avg_wait
26 FROM SYSTEM.htz_system_event a
27 WHERE name IN ('db file scattered read')
28 ORDER BY name, start_time
29 /
START_TIME ID NAME TOTAL_WAITS LAG_TOTAL_WAITS TIME_WAITED_MICRO LAG_TIME_WAITED_MICRO AVERAGE_WAIT AVG_WAIT
----------- ------- ------------------------------ ----------- --------------- ----------------- --------------------- ------------ --------
19 11:18:55 1 db file scattered read 160921 124946964 7.76
19 11:28:56 2 db file scattered read 160925 1609219 124950029 124946964 7.76 93.00
19 11:38:57 3 db file scattered read 160927 1609252 124952805 124950029 7.76 107.95
19 11:48:59 4 db file scattered read 160932 1609278 124955488 124952805 7.76 53.90
19 11:59:00 5 db file scattered read 160935 1609328 124957105 124955488 7.76 60.11
19 12:09:02 6 db file scattered read 160939 1609355 124961852 124957105 7.76 108.29
19 12:19:04 7 db file scattered read 160945 1609398 124968591 124961852 7.76 127.37
19 12:29:05 8 db file scattered read 160949 1609451 124974549 124968591 7.76 124.38
19 12:39:07 9 db file scattered read 160953 1609499 124979139 124974549 7.76 131.63
19 12:49:08 10 db file scattered read 160957 1609534 124983884 124979139 7.77 125.27
19 12:59:10 11 db file scattered read 160959 1609572 124986769 124983884 7.77 116.43
19 13:09:12 12 db file scattered read 160969 1609597 124995455 124986769 7.77 90.32
19 13:19:13 13 db file scattered read 160974 1609693 125001595 124995455 7.77 115.52
19 13:29:14 14 db file scattered read 160980 1609746 125008342 125001595 7.77 115.31
19 13:39:16 15 db file scattered read 160984 1609805 125012003 125008342 7.77 92.16
19 13:49:17 16 db file scattered read 160988 1609844 125016167 125012003 7.77 105.47
19 13:59:18 17 db file scattered read 160990 1609884 125018127 125016167 7.77 96.30
19 14:09:20 18 db file scattered read 160994 1609904 125023065 125018127 7.77 109.13
19 14:19:21 19 db file scattered read 160997 1609949 125025060 125023065 7.77 80.29
19 14:29:23 20 db file scattered read 161003 1609974 125028340 125025060 7.77 58.06
19 14:39:24 21 db file scattered read 161008 1610031 125030265 125028340 7.77 37.58
19 14:49:25 22 db file scattered read 161008 1610082 125030809 125030265 7.77 89.22
19 14:59:27 23 db file scattered read 161009 1610088 125031182 125030809 7.77 86.79
这里手动收集出来跟AWR报告差不多.
2.3 查询文件的碎片
环境中使用的VXFS,所以我们查询一下文件的碎片
/opt/VRTS/bin>./fsmap
./fsmap> Volume Extent Type File Offset Extent Size File
vol_data12 Data 0 Bytes 16.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 16.00 KB 128.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 144.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 176.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 208.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 240.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 272.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 304.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 336.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 368.00 KB 24.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 392.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 424.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 456.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 488.00 KB 32.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 520.00 KB 128.00 KB /data12/oradata/dblv32g_0225.dbf
vol_data12 Data 648.00 KB 128.00 KB /data12/oradata/dblv32g_0225.dbf
............
这里看到文件系统上面每个区的大小都是32KB,文件存在碎片,并且很严重。在VXFS里面文件的空间分配其实跟ORACLE一样的,连续的块构成区,但是区于区之间有可能不连续的。ORACLE一次IO最大是1M,如果一次多块读1M,需要读32个VXFS里面的区,那么一次ORACLE的IO在VXFS里面间接的变成了32次IO。
2.4 手动创建表空间,查看数据文件的碎片
plain_text
下面手动创建一个10G的数据文件,看看文件的碎片如何。
plain_text
SQL> create tablespace htz_test datafile '/data14/oradata/htz_test01.dbf' size 10G autoextend off;
Tablespace created.
[htz.pw:oracle]/oracle/app/oracle/admin/htz/htz>/opt/VRTS/bin/fsmap -aH /data14/oradata/htz_test01.dbf
Volume Extent Type File Offset Extent Size File
vol_data14 Data 0 Bytes 10.00 GB /data14/oradata/htz_test01.dbf
[htz.pw:oracle]/oracle/app/oracle/admin/htz/htz>ls -l /data14/oradata/htz_test01.dbf
-rw-r----- 1 oracle dba 10737426432 Feb 19 14:36 /data14/oradata/htz_test01.dbf
plain_text
可以发现10G的数据文件,只生成了一个区,跟我们之前的数据文件相差很大。
2.5 创建表用于测试IO
plain_text
SQL> create table system.htz_test_table nologging tablespace htz_test parallel 5 as select /*+ parallel(a 5)*/* from HTZ.HTZHTZHTZ a where rownum <1000000;
SQL> create table system.htz_test_table1 tablespace ODS_MODEL as select * from system.htz_test_table;
2.5.1 通过TRUSS,10046来对上面2张表进行分析
旧表空间的表
plain_text
SQL> alter session set "_serial_direct_read"='NEVER';
Session altered.
Elapsed: 00:00:00.00
SQL> @trace_10046_my.sql
Statement processed.
Statement processed.
/oracle/app/oracle/diag/rdbms/puods/puhtz.pw/trace/puhtz.pw_ora_63111800.trc
'oradebug event 10046 trace name context off'
select count(*) from system.htz_test_table1;
truss -Ddfo /expdata/63111800.log -p 63111800
SQL> select count(*) from system.htz_test_table1;
COUNT(*)
----------
63999936
Elapsed: 00:11:18.68
这里看到SQL执行SQL是11分钟
plain_text
SQL> select * from v$session_event where sid in (select sid from v$mystat) ;
SID EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT MAX_WAIT TIME_WAITED_MICRO EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
1659 Disk file operations I/O 528 0 1 0 0 7621 166678035 1740759767 8 User I/O
1659 log file sync 1 0 0 .06 0 596 1328744198 3386400367 5 Commit
1659 db file sequential read 4 0 3 .66 1 26371 2652584166 1740759767 8 User I/O
1659 db file scattered read 4543 0 65910 14.51 51 659102331 506183215 1740759767 8 User I/O
1659 SQL*Net message to client 18 0 0 0 0 98 2067390145 2000153315 7 Network
1659 SQL*Net message from client 17 0 24352 1432.5 7812 243524787 1421975091 2723168908 6 Idle
1659 events in waitclass Other 7 0 0 .05 0 3262 1736664284 1893977003 0 Other
7 rows selected.
平均的多块读的时间是14.51 ,单块读的时间是0.66
新表空间的表
plain_text
truss -Ddfo /expdata/45089246.log -p 45089246
SQL> @myspid
SPID
------------------------
45089246
SQL> @mysid.sql
USERENV('SID')
--------------
1659
SQL> alter session set "_serial_direct_read"='NEVER';
Session altered.
SQL> set lines 200
SQL> col wait_class for a20
SQL> col event for a40
SQL> select * from v$session_event where sid in (select sid from v$mystat) ;
SID EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT MAX_WAIT TIME_WAITED_MICRO EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
1659 Disk file operations I/O 1 0 0 0 0 12 166678035 1740759767 8 User I/O
1659 log file sync 1 0 0 .1 0 1031 1328744198 3386400367 5 Commit
1659 SQL*Net message to client 11 0 0 0 0 24 2067390145 2000153315 7 Network
1659 SQL*Net message from client 10 0 2594 259.36 935 25935800 1421975091 2723168908 6 Idle
SQL> SQL> set timing on
SQL> @trace_10046_my.sql
Statement processed.
Statement processed.
/oracle/app/oracle/diag/rdbms/puods/puhtz.pw/trace/puhtz.pw_ora_45089246.trc
'oradebug event 10046 trace name context off'
SQL> select count(*) from system.htz_test_table;
COUNT(*)
----------
63999936
Elapsed: 00:00:37.47
这里看到时间是0.37S
SQL>
> SID EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT MAX_WAIT TIME_WAITED_MICRO EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
1659 Disk file operations I/O 4 0 0 0 0 69 166678035 1740759767 8 User I/O
1659 log file sync 1 0 0 .1 0 1031 1328744198 3386400367 5 Commit
1659 db file sequential read 9 0 5 .57 1 51512 2652584166 1740759767 8 User I/O
1659 db file scattered read 4540 0 1696 .37 34 16957162 506183215 1740759767 8 User I/O
1659 db file parallel read 5 0 19 3.78 5 188920 834992820 1740759767 8 User I/O
1659 SQL*Net message to client 18 0 0 0 0 69 2067390145 2000153315 7 Network
1659 SQL*Net message from client 17 0 13426 789.77 5328 134261271 1421975091 2723168908 6 Idle
7 rows selected.
Elapsed: 00:00:03.51
多块读的平均时间是0.37,单块读是0.57
2.5.2 TRUSS文件分析
这里没有贴出详细的内容,只给了有问题的函数
plain_text
D:\temp>cat 45089246.log|grep lseek|awk "{print $3}"|awk -F : "{sum+=$1} END {print sum}"
13.9234
D:\temp>cat 63111800.log|grep lseek|awk "{print $3}"|awk -F : "{sum+=$1} END {print sum}"
531.661
D:\temp>cat 63111800.log|grep lseek|wc -l
5044
D:\temp>cat 45089246.log|grep lseek|wc -l
3332
通过TRUSS文件,可以看到整个函数中,主要是由于LSEEK函数消耗的时间差异很大,才导致整个SQL运行很慢。
2.5.3 通过10046时间分析
新表空间的10046统计信息
旧表空间的10046统计信息
3 分析总结
由于9i使用DG方式升级到11G,在9I中还没有odm,所有在还原数据文件到vxfs时,vxfs会尽量将文件分散,这就导致后来的区不连续,多块读很慢。
4,解决方案
解决方案很简单,其实可以通过sf自己的工具在线做,但是sf工程师说目前没有成功的案例,不做,所以就只能通过ORACLE数据的方法了,使用rman backup as copy将数据文件从一个目录copy到另外一个目录,通过调用odm功能向vxfs里面写数据就可以了。
------------------作者介绍-----------------------
姓名:黄廷忠
现就职:Oracle中国高级服务团队
曾就职:OceanBase、云和恩墨、东方龙马等
电话、微信、QQ:18081072613
个人博客: (http://www.htz.pw)
CSDN地址: (https://blog.csdn.net/wwwhtzpw)
博客园地址: (https://www.cnblogs.com/www-htz-pw)