Large Increase in Redo Generation

To help with diagnosing excessive redo generation.

First, some background on redo generation:

What is the Online Redo Log?

The most crucial structure for recovery operations is the online redo log,

which consists of two or more pre-allocated files that store all changes made

to the database as they occur. Every instance of an Oracle database has an

associated online redo log to protect the database in case of an instance

failure.

What are the contents of the Online Redo Log?

Online redo log files are filled with redo records. A redo record, also called

a redo entry, is made up of a group of change vectors, each of which is a

description of a change made to a single block in the database. For example,

if you change a salary value in an employee table, you generate a redo record

containing change vectors that describe changes to the data segment block for

the table, the rollback segment data block, and the transaction table of the

rollback segments.

Redo entries record data that you can use to reconstruct all changes made to

the database, including the rollback segments. Therefore, the online redo log

also protects rollback data. When you recover the database using redo data,

Oracle reads the change vectors in the redo records and applies the changes

to the relevant blocks.

TROUBLESHOOTING STEPS

Scope & Application

You experience a large amount of redo being generated while performing DML,

even when "NOLOGGING" is used.

For assistance with determining what session is causing excessive redo, please see
Note:167492.1 titled "How to Find Sessions Generating Lots of Redo".

How to diagnose excessive redo generation.

Check the following:

  1. Confirm whether the table or index has "NOLOGGING" set.

Issue one of the following statements:

select table_name, logging from all_tables where table_name = '<TABLE_NAME>';

-or-

select index_name, logging from all_indexes where index_name = '<INDEX_NAME>';
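If the object turns out to be in LOGGING mode and NOLOGGING is acceptable for it, the attribute can be changed at the object level. A sketch (the object names are placeholders):

```
-- Switch an existing table or index to NOLOGGING (placeholder object names)
ALTER TABLE scott.redo1 NOLOGGING;
ALTER INDEX scott.redo1_idx NOLOGGING;

-- Verify the change
SELECT table_name, logging FROM all_tables WHERE table_name = 'REDO1';
```

Remember that NOLOGGING only suppresses redo for the operations listed below; conventional DML still generates full redo.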

  2. Confirm the table has no triggers that might cause indirect DML on other tables.

  3. Confirm auditing is not contributing to the excessive redo generation.

  4. Confirm the tablespace is not in hot backup mode.

  5. Note that only the following operations can make use of NOLOGGING mode:

  • direct load (SQL*Loader)

  • direct-load INSERT

  • CREATE TABLE ... AS SELECT

  • CREATE INDEX

  • ALTER TABLE ... MOVE PARTITION

  • ALTER TABLE ... SPLIT PARTITION

  • ALTER INDEX ... SPLIT PARTITION

  • ALTER INDEX ... REBUILD

  • ALTER INDEX ... REBUILD PARTITION

  • INSERT, UPDATE, and DELETE on LOBs in NOCACHE NOLOGGING mode stored out of line

Consider the following illustration.

Both tables below have "nologging" set at table level.

SQL> desc redo1
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------
 X                                                  NUMBER
 Y                                                  NUMBER

SQL> desc redotesttab
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------
 X                                                  NUMBER
 Y                                                  NUMBER

begin
  for x in 1..10000 loop
    insert into scott.redotesttab values (x, x+1);
    -- or: insert /*+ APPEND */ into scott.redotesttab values (x, x+1);
  end loop;
  commit;
end;
/

Note: This will generate redo even if you provide the APPEND hint, because a

conventional INSERT ... VALUES statement is not a direct-load insert.

Now, consider the following bulk inserts, direct and simple.

SQL> select name,value from v$sysstat where name like '%redo size%';

NAME                    VALUE
------------------- ---------
redo size            27556720

SQL> insert into scott.redo1 select * from scott.redotesttab;

50000 rows created.

SQL> select name,value from v$sysstat where name like '%redo size%';

NAME                    VALUE
------------------- ---------
redo size            28536820

SQL> insert /*+ APPEND */ into scott.redo1 select * from scott.redotesttab;

50000 rows created.

SQL> select name,value from v$sysstat where name like '%redo size%';

NAME                    VALUE
------------------- ---------
redo size            28539944

You will notice that the conventional insert generated 980,100 bytes of redo

(28536820 - 27556720), while the direct-path insert generated only 3,124 bytes (28539944 - 28536820).
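To reproduce such a comparison for your own statements, measure redo before and after each statement at the session level. A sketch using v$mystat (more precise than v$sysstat because it is not affected by other sessions):

```
-- Redo generated by the current session so far
SELECT sn.name, ms.value
FROM   v$mystat ms, v$statname sn
WHERE  ms.statistic# = sn.statistic#
AND    sn.name = 'redo size';

-- Run the insert under test, then re-run the query above;
-- the difference between the two values is the redo generated
-- by that statement in this session.
```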


The purpose of this document is to show how to identify the causes of excessive redo generation and what can be done to mitigate the problem.

TROUBLESHOOTING STEPS

First of all, note that high redo generation is always a consequence of activity in the database and is expected behavior; Oracle is optimized for redo generation, and excessive redo is rarely caused by a bug.

The main cause of high redo generation is usually heavy DML activity during a certain period of time. It is good practice to first examine recent changes at the database level (parameters, maintenance operations, ...) and at the application level (deployment of a new application, code changes, an increase in the number of users, ...).

What we need to examine:

  1. Is supplemental logging enabled? The amount of redo generated when supplemental logging is enabled is considerably higher than when it is disabled. See:

What Causes High Redo When Supplemental Logging is Enabled (Doc ID 1349037.1)
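Whether supplemental logging is enabled at the database level can be checked in v$database. A sketch:

```
SELECT supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui,
       supplemental_log_data_all
FROM   v$database;
```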

  2. Are a lot of indexes being used? Reducing the number of indexes, or using the NOLOGGING attribute where applicable, will reduce redo considerably.

  3. Do all the operations really need LOGGING? On the application side, redo can be reduced by using the NOLOGGING clause. Note that only the following operations can make use of NOLOGGING mode:

  • direct load (SQL*Loader)

  • direct-load INSERT

  • CREATE TABLE ... AS SELECT

  • CREATE INDEX

  • ALTER TABLE ... MOVE PARTITION

  • ALTER TABLE ... SPLIT PARTITION

  • ALTER INDEX ... SPLIT PARTITION

  • ALTER INDEX ... REBUILD

  • ALTER INDEX ... REBUILD PARTITION

  • INSERT, UPDATE, and DELETE on LOBs in NOCACHE NOLOGGING mode stored out of line

To confirm whether the table or index has "NOLOGGING" set, issue one of the following statements:

select table_name, logging from all_tables where table_name = '<TABLE_NAME>';

-or-

select index_name, logging from all_indexes where index_name = '<INDEX_NAME>';

  4. Do tables have triggers that might cause indirect DML on other tables?
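Triggers on a table can be listed from the data dictionary. A sketch (the owner and table name are placeholders):

```
SELECT trigger_name, triggering_event, status
FROM   dba_triggers
WHERE  table_owner = 'SCOTT'
AND    table_name  = 'REDO1';
```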

  5. Is auditing enabled and contributing to the excessive redo generation?
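The current traditional auditing settings can be inspected via the initialization parameters. A sketch:

```
-- Traditional auditing destination (NONE, DB, OS, ...)
SELECT name, value
FROM   v$parameter
WHERE  name IN ('audit_trail', 'audit_sys_operations');
```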

  6. Are any tablespaces in hot backup mode?
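Datafiles left in hot backup mode show up as ACTIVE in v$backup. A sketch:

```
-- Datafiles currently in hot backup mode (extra redo is generated for them)
SELECT d.file#, d.name, b.status, b.time
FROM   v$backup b, v$datafile d
WHERE  b.file# = d.file#
AND    b.status = 'ACTIVE';
```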

  7. Examine the log switches:

select lg.group#, lg.bytes/1024/1024 mb, lg.status, lg.archived, lf.member
from v$logfile lf, v$log lg
where lg.group# = lf.group#
order by 1, 2;

select to_char(first_time,'YYYY-MON-DD') "Date", to_char(first_time,'DY') day,

to_char(sum(decode(to_char(first_time,'HH24'),'00',1,0)),'999') "00",

to_char(sum(decode(to_char(first_time,'HH24'),'01',1,0)),'999') "01",

to_char(sum(decode(to_char(first_time,'HH24'),'02',1,0)),'999') "02",

to_char(sum(decode(to_char(first_time,'HH24'),'03',1,0)),'999') "03",

to_char(sum(decode(to_char(first_time,'HH24'),'04',1,0)),'999') "04",

to_char(sum(decode(to_char(first_time,'HH24'),'05',1,0)),'999') "05",

to_char(sum(decode(to_char(first_time,'HH24'),'06',1,0)),'999') "06",

to_char(sum(decode(to_char(first_time,'HH24'),'07',1,0)),'999') "07",

to_char(sum(decode(to_char(first_time,'HH24'),'08',1,0)),'999') "08",

to_char(sum(decode(to_char(first_time,'HH24'),'09',1,0)),'999') "09",

to_char(sum(decode(to_char(first_time,'HH24'),'10',1,0)),'999') "10",

to_char(sum(decode(to_char(first_time,'HH24'),'11',1,0)),'999') "11",

to_char(sum(decode(to_char(first_time,'HH24'),'12',1,0)),'999') "12",

to_char(sum(decode(to_char(first_time,'HH24'),'13',1,0)),'999') "13",

to_char(sum(decode(to_char(first_time,'HH24'),'14',1,0)),'999') "14",

to_char(sum(decode(to_char(first_time,'HH24'),'15',1,0)),'999') "15",

to_char(sum(decode(to_char(first_time,'HH24'),'16',1,0)),'999') "16",

to_char(sum(decode(to_char(first_time,'HH24'),'17',1,0)),'999') "17",

to_char(sum(decode(to_char(first_time,'HH24'),'18',1,0)),'999') "18",

to_char(sum(decode(to_char(first_time,'HH24'),'19',1,0)),'999') "19",

to_char(sum(decode(to_char(first_time,'HH24'),'20',1,0)),'999') "20",

to_char(sum(decode(to_char(first_time,'HH24'),'21',1,0)),'999') "21",

to_char(sum(decode(to_char(first_time,'HH24'),'22',1,0)),'999') "22",

to_char(sum(decode(to_char(first_time,'HH24'),'23',1,0)),'999') "23" ,

count(*) Total from v$log_history group by to_char(first_time,'YYYY-MON-DD'), to_char(first_time,'DY')

order by to_date(to_char(first_time,'YYYY-MON-DD'),'YYYY-MON-DD')

This will give us an idea of the times when the high peaks of redo are happening
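A complementary view of the same peaks is the redo volume per AWR snapshot. A sketch over dba_hist_sysstat (the 'redo size' statistic is cumulative, so consecutive snapshots must be subtracted; deltas across an instance restart are not meaningful):

```
SELECT dhs.snap_id,
       to_char(dhs.begin_interval_time, 'YYYY-MON-DD HH24:MI') snap_time,
       ss.value - lag(ss.value) OVER (ORDER BY dhs.snap_id) redo_bytes
FROM   dba_hist_sysstat ss, dba_hist_snapshot dhs
WHERE  ss.snap_id = dhs.snap_id
AND    ss.instance_number = dhs.instance_number
AND    ss.stat_name = 'redo size'
ORDER  BY dhs.snap_id;
```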

  8. Examine the AWR report:

The next step is to examine the AWR report for the hour with the highest number of log switches, and to confirm from the redo size statistic that those log switches were actually caused by heavy redo generation.

In the AWR report we can also see the SQL statements with the most gets/executions, to get an idea of the activity that is generating redo, as well as the segments with the largest number of block changes and the sessions performing those changes.

Another way to find these sessions is described in SQL: How to Find Sessions Generating Lots of Redo or Archive logs (Doc ID 167492.1)

To find these segments we can also use queries:

SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time,

dhso.object_name,

sum(db_block_changes_delta) BLOCK_CHANGED

FROM dba_hist_seg_stat dhss,

dba_hist_seg_stat_obj dhso,

dba_hist_snapshot dhs

WHERE dhs.snap_id = dhss.snap_id

AND dhs.instance_number = dhss.instance_number

AND dhss.obj# = dhso.obj#

AND dhss.dataobj# = dhso.dataobj#

AND begin_interval_time BETWEEN to_date('11-01-28 13:00','YY-MM-DD HH24:MI') -- modify to the hour with the most log switches, per the query above

AND to_date('11-01-28 14:00','YY-MM-DD HH24:MI') -- the interval should be only 1 hour

GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'),

dhso.object_name

HAVING sum(db_block_changes_delta) > 0

ORDER BY sum(db_block_changes_delta) desc ;

-- Then : What SQL was causing redo log generation :

SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24') WHEN,

dbms_lob.substr(sql_text,4000,1) SQL,

dhss.instance_number INST_ID,

dhss.sql_id,

executions_delta exec_delta,

rows_processed_delta rows_proc_delta

FROM dba_hist_sqlstat dhss,

dba_hist_snapshot dhs,

dba_hist_sqltext dhst

WHERE upper(dhst.sql_text) LIKE '%<segment_name>%' -- replace <segment_name> with a segment name from the previous query's result

AND ltrim(upper(dhst.sql_text)) NOT LIKE 'SELECT%'

AND dhss.snap_id=dhs.snap_id

AND dhss.instance_number=dhs.instance_number

AND dhss.sql_id=dhst.sql_id

AND begin_interval_time BETWEEN to_date('11-01-28 13:00','YY-MM-DD HH24:MI') -- update the time frame as required

AND to_date('11-01-28 14:00','YY-MM-DD HH24:MI') -- update the time frame as required

  9. Finally, to troubleshoot further and find the exact commands recorded during that time frame, we can use LogMiner to mine the archived logs from the period of interest. Query v$archived_log to find the archived logs generated during that time frame.

How To Determine The Cause Of Lots Of Redo Generation Using LogMiner (Doc ID 300395.1)
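A minimal LogMiner session for that time frame might look like the following sketch (the archived log path is a placeholder; this assumes the online catalog is usable as the LogMiner dictionary):

```
-- Register the archived log(s) covering the peak period (path is a placeholder)
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(LogFileName => '/arch/arch_1_1234.arc', Options => DBMS_LOGMNR.NEW);

-- Start LogMiner using the online catalog as the dictionary
EXECUTE DBMS_LOGMNR.START_LOGMNR(Options => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

-- Summarize the mined redo by owner and operation
SELECT seg_owner, operation, count(*)
FROM   v$logmnr_contents
GROUP  BY seg_owner, operation;

-- End the LogMiner session
EXECUTE DBMS_LOGMNR.END_LOGMNR;
```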


SYMPTOMS

NOTE: In the images and/or the document content below, the user information and data used represents fictitious data from the Oracle sample schema(s) or Public Documentation delivered with an Oracle database product. Any similarity to actual persons, living or dead, is purely coincidental and not intended in any manner.

  • After migrating from 11.2 to 12c, a large redo size is generated when SET AUTOTRACE ON is used for a query

12c:

Statistics
----------------------------------------------------------
15458 recursive calls
2 db block gets
486798 consistent gets
792525 physical reads
26519728 redo size *<<<<<<<<<<----------
912915 bytes sent via SQL*Net to client
1254 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
37 sorts (memory)
1 sorts (disk)
1000 rows processed

11.2:

Statistics
----------------------------------------------------------
2719 recursive calls
2 db block gets
410794 consistent gets
758747 physical reads
0 redo size *<<<<<<<<<<----------
899477 bytes sent via SQL*Net to client
1226 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1000 rows processed

  • The large redo size is still generated even though unified auditing was disabled
  • The log miner results show some INTERNAL operations besides the INSERT and DELETE of PLAN_TABLE$

For example:

SCN TIMESTAMP SEG_NAME SEG_OWNER TABLE_NAME TABLE_SPACE OPERATION SQL_REDO
1.1115E+13 8/2/2018 10:05 PLAN_TABLE$ SYS PLAN_TABLE$ SYSTEM INSERT /* No SQL_REDO for temporary tables */
1.1115E+13 8/2/2018 10:05 PLAN_TABLE$ SYS PLAN_TABLE$ SYSTEM DELETE /* No SQL_REDO for temporary tables */
1.1115E+13 8/2/2018 10:05                                            INTERNAL

  • This problem can be reproduced with a simple query, as follows; the statistics in v$mystat show that redo size for lost write detection is generated:

SQL> conn <USERNAME>/<Password> as sysdba
Connected.
SQL> SELECT name, value
FROM v$mystat, v$statname
WHERE v$mystat.statistic#=v$statname.statistic#
and v$statname.name like '%redo size%'
ORDER BY 1;
2 3 4 5
NAME VALUE
---------------------------------------------------------------- ----------
redo size 0
redo size for direct writes 0
redo size for lost write detection 0

SQL> select count(*) from t1;

COUNT(*)
----------
316777

SQL> SELECT name, value
FROM v$mystat, v$statname
WHERE v$mystat.statistic#=v$statname.statistic#
and v$statname.name like '%redo size%'
ORDER BY 1;
2 3 4 5
NAME VALUE
---------------------------------------------------------------- ----------
redo size 502756
redo size for direct writes 0
redo size for lost write detection 502756

CHANGES

In the 12c environment, the initialization parameter DB_LOST_WRITE_PROTECT was changed from its default value of NONE to a non-default value.

CAUSE

From 11.1 onwards, if DB_LOST_WRITE_PROTECT is set to the non-default value TYPICAL or FULL, lost write detection is enabled. Buffer cache reads are then logged, which leads to redo generation for SELECT statements. This is necessary for lost write detection and is expected behavior. For further information on the lost write detection functionality, please refer to the following note and the online documentation.

Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration Document 1302539.1

Database Reference

DB_LOST_WRITE_PROTECT


SOLUTION

No action needs to be taken if lost write detection is needed.

If the environment is not in a Data Guard Configuration or lost write detection is not needed, please set DB_LOST_WRITE_PROTECT to default value of NONE.

SQL> conn <USERNAME>/<PASSWORD> as sysdba

Connected.

SQL> alter system set DB_LOST_WRITE_PROTECT=none;

System altered.

SQL> conn <USERNAME>/<PASSWORD> as sysdba

Connected.

SQL> SELECT name, value

FROM v$mystat, v$statname

WHERE v$mystat.statistic#=v$statname.statistic#

and v$statname.name like '%redo size%'

ORDER BY 1;

2 3 4 5

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo size                                                                 0
redo size for direct writes                                               0
redo size for lost write detection                                        0

SQL> select count(*) from t1;

  COUNT(*)
----------
    316777

SQL> SELECT name, value

FROM v$mystat, v$statname

WHERE v$mystat.statistic#=v$statname.statistic#

and v$statname.name like '%redo size%'

ORDER BY 1;

2 3 4 5

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo size                                                                 0
redo size for direct writes                                               0
redo size for lost write detection                                        0


--- How to determine the cause of lots of redo generation using LogMiner ---

Using OPERATION Codes to Understand Redo Information

Multiple operation codes can generate redo information. Using the following guidelines, you can identify which operation codes are causing the high redo generation and take appropriate action to reduce it.

NOTE:

Redo records are not all equally sized, so the fact that certain statements show up frequently in the LogMiner output does not guarantee that you have found the area of functionality generating the excessive redo.

What are these OPERATION codes ?

  • INSERT / UPDATE / DELETE -- operations performed on SYS objects are also considered internal operations.
  • COMMIT -- also an "internal" operation; the SQL_REDO column shows the line "commit;".
  • START -- also an "internal" operation; SQL_REDO shows "set transaction read write;".
  • INTERNAL -- dictionary updates.
  • SELECT_FOR_UPDATE -- also an internal operation; Oracle generates redo information for SELECT statements that have a FOR UPDATE clause.

In general, INTERNAL operations are not relevant, so to query the relevant data, add a seg_owner predicate (for example, seg_owner='SCOTT') to the WHERE clause.

Examples:

How to extract relevant information from the view v$logmnr_contents?

  1. This SQL lists operations performed by user SCOTT

SQL> select distinct operation,username,seg_owner from v$logmnr_contents where seg_owner='SCOTT';

OPERATION  USERNAME  SEG_OWNER
---------- --------- ---------
DDL        SCOTT     SCOTT
DELETE     SCOTT     SCOTT
INSERT     SCOTT     SCOTT
UPDATE     SCOTT     SCOTT

  2. This SQL lists the undo and redo associated with operations that user SCOTT performed

SQL> select seg_owner,operation,sql_redo,sql_undo from v$logmnr_contents where SEG_owner='SCOTT';

SCOTT  DDL     create table LM1 (c1 number, c2 varchar2(10));

SCOTT  INSERT  insert into "SCOTT"."LM1"("C1","C2") values ('101','AAAA');
               delete from "SCOTT"."LM1" where "C1" = '101' and "C2" = 'AAAA'
               and ROWID = 'AAAHfBAABAAAMUqAAA';

SCOTT  UPDATE  update "SCOTT"."LM1" set "C2" = 'YYY'
               where "C2" = 'EEE' and ROWID = 'AAAHfBAABAAAMUqAAE';
               update "SCOTT"."LM1" set "C2" = 'EEE' where "C2" = 'YYY'
               and ROWID = 'AAAHfBAABAAAMUqAAE';


  3. This SQL lists the undo and redo generated for UPDATE statements issued by user SCOTT

SQL> select username, seg_owner,operation,sql_redo,sql_undo from v$logmnr_contents where operation ='UPDATE' and USERNAME='SCOTT';

UNAME  SEG_OW  OPERATION  SQL_REDO                                  SQL_UNDO
------ ------- ---------- ----------------------------------------- --------------
SCOTT  SYS     UPDATE     update "SYS"."OBJ$" set "OBJ#" = '1'..... update ....
SCOTT  SYS     UPDATE     update "SYS"."TSQ$" set "GRANTO.....      update .......
SCOTT  SYS     UPDATE     update "SYS"."SEG$" set "TYPE#" = '5'..   update......

As the result above shows, user SCOTT updated SYS objects, so if you query on USERNAME you may get misleading results. It is better to query v$logmnr_contents on SEG_OWNER.

  4. Identifying Operation Counts

Run the following query to see the OPERATION code row count from v$logmnr_contents, to understand which OPERATION code has generated lots of redo information.

SQL> select operation,count(*) from v$logmnr_contents group by operation;

OPERATION           COUNT(*)
------------------ ---------
COMMIT                 22236
DDL                        2
DELETE                     1
INSERT                    11
INTERNAL                  11
SELECT_FOR_UPDATE      32487
START                  22236
UPDATE                   480

8 rows selected.

  5. Identifying User Counts

Run the following query to check user activity and operation counts:

SQL> select seg_owner,operation,count(*) from v$logmnr_contents group by seg_owner,operation;

SEG_OWNER  OPERATION   COUNT(*)
---------- ---------- ---------
SCOTT      COMMIT         22236
SCOTT      DDL                2
SCOTT      DELETE             1
...
BILLY      COMMIT         12899
BILLY      DDL                5
BILLY      DELETE             2
...

NOTE:

Be aware of next known issue:

If you are not using "select for update" statements often in your application and yet find a high operation count for operation code "SELECT_FOR_UPDATE" then you might be hitting a known issue.

To confirm this check whether SQL_REDO shows select,update statements on AQ_QUEUE_TABLE_AFFINITIES and AQ_QUEUE_TABLES.

If you see these selects and updates, then check the value of the Init.ora parameter AQ_TM_PROCESSES. The default value is AQ_TM_PROCESSES = 0 meaning that the queue monitor is not created.

If you are not using Advanced Queuing, then set AQ_TM_PROCESSES back to zero to avoid lots of redo generation on objects AQ_QUEUE_TABLE_AFFINITIES and AQ_QUEUE_TABLES.
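The current value can be checked and reset as in this sketch:

```
-- Check the current setting of the queue monitor parameter
SELECT name, value FROM v$parameter WHERE name = 'aq_tm_processes';

-- If Advanced Queuing is not used, return it to the default
ALTER SYSTEM SET aq_tm_processes = 0;
```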


How to find sessions generating lots of redo

  • fact: Oracle Server - Enterprise Edition 8
  • fact: Oracle Server - Enterprise Edition 9
  • fact: Oracle Server - Enterprise Edition 10

SOLUTION

To find sessions generating lots of redo, you can use either of the following methods. Both methods examine the amount of undo generated. When a transaction

generates undo, it will automatically generate redo as well.

The methods are:

  1. Query V$SESS_IO. This view contains the column BLOCK_CHANGES, which indicates how many blocks have been changed by the session. High values indicate a

session generating lots of redo.

The query you can use is:

SQL> SELECT s.sid, s.serial#, s.username, s.program,
  2         i.block_changes
  3  FROM   v$session s, v$sess_io i
  4  WHERE  s.sid = i.sid
  5  ORDER BY 5 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence of BLOCK_CHANGES. Large deltas indicate high redo generation by the session.

  2. Query V$TRANSACTION. This view contains information about the amount of undo blocks and undo records accessed by the transaction (in the

USED_UBLK and USED_UREC columns).

The query you can use is:

SQL> SELECT s.sid, s.serial#, s.username, s.program,
  2         t.used_ublk, t.used_urec
  3  FROM   v$session s, v$transaction t
  4  WHERE  s.taddr = t.addr
  5  ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence of USED_UBLK and USED_UREC. Large deltas indicate high redo generation by

the session.

Use the first query to check for programs generating lots of redo when those programs run more than one transaction. The second query can be used to find

out which particular transactions are generating redo.
