Oracle Redo日志损坏挽救详细攻略

一 介绍

1.1 介绍

Oracle Redo损坏分四种情况:unused状态日志损坏 inactive状态日志损坏 active状态日志损坏 current状态日志损坏。针对不同状态的日志损坏,处理方式有所不同,下面将逐一介绍。

二 恢复

2.1 unused与inactive状态日志损坏

如果这个日志是inactive,手动执行clearing操作:

复制代码
SQL> alter database clear logfile group 2;
alter database clear logfile group 2
*
第 1 行出现错误:
ORA-00350: 日志 2 (实例 orcl 的日志, 线程 1) 需要归档
ORA-00312: 联机日志 2 线程 1:
F:ORACLEPRODUCT10.2.0ORADATAORCLREDO02.LOG

执行如下操作:

复制代码
SQL> alter database clear unarchived logfile group 2;

数据库已更改。

2.2 active状态日志损坏

存在归档直接使用归档恢复即可.

复制代码
SYS@orcl11g>recover database until cancel; --指定恢复的时间点(如果不知道,就是untill cancel)
ORA-00279: change 1763218 generated at 06/24/2021 12:02:00 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/arch/1_74_816622368.dbf
ORA-00280: change 1763218 for thread 1 is in sequence #74
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/arch/1_74_816622368.dbf
ORA-00279: change 1769094 generated at 06/24/2021 13:34:43 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/arch/1_75_816622368.dbf
ORA-00280: change 1769094 for thread 1 is in sequence #75
ORA-00278: log file '/u01/app/oracle/arch/1_74_816622368.dbf' no longer needed for this recovery
 
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/oradata/orcl11g/redo01.log --指定current日志
Log applied.
Media recovery complete.

2.3 Current状态日志损坏

常规情况:

设置隐藏参数:

复制代码
alter system set "_allow_resetlogs_corruption"=true scope=spfile;

SYS@orcl11g> recover database until cancel;
ORA-00279: change 1789650 generated at 06/24/2021 13:40:21 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/arch/1_2_818948248.dbf
ORA-00280: change 1789650 for thread 1 is in sequence #2
 
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/arch/1_2_818948248.dbf
ORA-00279: change 1789904 generated at 06/24/2021 13:41:02 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/arch/1_3_818948248.dbf
ORA-00280: change 1789904 for thread 1 is in sequence #3
ORA-00278: log file '/u01/app/oracle/arch/1_2_818948248.dbf' no longer needed
for this recovery
 
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/orcl11g/system01.dbf'
 
SYS@orcl11g> alter database open resetlogs;
Database altered.

如若出现与SCN相关 ORA-00600错误使用以下推进SCN方式进行处理

2.3.1 Poke推进scn修复

1.查看当前数据库的Current SCN

复制代码
SYS@orcl> select current_scn||'' from v$database;
CURRENT_SCN||''
--------------------------------------------------------------------------------
4563483988

可以看到当前SCN是4563483988,我现在想推进SCN,在10w级别,也就是4563483988标红数字修改为指定值。

2.重新启动数据库到mount阶段

复制代码
SYS@orcl> shutdown abort
ORACLE instance shut down.
SYS@orcl> startup mount
ORACLE instance started.
Total System Global Area 1235959808 bytes
Fixed Size                  2252784 bytes
Variable Size             788529168 bytes
Database Buffers          436207616 bytes
Redo Buffers                8970240 bytes
Database mounted.

3.使用oradebug poke推进SCN

我这里直接把十万位的"4"改为"9"了,相当于推进了50w左右: 说明:实验发现oradebug poke 推进的SCN值,既可以指定十六进制的0x11008DE74,也可以直接指定十进制的4563983988。

复制代码
SYS@orcl> oradebug setmypid
Statement processed.

SYS@orcl> oradebug dumpvar sga kcsgscn_
kcslf kcsgscn_ [06001AE70, 06001AEA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000

SYS@orcl> select to_char(checkpoint_change#, 'XXXXXXXXXXXXXXXX') from v$database;
TO_CHAR(CHECKPOINT_CHANGE#,'XXXXXX
----------------------------------
        110013C41

SYS@orcl> oradebug poke 0x06001AE70 8 4563983988
BEFORE: [06001AE70, 06001AE78) = 00000000 00000000
AFTER:  [06001AE70, 06001AE78) = 1008DE74 00000001

SYS@orcl> oradebug dumpvar sga kcsgscn_
kcslf kcsgscn_ [06001AE70, 06001AEA0) = 1008DE74 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000

SYS@orcl> alter database open;
Database altered.

SYS@orcl> select current_scn||'' from v$database;
CURRENT_SCN||''
--------------------------------------------------------------------------------
4563984271

可以看到已经成功将SCN推进到4563983988,SCN不断增长,所以这里查到的值略大一些。

4.举例ORA-600[2662]错误下poke计算方式

复制代码
A data block SCN is ahead of the current SCN.
The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN  stored in a UGA variable.
If the SCN is less than the dependent SCN then we signal the ORA-600 [2662] internal error.
 
ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg [c]  dependent SCN WRAP
  Arg [d]  dependent SCN BASE 
  Arg [e]  Where present this is the DBA where the dependent SCN came from.

计算方式:

复制代码
ORA-00600: internal error code, arguments: [2662], [2], [1424107441], [2], [1424142235], [8388617], [], []
select 2*power(2,32)+1424142235 from dual;
10014076827
ORA-00600: internal error code, arguments: [2662], [2], [1424142249], [2], [1424142302], [8388649], [], []
select 2*power(2,32)+1424143000 from dual;
10014077592

总结公式:c * power(2,32) + d {+ 可适当加一点,但不要太大!}

c代表:Arg [c] dependent SCN WRAP

d代表:Arg [d] dependent SCN BASE

2.3.2 12c event 21307096推进scn修复

1.计算方式

复制代码
Lowest_scn+event  level * 1000000

查看当前数据库SCN:

复制代码
SQL> select to_char(current_scn) from v$database;

TO_CHAR(CURRENT_SCN)
----------------------------------------
12796139551520

2.添加event以及参数

复制代码
alter system set "_allow_resetlogs_corruption"=true scope=spfile;
alter system set event='21307096 trace name context forever,level 3' scope=spfile;

3.启动数据库

复制代码
SQL> shutdown immediate;
Database dismounted.
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1660944384 bytes
Fixed Size                  8793448 bytes
Variable Size             889193112 bytes
Database Buffers          754974720 bytes
Redo Buffers                7983104 bytes
Database mounted.

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 12796139551734 generated at 04/20/2022 11:13:44 needed for
thread 1
ORA-00289: suggestion :
/app/oracle/product/12.2.0/db_1/dbs/arch1_1_1102504135.dbf
ORA-00280: change 12796139551734 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> 
SQL> 
SQL> alter database open resetlogs;

Database altered.

SQL> select to_char(current_scn) from v$database;

TO_CHAR(CURRENT_SCN)
----------------------------------------
12796142552279

SCN成功推进300w

2.3.3 gdb推进scn修复

Session 1:

复制代码
查询当前scn:
SQL> select current_scn from v$database;                
CURRENT_SCN
-----------
 2910718245

查询当前SCN转成16进制后的值:
SQL> select to_char(2910718245,'xxxxxxxxxxxx') from dual;
TO_CHAR(29107
-------------
     ad7e0925

查询预修改的SCN转换成16进制后的值,本次将最高位增加一位数
SQL> select to_char(3910718245,'xxxxxxxxxxxx') from dual; 
TO_CHAR(39107
-------------
     e918d325

SQL> oradebug setmypid
Statement processed.

SQL> oradebug dumpvar sga kcsgscn_
kscn8 kcsgscn_ [060017E98, 060017EA0) = AD7E093B 00000000

需要注意的是,060017E98是SCN BASE值,AD7E093B是当前的SCN值,可以理解为060017E98是一个代号x,当前的x等于AD7E093B,待会儿我们修改SCN值的时候,就会需要指定060017E98这个值等于多少。

Session 2:

[oracle@redhat19c11 复制代码
oracle    9824  9730  0 Feb22 ?        00:00:01 oracleorcl (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   18621  8636  0 01:18 pts/1    00:00:00 grep --color=auto LOCAL=YES
oracle   20109 20105  0 Feb15 ?        00:00:13 oracletestdb19c (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))





本次测试库是orcl,因此选9824
[oracle@redhat19c11 ~]$ gdb $ORACLE_HOME/bin/oracle 9824
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
-------------------------------------------
-------------------------------------------
(gdb) set *((int *) 0x060017E98) = 0xe918d32--->将SCN BASE修改为刚才查出来的值
(gdb) quit
A debugging session is active.
        Inferior 1 [process 9824] will be detached.
Quit anyway? (y or n) y
Detaching from program: /oracle/app/product/19.3.0/db_1/bin/oracle, process 9824

返回session1查询,修改成功:

复制代码
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
 3910718287

重启数据库,也可正常打开数据库

复制代码
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area 2466250400 bytes
Fixed Size                  9137824 bytes
Variable Size             603979776 bytes
Database Buffers         1845493760 bytes
Redo Buffers                7639040 bytes
Database mounted.
Database opened.
SQL> select current_scn from v$database;

CURRENT_SCN
-----------
 3910719415

总结

Oracle Redo 日志损坏的恢复方法取决于日志的状态。对于 Unused 和 Inactive 状态的日志,通常可以直接清除;Active 状态的日志需要结合归档日志进行恢复;而 Current 状态的日志损坏最为严重,可能需要基于最新的备份进行完整恢复。合理配置日志管理策略,定期备份数据库,并妥善处理归档日志,可以有效降低因日志损坏导致的数据丢失风险。

相关推荐
wangbing11255 分钟前
MySQL 官方 GPG 密钥过期问题
数据库·mysql
PaperData9 分钟前
2000-2023年地级市数字基础设施评价指标体系
大数据·网络·数据库·人工智能·数据分析·经管
重生之我是Java开发战士10 分钟前
【MySQL】事务 & 用户与权限管理
android·数据库·mysql
琢磨先生David39 分钟前
电信行业数据库开发的一些经验
数据库·数据库开发
key_3_feng42 分钟前
数据库Skill开发教程:从零构建SQLite应用
数据库·sqlite·skill
2301_812539671 小时前
Golang怎么实现网页爬虫抓取数据_Golang如何用colly框架快速构建爬虫采集程序【教程】
jvm·数据库·python
雪碧聊技术1 小时前
组合查询(union)
数据库·sql
杨云龙UP1 小时前
ODA运维实战:Oracle 19c YJXT PDB表空间在线扩容全过程_20260503
linux·运维·服务器·数据库·oracle
BENA ceic1 小时前
Spring 的三种注入方式?
java·数据库·spring
2401_895521341 小时前
MySQL中的count函数
数据库·mysql