ORACLE ODA X9-2: Analyzing and Clearing a False Alarm Reported as Affects: /SYS/MB

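On several ORACLE ODA X9-2 systems that we operate, a compute node raised the same alarm: Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB. Judging by the description, the alarm looks fairly serious, but the host was in fact running normally, and after analysis we concluded that it was a false alarm. ILOM, the hardware management platform of ORACLE ODA, provides an interface for clearing alarms. After the alarm was cleared with the steps below, it disappeared, and the system has continued to run normally under ongoing observation.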

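The handling steps are as follows:

1. View the alarm information:

Click to view the alarm details:

2. View the alarm information from the command-line interface (serial numbers have been masked, so do not try to match them)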

[root@aaadb1 ~]# ipmitool sunoem cli

Connected. Use ^D to exit.

-> start /SP/faultmgmt/shell

Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty


Time UUID msgid Severity


2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical

Problem Status : open

Diag Engine : fdd 1.0

System

Manufacturer : Oracle Corporation

Name : ORACLE SERVER X9-2L

Part_Number : 7603

Serial_Number : 23000

System Component

Firmware_Manufacturer : Oracle Corporation

Firmware_Version : (ILOM)5.1.0.23 r147470,(BIOS)62070300

Firmware_Release : (ILOM)2022.09.03,(BIOS)2022.08.17


Suspect 1 of 1

Problem class : fault.chassis.device.sppost

Certainty : 100%

Affects : /SYS/MB

Status : faulted

FRU

Status : faulty

Location : /SYS/MB

Manufacturer : Oracle Corporation

Name : ASM,MTHRBD,2U

Part_Number : 820000

Revision : 12

Serial_Number : 4650000

Chassis

Manufacturer : Oracle Corporation

Name : ORACLE SERVER X9-2L

Part_Number : 76000000

Serial_Number : 2300000

Description : The Service Processor power-on self test has detected a

problem.

Response : The service-required LED may be illuminated on the affected

FRU and chassis.

Impact : The Service Processor may not be able to perform necessary

functions to power on, monitor, or manage the system.

Action : Please refer to the associated reference document at

http://support.oracle.com/msg/ILOM-8000-4T for the latest

service procedures and policies regarding this diagnosis.
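When several nodes report the same alarm, it can help to pull the UUID and the affected component out of the `fmadm faulty` text automatically. The following is a minimal Python sketch, not an Oracle tool: the function name `parse_faulty` and the choice of which fields to keep are assumptions made for illustration, and the line layout is assumed to match the transcript above.

```python
import re

# Summary row of `fmadm faulty`: <date/time> <UUID> <msgid> <severity>
SUMMARY_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)\s*$"
)

# Detail fields worth keeping (an illustrative selection, not exhaustive).
WANTED_KEYS = ("Problem class", "Certainty", "Affects")


def parse_faulty(text):
    """Return one dict per fault summary row found in the pasted output."""
    faults = []
    for raw in text.splitlines():
        line = raw.strip()
        match = SUMMARY_RE.match(line)
        if match:
            faults.append(match.groupdict())
            continue
        # Attach selected "Key : value" lines to the most recent fault.
        if faults and ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            if key in WANTED_KEYS:
                faults[-1].setdefault(key, value.strip())
    return faults


SAMPLE = """
Time UUID msgid Severity

2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical

Problem Status : open

Suspect 1 of 1

Problem class : fault.chassis.device.sppost

Certainty : 100%

Affects : /SYS/MB

Status : faulted
"""

if __name__ == "__main__":
    for fault in parse_faulty(SAMPLE):
        print(fault["uuid"], fault["msgid"], fault.get("Affects"))
```

The UUID extracted this way is the value that step 3 passes to the repair command.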

3. Clear the alarm

For the clearing procedure, refer to PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1); the 'fmadm repair' command is all that is needed.

  • Enter the fault management shell.

-> start /SP/faultmgmt/shell

Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y

faultmgmtsp>

  • Use 'fmadm repair' to clear the fault.

Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.

Example 3

Example 3 shows the 'fmadm repaired' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' output from Example 1 above, the command would be:

faultmgmtsp> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f

Example 4

Example 4 shows the 'fmadm repaired' command required after the FRU has been replaced. This example shows the FRU Path from Example 2 above being used. The command would be:

fmadm repair /SYS/MB
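Since the repair command accepts either a UUID or a FRU path, a small check before pasting the command into the fault management shell can catch copy-and-paste mistakes. The sketch below is a hypothetical helper (`build_repair_command` is not part of ILOM); it assumes FRU paths start with /SYS, as in the /SYS/MB and /SYS/FANBD/FM0 examples above, and it only builds the command string without executing anything.

```python
import re

# A fault UUID as printed by `fmadm faulty` (8-4-4-4-12 hex digits).
UUID_RE = re.compile(
    r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}", re.IGNORECASE
)

# Assumption: FRU paths look like /SYS/<NAME>[/<NAME>...], e.g. /SYS/MB.
FRU_PATH_RE = re.compile(r"/SYS(?:/[A-Za-z0-9_]+)+")


def build_repair_command(target):
    """Return the 'fmadm repair' line for a UUID or FRU path, else raise."""
    target = target.strip()
    if UUID_RE.fullmatch(target) or FRU_PATH_RE.fullmatch(target):
        return f"fmadm repair {target}"
    raise ValueError(f"not a fault UUID or /SYS FRU path: {target!r}")


if __name__ == "__main__":
    print(build_repair_command("cd1ebbdf-f099-61de-ca44-ef646defe034"))
    print(build_repair_command("/SYS/MB"))
```

Rejecting anything that is neither shape keeps a truncated UUID or stray characters from reaching the shell.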

The actual handling log is as follows (using the UUID of the alarm event):

faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034

faultmgmtsp> fmadm faulty

No faults found

faultmgmtsp> exit

-> exit

Disconnected
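The check in the log above (run `fmadm faulty` again and look for "No faults found") can also be done on a saved transcript. The following Python sketch is illustrative only: `fault_cleared` is a hypothetical name, and the logic assumes the message text and layout shown in this log.

```python
def fault_cleared(uuid, faulty_output):
    """Return True if the `fmadm faulty` output no longer lists the UUID."""
    lines = [ln.strip() for ln in faulty_output.splitlines() if ln.strip()]
    if not lines:
        # Empty text may mean the command never ran, so do not guess.
        raise ValueError("empty output; cannot tell whether the fault was cleared")
    if "No faults found" in lines:
        return True
    # Other faults may remain; only report on the one that was repaired.
    return not any(uuid.lower() in ln.lower() for ln in lines)


if __name__ == "__main__":
    session = """
    faultmgmtsp> fmadm faulty

    No faults found

    faultmgmtsp> exit
    """
    print(fault_cleared("cd1ebbdf-f099-61de-ca44-ef646defe034", session))
```

Raising on empty input is a deliberate choice here, so that a failed connection is not mistaken for a cleared alarm.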
