Analyzing and Clearing a False Alarm on Oracle ODA X9-2 (Affects: /SYS/MB)

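On several of the Oracle ODA X9-2 systems we operate, a compute node raised the same alert: "Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB". Judging by the description the alert looks serious, but the hosts were in fact running normally, and after analysis we concluded that it was a false alarm. ILOM, the hardware management platform on Oracle ODA, provides an interface for clearing faults. After clearing the fault with the steps below, the alert disappeared, and the system has run normally under continued observation.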

The steps are as follows:

1. View the alert information:

Click the alert to view its details:

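2. View the alert from the command-line interface (serial numbers have been masked, so do not try to match them against real hardware)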

[root@aaadb1 ~]# ipmitool sunoem cli

Connected. Use ^D to exit.

-> start /SP/faultmgmt/shell

Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty


Time UUID msgid Severity


2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical

Problem Status : open

Diag Engine : fdd 1.0

System

Manufacturer : Oracle Corporation

Name : ORACLE SERVER X9-2L

Part_Number : 7603

Serial_Number : 23000

System Component

Firmware_Manufacturer : Oracle Corporation

Firmware_Version : (ILOM)5.1.0.23 r147470,(BIOS)62070300

Firmware_Release : (ILOM)2022.09.03,(BIOS)2022.08.17


Suspect 1 of 1

Problem class : fault.chassis.device.sppost

Certainty : 100%

Affects : /SYS/MB

Status : faulted

FRU

Status : faulty

Location : /SYS/MB

Manufacturer : Oracle Corporation

Name : ASM,MTHRBD,2U

Part_Number : 820000

Revision : 12

Serial_Number : 4650000

Chassis

Manufacturer : Oracle Corporation

Name : ORACLE SERVER X9-2L

Part_Number : 76000000

Serial_Number : 2300000

Description : The Service Processor power-on self test has detected a

problem.

Response : The service-required LED may be illuminated on the affected

FRU and chassis.

Impact : The Service Processor may not be able to perform necessary

functions to power on, monitor, or manage the system.

Action : Please refer to the associated reference document at

http://support.oracle.com/msg/ILOM-8000-4T for the latest

service procedures and policies regarding this diagnosis.
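When the same alert appears on several machines, it can help to pull the key fields out of captured `fmadm faulty` output so they can be compared. The following is a minimal Python sketch, assuming the output has been saved as text in the layout shown above. The function name `parse_fmadm_faulty` and the choice of fields are illustrative assumptions and are not part of any Oracle tool.

```python
import re

# Summary line layout assumed from the output above:
# <date>/<time> <UUID> <msgid> <Severity>
SUMMARY_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)\s*$"
)

# Fields of interest inside each fault record (an illustrative subset).
FIELD_RE = re.compile(r"^\s*(Problem class|Certainty|Affects)\s*:\s*(.+?)\s*$")


def parse_fmadm_faulty(text):
    """Return one dict per fault found in captured `fmadm faulty` output."""
    faults = []
    for line in text.splitlines():
        summary = SUMMARY_RE.match(line.strip())
        if summary:
            # A summary line starts a new fault record.
            faults.append(summary.groupdict())
            continue
        field = FIELD_RE.match(line)
        if field and faults:
            key = field.group(1).lower().replace(" ", "_")
            # Keep the first value seen for each field of the current fault.
            faults[-1].setdefault(key, field.group(2))
    return faults


sample = """\
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical
Problem class : fault.chassis.device.sppost
Certainty : 100%
Affects : /SYS/MB
"""
for fault in parse_fmadm_faulty(sample):
    print(fault["uuid"], fault["msgid"], fault["affects"])
```

The UUID extracted this way is the value later passed to `fmadm repair`.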

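3. Clear the alert

The procedure follows the PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1); the 'fmadm repair' command is all that is needed. The relevant part of that article reads: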

  • Enter the fault management shell.

-> start /SP/faultmgmt/shell

Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y

faultmgmtsp>

  • Use 'fmadm repair' to clear the fault.

Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.

Example 3
Example 3 shows the 'fmadm repair' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' output in Example 1 above, the command would be:

faultmgmtsp> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f

Example 4

Example 4 shows the 'fmadm repair' command required after the FRU has been replaced. This example uses the FRU path from Example 2 above. The command would be:

fmadm repair /SYS/MB
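As the quoted article notes, `fmadm repair` accepts either the fault UUID or the FRU path. Below is a small Python sketch that validates the argument before it is typed into the fault management shell. The helper name `repair_command` and the FRU path pattern are assumptions made for illustration, not an Oracle interface.

```python
import re

# Lower-case hexadecimal UUID, as printed by `fmadm faulty`.
UUID_RE = re.compile(r"^[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}$")
# Assumed shape of an FRU path such as /SYS/MB or /SYS/FANBD/FM0.
FRU_PATH_RE = re.compile(r"^/SYS(?:/[A-Za-z0-9_]+)+$")


def repair_command(target):
    """Build the `fmadm repair` line for a fault UUID or an FRU path."""
    target = target.strip()
    if UUID_RE.match(target.lower()) or FRU_PATH_RE.match(target):
        return f"fmadm repair {target}"
    raise ValueError(f"not a fault UUID or FRU path: {target!r}")


print(repair_command("cd1ebbdf-f099-61de-ca44-ef646defe034"))
print(repair_command("/SYS/MB"))
```

Rejecting anything that matches neither form guards against pasting a truncated UUID into the shell.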

The actual session log is as follows (using the UUID of the alert event):

faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034

faultmgmtsp> fmadm faulty

No faults found

faultmgmtsp> exit

-> exit

Disconnected
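To confirm the result on several nodes, the saved session text can be checked for the `No faults found` line shown in the log above. This is a minimal Python sketch, assuming the session was captured to a string; `faults_cleared` is a hypothetical helper name.

```python
def faults_cleared(session_text):
    """Return True if the last `fmadm faulty` in the session found no faults."""
    # Drop blank lines so the command and its output are adjacent.
    lines = [ln.strip() for ln in session_text.splitlines() if ln.strip()]
    # Walk backwards to the most recent `fmadm faulty` invocation.
    for index in range(len(lines) - 1, -1, -1):
        if lines[index].endswith("fmadm faulty"):
            has_next = index + 1 < len(lines)
            return has_next and lines[index + 1] == "No faults found"
    # No `fmadm faulty` call in the capture: nothing was verified.
    return False


session = """\
faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034

faultmgmtsp> fmadm faulty

No faults found

faultmgmtsp> exit
"""
print(faults_cleared(session))
```

Returning False when no `fmadm faulty` call is present keeps an incomplete capture from being mistaken for a clean system.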
