A False Alarm on ORACLE ODA X9-2 (Affects: /SYS/MB): Analysis and Resolution

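On several of the ORACLE ODA X9-2 systems I operate, a compute node raised the following alert: Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB. Judging by the description the alert looks serious, but the hosts were in fact running normally, so after analysis I concluded it was a false alarm. ILOM, the hardware management platform of ORACLE ODA, provides an interface for clearing faults. After clearing the fault with the steps below, the alert disappeared, and the systems have kept running normally under continued observation.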

The steps are as follows:

1. View the alert information:

Click the alert to view its details:

2. View the alert information from the command-line interface (serial numbers have been masked, so do not try to match them):

    [root@aaadb1 ~]# ipmitool sunoem cli
    Connected. Use ^D to exit.
    -> start /SP/faultmgmt/shell
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

    faultmgmtsp> fmadm faulty
    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T   Critical

    Problem Status : open
    Diag Engine    : fdd 1.0
    System
       Manufacturer  : Oracle Corporation
       Name          : ORACLE SERVER X9-2L
       Part_Number   : 7603
       Serial_Number : 23000
    System Component
       Firmware_Manufacturer : Oracle Corporation
       Firmware_Version      : (ILOM)5.1.0.23 r147470,(BIOS)62070300
       Firmware_Release      : (ILOM)2022.09.03,(BIOS)2022.08.17
    ----------------------------------------
    Suspect 1 of 1
       Problem class : fault.chassis.device.sppost
       Certainty     : 100%
       Affects       : /SYS/MB
       Status        : faulted
       FRU
          Status        : faulty
          Location      : /SYS/MB
          Manufacturer  : Oracle Corporation
          Name          : ASM,MTHRBD,2U
          Part_Number   : 820000
          Revision      : 12
          Serial_Number : 4650000
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : ORACLE SERVER X9-2L
             Part_Number   : 76000000
             Serial_Number : 2300000
    Description : The Service Processor power-on self test has detected a problem.
    Response    : The service-required LED may be illuminated on the affected FRU and chassis.
    Impact      : The Service Processor may not be able to perform necessary functions to power on, monitor, or manage the system.
    Action      : Please refer to the associated reference document at http://support.oracle.com/msg/ILOM-8000-4T for the latest service procedures and policies regarding this diagnosis.

3. Clear the alert

The clearing procedure follows PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1); the 'fmadm repair' command is all that is needed. The relevant part of that article reads:

> * Enter the fault management shell.
>
>       -> start /SP/faultmgmt/shell
>       Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y
>       faultmgmtsp>
>
> * Use 'fmadm repair' to clear the fault.
>
> Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.
>
> *Example 3*
>
> Example 3 shows the 'fmadm repaired' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' output in Example 1 above, the command would be:
>
>       faultmgmtsp> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f
>
> *Example 4*
>
> Example 4 shows the 'fmadm repaired' command required after the FRU has been replaced. This example shows the FRU Path from Example 2 above being used. The command would be:
>
>       fmadm repair /SYS/MB

The actual session log is shown below (the fault is cleared by the UUID of the alert event):

    faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034
    faultmgmtsp> fmadm faulty
    No faults found
    faultmgmtsp> exit
    -> exit
    Disconnected
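When the same alert shows up on many nodes, copying UUIDs by hand gets tedious. The following is a minimal Python sketch, not part of the original procedure, that pulls the UUID, msgid, severity and affected resource out of saved `fmadm faulty` output and builds the matching `fmadm repair` command lines. It assumes the output has already been captured as text in the layout shown in this article; it does not connect to ILOM itself, and the names `parse_faults`, `repair_commands` and `SAMPLE` are hypothetical helpers introduced only for illustration.

```python
import re

# One summary row of `fmadm faulty`: time, UUID, msgid, severity.
FAULT_ROW = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+"
    r"(?P<severity>\S+)"
)
# The "Affects : /SYS/MB" line inside the suspect section.
AFFECTS = re.compile(r"Affects\s*:\s*(\S+)")


def parse_faults(text):
    """Return one dict per fault summary row found in the captured text."""
    faults = []
    for row in FAULT_ROW.finditer(text):
        fault = row.groupdict()
        # Take the first "Affects" line that follows this summary row.
        affects = AFFECTS.search(text, row.end())
        fault["affects"] = affects.group(1) if affects else None
        faults.append(fault)
    return faults


def repair_commands(faults):
    """Build the command lines to type at the faultmgmtsp> prompt."""
    return [f"fmadm repair {fault['uuid']}" for fault in faults]


# Shortened copy of the output shown in this article (assumed layout).
SAMPLE = """\
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T   Critical

Problem Status : open
Suspect 1 of 1
   Problem class : fault.chassis.device.sppost
   Certainty     : 100%
   Affects       : /SYS/MB
   Status        : faulted
"""

for command in repair_commands(parse_faults(SAMPLE)):
    print(command)
```

The sketch only prints the commands, so each one can still be reviewed before it is entered in the fault management shell. Clearing a fault this way is appropriate only after confirming, as above, that the alert is a false alarm.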
