ORACLE ODA X9-2: Analyzing and Clearing a False Alarm with Affects: /SYS/MB

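On several ORACLE ODA X9-2 systems that we operate, a compute node raised the following alarm: Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB. Judging by the description the alarm looks serious, but in fact the hosts were running normally, and after analysis we concluded that it was a false alarm. ILOM, the hardware management platform of ORACLE ODA, provides an interface for clearing faults. After the fault was cleared with the steps below, the alarm disappeared, and the systems have kept running normally under continued observation.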

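The steps are as follows:

1. View the alarm information:

Click to view the alarm details:

2. View the alarm information through the command-line interface (serial numbers have been masked, so do not try to match them)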

root@aaadb1 ~# ipmitool sunoem cli

Connected. Use ^D to exit.
-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty

Time UUID msgid Severity
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical

Problem Status : open
Diag Engine : fdd 1.0
System
   Manufacturer : Oracle Corporation
   Name : ORACLE SERVER X9-2L
   Part_Number : 7603
   Serial_Number : 23000

System Component
   Firmware_Manufacturer : Oracle Corporation
   Firmware_Version : (ILOM)5.1.0.23 r147470,(BIOS)62070300
   Firmware_Release : (ILOM)2022.09.03,(BIOS)2022.08.17

Suspect 1 of 1
   Problem class : fault.chassis.device.sppost
   Certainty : 100%
   Affects : /SYS/MB
   Status : faulted

   FRU
      Status : faulty
      Location : /SYS/MB
      Manufacturer : Oracle Corporation
      Name : ASM,MTHRBD,2U
      Part_Number : 820000
      Revision : 12
      Serial_Number : 4650000
      Chassis
         Manufacturer : Oracle Corporation
         Name : ORACLE SERVER X9-2L
         Part_Number : 76000000
         Serial_Number : 2300000

Description : The Service Processor power-on self test has detected a
              problem.

Response : The service-required LED may be illuminated on the affected
           FRU and chassis.

Impact : The Service Processor may not be able to perform necessary
         functions to power on, monitor, or manage the system.

Action : Please refer to the associated reference document at
         http://support.oracle.com/msg/ILOM-8000-4T for the latest
         service procedures and policies regarding this diagnosis.
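When the same alarm shows up on several nodes, it can help to pull the key fields (UUID, msgid, severity, affected resource) out of saved `fmadm faulty` output instead of reading each listing by eye. The following is only a minimal sketch: it assumes the output has been captured to text in the layout shown above, and the function name `parse_fmadm_faulty` and the `SAMPLE` string are hypothetical helpers for illustration, not part of ILOM or ipmitool.

```python
import re

# Summary row printed under the "Time UUID msgid Severity" header.
FAULT_ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)$"
)
# "Key : value" detail lines such as "Affects : /SYS/MB".
FIELD = re.compile(r"^\s*(?P<key>[A-Za-z_ ]+?)\s*:\s*(?P<value>.+?)\s*$")
# Detail keys this sketch keeps; everything else is ignored.
WANTED = ("Problem class", "Certainty", "Affects")


def parse_fmadm_faulty(text):
    """Return one dict per fault found in saved `fmadm faulty` output."""
    faults = []
    current = None
    for line in text.splitlines():
        row = FAULT_ROW.match(line.strip())
        if row:
            current = row.groupdict()
            faults.append(current)
            continue
        field = FIELD.match(line)
        if current is not None and field and field.group("key") in WANTED:
            key = field.group("key").lower().replace(" ", "_")
            # Keep the first occurrence only (the first suspect).
            current.setdefault(key, field.group("value"))
    return faults


# Hypothetical excerpt of the listing above, used only to exercise the parser.
SAMPLE = """\
Time UUID msgid Severity
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T Critical
Problem Status : open
Suspect 1 of 1
Problem class : fault.chassis.device.sppost
Certainty : 100%
Affects : /SYS/MB
Status : faulted
"""

if __name__ == "__main__":
    for fault in parse_fmadm_faulty(SAMPLE):
        print(fault["uuid"], fault["msgid"], fault["severity"], fault.get("affects"))
```

The UUID extracted this way is the value that is later passed to 'fmadm repair'.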

3. Clear the alarm

For the clearing procedure, refer to PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1). The 'fmadm repair' command is all that is needed.

  • Enter the fault management shell.

-> start /SP/faultmgmt/shell

Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y

faultmgmtsp>

  • Use 'fmadm repair' to clear the fault.

Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.

Example 3
Example 3 shows the 'fmadm repaired' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' output in Example 1 above, the command would be:

faultmgmtsp> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f

Example 4

Example 4 shows the 'fmadm repaired' command required after the FRU has been replaced. This example shows the FRU Path from Example 2 above being used. The command would be:

faultmgmtsp> fmadm repair /SYS/MB

The actual processing log is as follows (using the UUID of the alarm event):

faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034
faultmgmtsp> fmadm faulty
No faults found
faultmgmtsp> exit
-> exit
Disconnected
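Because the repair command takes either a UUID or a FRU path, a typo in a pasted value is easy to miss when the same procedure is repeated on several nodes. The sketch below only prepares and checks text: it validates the target, builds the 'fmadm repair' line to paste into the fault management shell, and tests whether a follow-up 'fmadm faulty' reported "No faults found". The helper names `build_repair_command` and `is_cleared` and the `/SYS/...` path pattern are assumptions made for illustration; nothing here talks to ILOM.

```python
import re

# Fault UUID as printed by `fmadm faulty`.
UUID_RE = re.compile(r"^[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE)
# FRU path such as /SYS/MB or /SYS/FANBD/FM0 (assumed pattern, not an ILOM rule).
FRU_PATH_RE = re.compile(r"^/SYS(?:/[A-Za-z0-9_]+)+$")


def build_repair_command(target):
    """Return the `fmadm repair` line for a fault UUID or a FRU path."""
    target = target.strip()
    if not (UUID_RE.match(target) or FRU_PATH_RE.match(target)):
        raise ValueError(f"not a fault UUID or FRU path: {target!r}")
    return f"fmadm repair {target}"


def is_cleared(faulty_output):
    """True when saved `fmadm faulty` output reports no remaining faults."""
    return "No faults found" in faulty_output


if __name__ == "__main__":
    print(build_repair_command("cd1ebbdf-f099-61de-ca44-ef646defe034"))
    print(build_repair_command("/SYS/MB"))
    print(is_cleared("No faults found"))
```

Rejecting anything that is not a plain UUID or FRU path keeps stray characters from a copy and paste out of the service processor shell.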
