Analyzing and Clearing a False Alert (Affects: /SYS/MB) on Oracle ODA X9-2

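On several of the Oracle ODA X9-2 systems we operate, we have seen the same alert on a compute node: Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB. Judging by the description the alert looks serious, but the host was in fact running normally, and after analysis we concluded that it was a false positive. ILOM, the hardware management platform of the Oracle ODA, provides an interface for clearing faults. After clearing the alert with the steps below, it disappeared, and the system has kept running normally under continued observation.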

The steps are as follows:

1. View the alert information:

Click the alert to view its details:

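2. View the alert information from the command-line interface (serial numbers have been redacted, so do not try to match them):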

    [root@aaadb1 ~]# ipmitool sunoem cli
    Connected. Use ^D to exit.
    -> start /SP/faultmgmt/shell
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
    faultmgmtsp> fmadm faulty
    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T   Critical
    Problem Status : open
    Diag Engine    : fdd 1.0
    System
       Manufacturer  : Oracle Corporation
       Name          : ORACLE SERVER X9-2L
       Part_Number   : 7603
       Serial_Number : 23000
    System Component
       Firmware_Manufacturer : Oracle Corporation
       Firmware_Version      : (ILOM)5.1.0.23 r147470,(BIOS)62070300
       Firmware_Release      : (ILOM)2022.09.03,(BIOS)2022.08.17
    ----------------------------------------
    Suspect 1 of 1
       Problem class : fault.chassis.device.sppost
       Certainty     : 100%
       Affects       : /SYS/MB
       Status        : faulted
       FRU
          Status        : faulty
          Location      : /SYS/MB
          Manufacturer  : Oracle Corporation
          Name          : ASM,MTHRBD,2U
          Part_Number   : 820000
          Revision      : 12
          Serial_Number : 4650000
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : ORACLE SERVER X9-2L
             Part_Number   : 76000000
             Serial_Number : 2300000
    Description : The Service Processor power-on self test has detected a problem.
    Response    : The service-required LED may be illuminated on the affected FRU and chassis.
    Impact      : The Service Processor may not be able to perform necessary functions to power on, monitor, or manage the system.
    Action      : Please refer to the associated reference document at http://support.oracle.com/msg/ILOM-8000-4T for the latest service procedures and policies regarding this diagnosis.

3. Clear the alert

The clearing procedure follows PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1); the 'fmadm repair' command is all that is needed. The relevant excerpt:

> * Enter the fault management shell.
>
> -\> start /SP/faultmgmt/shell
>
> Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y
>
> faultmgmtsp\>
>
> * Use 'fmadm repair' to clear the fault.
>
> Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.
>
> *Example 3*
>
> Example 3 shows the 'fmadm repaired' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' from Example 1 above, the command would be:
>
> faultmgmtsp\> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f
>
> *Example 4*
>
> Example 4 shows the 'fmadm repaired' command required after the FRU has been replaced. This example shows the FRU Path from Example 2 above being used. The command would be:
>
> fmadm repair /SYS/MB

The actual session log is shown below (the fault is cleared by the UUID of the alert event):

    faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034
    faultmgmtsp> fmadm faulty
    No faults found
    faultmgmtsp> exit
    -> exit
    Disconnected
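When several nodes show the same alert, copying UUIDs by hand gets tedious. The following is a minimal shell sketch, not part of ILOM or of the Oracle article: it only parses text that was already captured from `fmadm faulty` and never contacts the service processor. The function names `list_faults`, `suggest_repairs` and `faults_cleared` are hypothetical helpers, and the sketch assumes that each fault summary row starts with a timestamp of the form `2023-07-31/08:55:39` followed by the UUID, msgid and severity, as in the log above.

```shell
#!/bin/sh
# Hypothetical helpers for text captured from `fmadm faulty`.
# Assumption: a fault summary row looks like
#   2023-07-31/08:55:39 <UUID> <msgid> <Severity>

# Print "UUID msgid severity" for every summary row read from stdin.
list_faults() {
    awk '$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\/[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && NF >= 4 { print $2, $3, $4 }'
}

# Print one `fmadm repair <UUID>` line per fault found on stdin.
# The lines are only printed for review; nothing is executed.
suggest_repairs() {
    list_faults | while read -r uuid msgid severity; do
        printf 'fmadm repair %s\n' "$uuid"
    done
}

# Exit status 0 only if the captured output reports no remaining faults.
faults_cleared() {
    grep -q 'No faults found'
}
```

Assuming the output was saved to a file named `faulty.txt` (a hypothetical name), `suggest_repairs < faulty.txt` prints the candidate commands, and `faults_cleared < after.txt && echo cleared` checks a capture taken after the repair. The commands are deliberately printed rather than run, so that each fault can be confirmed as a false positive before it is cleared.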
