[Kylin Linux Advanced Server] Analyzing and handling I/O errors on server-attached external storage


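Server environment and configuration

| Category | Item | Value |
|------|--------------|---------------------------------------------------------------|
| System environment | Physical machine / virtual machine / cloud / container | Physical machine |
| Network environment | Public network / private network / no network | Private network |
| Hardware environment | Processor | S2500 |
| Hardware environment | Memory | 512GB |
| Hardware environment | Machine model | 擎天EF860 |
| Hardware environment | Machine type / architecture | arm64 |
| Hardware environment | BIOS version | Great Wall BIOS V3.0 |
| Software environment | Operating system version | 银河麒麟高级服务器操作系统 Kylin Linux Advanced Server release V10 (Lance) |
| Software environment | Kernel version | 4.19.90-52.30.v2207.ky10.aarch64 |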

Symptom

The server inspection report shows I/O errors, which need to be analyzed.

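Analysis

Checking the disk storage

According to the serial console log, the devices reporting I/O errors are dm-4 and dm-5, which correspond to the two multipath disks /dev/mpathxsky02blk01 and /dev/mpathxsky02blk02.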
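The mapping from a kernel name such as dm-4 to its device-mapper name can be read from sysfs. The following is a minimal sketch, not the procedure the author used: it assumes the standard `/sys/block/dm-*/dm/name` attribute is available, and the function name `dm_names` is hypothetical.

```python
from pathlib import Path


def dm_names(sysfs_block="/sys/block"):
    """Map kernel names such as 'dm-4' to device-mapper names.

    Reads <sysfs_block>/dm-*/dm/name, which the kernel exposes for
    device-mapper devices (multipath maps and LVM volumes alike).
    """
    mapping = {}
    for name_file in sorted(Path(sysfs_block).glob("dm-*/dm/name")):
        kernel_name = name_file.parent.parent.name  # e.g. "dm-4"
        mapping[kernel_name] = name_file.read_text().strip()
    return mapping
```

Calling `dm_names()` on the affected server and looking up `dm-4` and `dm-5` gives the names to compare against the multipath configuration. The root directory is a parameter only so that the function can be exercised against a copy of sysfs, for example one extracted from a sosreport.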

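Analyzing the serial console log

The serial console log shows three occurrences of I/O error messages. The first led to hung tasks. After the second and third, shutdown-related messages were printed right after the I/O errors.

First I/O error

The log repeatedly shows print_req_error: I/O error, dev sdb and print_req_error: I/O error, dev sdc, which means that I/O requests to sdb and sdc failed. It also shows rejecting I/O to offline device messages (for example: sd 3:0:0:1: rejecting I/O to offline device), which normally means that the device has gone offline and can no longer service I/O. These I/O errors may be caused by a disk failure, a connection problem (for example, a faulty SATA cable), or a controller problem.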
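To see at a glance which devices are affected, the console log can be summarized per device. The sketch below is only an illustration under stated assumptions: the helper name `summarize` is hypothetical, the regular expressions match the message formats quoted in this article, and the sample lines are copied from the first I/O error log.

```python
import re
from collections import Counter

# "print_req_error: I/O error, dev sdb, sector 278120"
IO_ERROR = re.compile(r"print_req_error: I/O error, dev ([\w-]+), sector (\d+)")
# "sd 3:0:0:1: rejecting I/O to offline device"
OFFLINE = re.compile(r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device")


def summarize(log_text):
    """Return (I/O error count per device, sorted offline SCSI addresses)."""
    errors = Counter(dev for dev, _sector in IO_ERROR.findall(log_text))
    offline = sorted(set(OFFLINE.findall(log_text)))
    return errors, offline


SAMPLE = """\
[30627.418715][ 86] sd 3:0:0:1: rejecting I/O to offline device
[30627.420021][ 86] print_req_error: I/O error, dev sdc, sector 209772368
[30627.421253][ 86] print_req_error: I/O error, dev sdc, sector 209772376
[30627.438674][ 86] sd 3:0:0:0: rejecting I/O to offline device
[30627.439942][ 86] print_req_error: I/O error, dev sdb, sector 1745360
"""

errors, offline = summarize(SAMPLE)
for dev, count in errors.most_common():
    print(f"{dev}: {count} I/O errors")
print("offline SCSI addresses:", ", ".join(offline))
```

Running the same function over the full console log shows whether the errors are confined to the paths of one storage target or spread across unrelated devices.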

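Several tasks, such as xfsaild/dm-4 and containerd, are reported as blocked for more than 1200 seconds. They are blocked because their disk I/O requests cannot complete while the device is offline or unavailable. The hung tasks finally caused the kernel to panic (Kernel panic - not syncing: hung_task: blocked tasks), which the kernel does only when kernel.hung_task_panic is enabled.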
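The hung-task reports can be pulled out of a long console log mechanically. This is a small sketch assuming the khungtaskd message format shown in this article. The name `hung_tasks` is hypothetical, and the sample lines are taken from the log of the first I/O error.

```python
import re

# Matches the khungtaskd report line, e.g.
# "INFO: task xfsaild/dm-4:402347 blocked for more than 1200 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(log_text):
    """Return (comm, pid, seconds) for every hung-task report in the log."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]


SAMPLE = """\
[31871.417820][ 4] INFO: task xfsaild/dm-4:402347 blocked for more than 1200 seconds.
[31871.431339][ 4] INFO: task containerd:409512 blocked for more than 1200 seconds.
[31871.450841][ 4] Kernel panic - not syncing: hung_task: blocked tasks
"""

for comm, pid, secs in hung_tasks(SAMPLE):
    print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

The task name often points at the affected device directly: xfsaild/dm-4 is the XFS kernel thread for the file system on dm-4.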

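sdb and sdc should correspond to sd 3:0:0:0 and sd 3:0:0:1. The lsscsi output in the sosreport collected now no longer lists devices at 3:0:0:0 or 3:0:0:1, because too much time has passed since the event. They now appear as 5:0:0:0 and 5:0:0:1. The rejecting I/O to offline device message has not appeared again since, so this problem has most likely been fixed.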

    [30551.689416][ 86] print_req_error: I/O error, dev sdb, sector 278120
    [30551.690643][ 86] print_req_error: I/O error, dev sdb, sector 526730880
    [30551.691874][ 86] print_req_error: I/O error, dev sdb, sector 794630784
    [30551.693092][ 86] print_req_error: I/O error, dev sdb, sector 3239552
    [30601.823314][ 24] print_req_error: I/O error, dev sdb, sector 267177832
    [30622.174613][ 23] print_req_error: I/O error, dev sdb, sector 267177832
    [30627.418715][ 86] sd 3:0:0:1: rejecting I/O to offline device
    [30627.420021][ 86] print_req_error: I/O error, dev sdc, sector 209772368
    [30627.421253][ 86] print_req_error: I/O error, dev sdc, sector 209772376
    [30627.438674][ 86] sd 3:0:0:0: rejecting I/O to offline device
    [30627.439942][ 86] print_req_error: I/O error, dev sdb, sector 1745360
    [30627.441180][ 86] print_req_error: I/O error, dev sdb, sector 788207504
    [30627.442345][ 86] print_req_error: I/O error, dev sdb, sector 526731600
    [30627.443517][ 86] print_req_error: I/O error, dev sdb, sector 263186528
    [30627.444683][ 86] print_req_error: I/O error, dev sdb, sector 3239552
    [31446.409648][ 28] print_req_error: I/O error, dev sdb, sector 267177832
    [31467.420538][ 28] print_req_error: I/O error, dev sdb, sector 267177832
    [31476.890512][ 61] print_req_error: I/O error, dev sdc, sector 209772368
    [31476.891710][ 61] print_req_error: I/O error, dev sdc, sector 209772376
    [31482.488033][ 84] sd 3:0:0:1: rejecting I/O to offline device
    [31482.489254][ 84] print_req_error: I/O error, dev sdc, sector 209772392
    [31482.490507][ 84] print_req_error: I/O error, dev sdc, sector 209772384
    [31482.491733][ 84] print_req_error: I/O error, dev sdc, sector 209772400
    [31482.492918][ 84] print_req_error: I/O error, dev sdc, sector 209772408
    [31482.494051][ 84] sd 3:0:0:0: rejecting I/O to offline device
    [31482.495054][ 84] print_req_error: I/O error, dev sdb, sector 270107808
    [31482.496109][ 84] print_req_error: I/O error, dev sdb, sector 270106784
    [31482.497193][ 84] print_req_error: I/O error, dev sdb, sector 788207504
    [31871.417820][ 4] INFO: task xfsaild/dm-4:402347 blocked for more than 1200 seconds.
    [31871.419496][ 4] Tainted: G OE 4.19.90-52.30.v2207.ky10.aarch64 #1
    [31871.421055][ 4] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [31871.422705][ 4] xfsaild/dm-4 D 0 402347 2 0x00000628
    [31871.423726][ 4] Call trace:
    [31871.424436][ 4] __switch_to+0xe8/0x150
    [31871.425293][ 4] __schedule+0x2b0/0x768
    [31871.426152][ 4] schedule+0x30/0xf0
    [31871.426917][ 4] xfs_log_force+0x170/0x358
    [31871.427809][ 4] xfsaild_push+0x5a8/0x6c0
    [31871.428766][ 4] xfsaild+0x11c/0x238
    [31871.429627][ 4] kthread+0x134/0x138
    [31871.430447][ 4] ret_from_fork+0x10/0x18
    [31871.431339][ 4] INFO: task containerd:409512 blocked for more than 1200 seconds.
    [31871.433064][ 4] Tainted: G OE 4.19.90-52.30.v2207.ky10.aarch64 #1
    [31871.434889][ 4] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [31871.436710][ 4] containerd D 0 409512 1 0x00000608
    [31871.437906][ 4] Call trace:
    [31871.438822][ 4] __switch_to+0xe8/0x150
    [31871.439785][ 4] __schedule+0x2b0/0x768
    [31871.440741][ 4] schedule+0x30/0xf0
    [31871.441658][ 4] io_schedule+0x20/0x90
    [31871.442570][ 4] wait_on_page_bit+0x134/0x178
    [31871.443548][ 4] __filemap_fdatawait_range+0xd0/0x120
    [31871.444618][ 4] file_write_and_wait_range+0xb0/0xd8
    [31871.445655][ 4] xfs_file_fsync+0x58/0x1d8
    [31871.446584][ 4] vfs_fsync_range+0x4c/0x90
    [31871.447743][ 4] do_fsync+0x48/0x78
    [31871.448623][ 4] sys_fdatasync+0x24/0x38
    [31871.449503][ 4] __sys_trace_return+0x0/0x4
    [31871.450841][ 4] Kernel panic - not syncing: hung_task: blocked tasks
    [31871.451999][ 4] CPU: 4 PID: 748 Comm: khungtaskd Tainted: G OE 4.19.90-52.30.v2207.ky10.aarch64 #1
    [31871.453937][ 4] Source Version: ccdbfc2c55f0fb8dde14fae29155446fc8a7e941
    [31871.455059][ 4] Hardware name: GreatWall 擎天EF860/GW-748F2A-FTG, BIOS Great Wall BIOS V3.0 2022-11-17
    [31871.457098][ 4] Call trace:
    [31871.457908][ 4] dump_backtrace+0x0/0x1b0
    [31871.458911][ 4] show_stack+0x24/0x30
    [31871.459845][ 4] dump_stack+0xb4/0xf0
    [31871.460748][ 4] panic+0x130/0x310
    [31871.461654][ 4] watchdog+0x2b8/0x468
    [31871.462536][ 4] kthread+0x134/0x138
    [31871.463337][ 4] ret_from_fork+0x10/0x18
    [31871.464354][ 4] SMP: stopping secondary CPUs
    [31871.584509][ 4] Kernel Offset: 0x37db061e0000 from 0xffff000048000000
    [31871.585577][ 4] CPU features: 0x10,a0000008
    [31871.586428][ 4] Memory Limit: none
    [31871.587456][ 4] Rebooting in 10 seconds..

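Second I/O error

The log first shows a series of print_req_error: I/O error, dev dm-4/5/6, sector xxxx messages, which means that I/O failed on the device-mapper devices (the LVM logical volumes). XFS then reports XFS (dm-4): writeback error on sector xxxx, followed by metadata I/O error messages raised in xlog_iodone. The errors hit many different sectors and affect both data writeback and metadata (log) writes. The error code is error 5, that is EIO (input/output error), which means the underlying storage device could not complete the read or write request.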
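The fields of the XFS message can be decoded with a few lines of Python. This is a sketch under two assumptions that are worth checking against the kernel source for this release: the reported error is a positive errno value, and both daddr and len are counted in 512-byte XFS basic blocks. The helper names are hypothetical. The values come from the dm-4 message `metadata I/O error in "xlog_iodone" at daddr 0xc8086b0 len 64 error 5`.

```python
import errno
import os

# Assumption: XFS disk addresses (daddr) and buffer lengths in this
# message are counted in 512-byte basic blocks.
XFS_BASIC_BLOCK = 512


def describe_errno(code):
    """Return the symbolic name and the message text for an errno value."""
    return errno.errorcode[code], os.strerror(code)


def daddr_to_bytes(daddr, length):
    """Convert a daddr/len pair to (byte offset, byte length)."""
    return daddr * XFS_BASIC_BLOCK, length * XFS_BASIC_BLOCK


name, message = describe_errno(5)
offset, size = daddr_to_bytes(0xC8086B0, 64)
print(f"error 5 = {name} ({message})")
print(f"failed log write: byte offset {offset}, {size} bytes")
```

Under these assumptions the failed write is a 32 KiB log buffer. A failed log write is the case in which XFS shuts the file system down, which matches the Log I/O Error Detected. Shutting down filesystem message that follows.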

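Immediately afterwards, shutdown and reboot: Restarting system messages appear, so a shutdown had presumably been initiated beforehand. The I/O error messages do appear in the log before the shutdown messages, but a shutdown is not written to the log the moment it starts. The likely sequence is that the shutdown took the multipath storage devices offline, the file systems could no longer write, and the I/O errors followed.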

    [4931293.526231][ 71] print_req_error: I/O error, dev dm-4, sector 5428128
    [4931293.526436][ 70] print_req_error: I/O error, dev dm-5, sector 15354992
    [4931293.527010][ 71] XFS (dm-4): writeback error on sector 5428240
    [4931293.527704][ 70] XFS (dm-5): writeback error on sector 15355120
    [4931293.528811][ 70] print_req_error: I/O error, dev dm-5, sector 16527680
    [4931293.529516][ 70] XFS (dm-5): writeback error on sector 16527808
    [4931293.530124][ 70] print_req_error: I/O error, dev dm-5, sector 24740872
    [4931293.530762][ 70] print_req_error: I/O error, dev dm-5, sector 27572552
    [4931293.531550][ 70] XFS (dm-5): writeback error on sector 272373048
    [4931293.532123][ 70] XFS (dm-5): writeback error on sector 276753512
    [4931293.532702][ 70] XFS (dm-5): writeback error on sector 531646336
    [4931293.533384][ 70] XFS (dm-5): writeback error on sector 24740992
    [4931293.533961][ 70] XFS (dm-5): writeback error on sector 27572656
    [4931293.534656][ 70] XFS (dm-5): writeback error on sector 288670704
    [4931293.535240][ 70] XFS (dm-5): writeback error on sector 288670720
    [4931293.538236][ 66] XFS (dm-4): metadata I/O error in "xlog_iodone" at daddr 0xc8086b0 len 64 error 5
    [4931293.539266][ 66] XFS (dm-4): Log I/O Error Detected. Shutting down filesystem
    [4931293.539586][ 50] XFS (dm-5): metadata I/O error in "xlog_iodone" at daddr 0x1f402070 len 64 error 5
    [4931293.540003][ 50] XFS (dm-4): Please umount the filesystem and rectify the problem(s)
    [4931293.541683][ 50] XFS (dm-5): Log I/O Error Detected. Shutting down filesystem
    [4931293.542370][ 50] XFS (dm-5): Please umount the filesystem and rectify the problem(s)
    [4931294.409340][ 99] XFS (dm-6): metadata I/O error in "xlog_iodone" at daddr 0x1f45ad90 len 64 error 5
    [4931314.007532][ 42] shutdown: 9 output lines suppressed due to ratelimiting
    [4931314.118056][ 47] dracut Warning: Killing all remaining processes
    [4931314.508042][ 47] dracut Warning: Unmounted /oldroot.
    [4931315.028993][ 75] kauditd_printk_skb: 8 callbacks suppressed
    [4931328.960882][ 0] reboot: Restarting system
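How closely the I/O errors and the shutdown messages follow each other can be measured from the console timestamps. The sketch below assumes the `[seconds][cpu] message` prefix used in this serial log. The helper `first_timestamp` is hypothetical, and the three sample lines are taken from the second I/O error log. It only measures when messages reached the console. It cannot show when the shutdown was actually initiated.

```python
import re

# "[4931293.526231][ 71] message" -> (timestamp, message)
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\[\s*\d+\]\s+(.*)$", re.MULTILINE)


def first_timestamp(log_text, needle):
    """Return the timestamp of the first line whose message contains needle."""
    for stamp, message in LINE.findall(log_text):
        if needle in message:
            return float(stamp)
    return None


SAMPLE = """\
[4931293.526231][ 71] print_req_error: I/O error, dev dm-4, sector 5428128
[4931314.007532][ 42] shutdown: 9 output lines suppressed due to ratelimiting
[4931328.960882][ 0] reboot: Restarting system
"""

t_error = first_timestamp(SAMPLE, "I/O error")
t_shutdown = first_timestamp(SAMPLE, "shutdown:")
t_reboot = first_timestamp(SAMPLE, "reboot: Restarting system")
print(f"I/O error -> shutdown message: {t_shutdown - t_error:.1f} s")
print(f"I/O error -> reboot:           {t_reboot - t_error:.1f} s")
```

A gap of a few tens of seconds between the first I/O error and the reboot is consistent with the errors occurring during an orderly shutdown rather than during normal operation.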

Third I/O error

As with the second occurrence, shutdown-related messages appear right after the I/O errors.

    [2001517.911050][102] print_req_error: I/O error, dev dm-6, sector 62984704
    [2001517.911712][100] print_req_error: I/O error, dev dm-5, sector 218187936
    [2001517.912383][102] XFS (dm-6): writeback error on sector 62984832
    [2001517.913028][100] print_req_error: I/O error, dev dm-5, sector 216240008
    [2001517.913138][100] print_req_error: I/O error, dev dm-5, sector 6241168
    [2001517.913144][100] print_req_error: I/O error, dev dm-5, sector 7116376
    [2001517.913166][100] XFS (dm-5): writeback error on sector 6241240
    [2001517.913170][100] XFS (dm-5): writeback error on sector 7116472
    [2001517.913174][100] XFS (dm-5): writeback error on sector 7116568
    [2001517.913229][100] print_req_error: I/O error, dev dm-5, sector 6772344
    [2001517.913254][100] XFS (dm-5): writeback error on sector 6772472
    [2001517.913258][100] XFS (dm-5): writeback error on sector 6772520
    [2001517.913263][100] XFS (dm-5): writeback error on sector 338549088
    [2001517.913392][100] XFS (dm-5): writeback error on sector 110130576
    [2001517.913398][100] XFS (dm-5): writeback error on sector 110130584
    [2001517.923828][101] XFS (dm-5): metadata I/O error in "xlog_iodone" at daddr 0xc82f840 len 64 error 5
    [2001517.925072][101] XFS (dm-5): Log I/O Error Detected. Shutting down filesystem
    [2001517.925237][103] XFS (dm-6): metadata I/O error in "xlog_iodone" at daddr 0x1f44c278 len 64 error 5
    [2001517.925257][101] XFS (dm-6): metadata I/O error in "xlog_iodone" at daddr 0x1f44c2b8 len 64 error 5
    [2001517.925419][103] XFS (dm-6): Log I/O Error Detected. Shutting down filesystem
    [2001517.925421][103] XFS (dm-6): Please umount the filesystem and rectify the problem(s)
    [2001517.925840][103] XFS (dm-5): Please umount the filesystem and rectify the problem(s)
    [2001518.084461][ 59] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.134555][ 29] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.134679][ 8] overlayfs: failed to get metacopy (-5)
    [2001518.135513][ 29] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.137716][ 17] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.138605][ 17] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.141203][ 71] overlayfs: failed to get metacopy (-5)
    [2001518.143645][ 54] overlayfs: failed to get metacopy (-5)
    [2001518.144968][ 95] overlayfs: failed to get metacopy (-5)
    [2001518.145177][ 46] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.146489][ 46] overlayfs: failed to get xattr trusted.overlay.redirect: err=-5)
    [2001518.148168][ 72] overlayfs: failed to get metacopy (-5)
    [2001518.148170][ 34] overlayfs: failed to get metacopy (-5)
    [2001518.148259][ 34] overlayfs: failed to get metacopy (-5)
    [2001518.150749][ 96] overlayfs: failed to get metacopy (-5)
    [2001518.152389][116] overlayfs: failed to get metacopy (-5)
    [2001528.063127][ 97] systemd-shutdown[1]: Waiting for process: local-path-prov
    [2001600.084921][123] kauditd_printk_skb: 23 callbacks suppressed
    [2001613.949707][ 99] shutdown: 7 output lines suppressed due to ratelimiting
    [2001614.020310][100] dracut Warning: Killing all remaining processes
    [2001614.364201][100] dracut Warning: Unmounted /oldroot.
    [2001628.676636][ 0] reboot: Restarting system

tuned.log records two shutdowns, which should correspond to these two I/O error events.

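Conclusion

The first I/O error is accompanied by rejecting I/O to offline device messages. This normally means the device had gone offline and could no longer service I/O, which is what produced the I/O errors. These log entries are old and the message has not recurred, so that problem has most likely been resolved.

The second and third I/O errors are accompanied by shutdown-related messages. The shutdown and reboot entries indicate that a shutdown had most likely been initiated before the I/O errors occurred. A system shutdown normally triggers file system unmounts and data writeback, but the multipath storage devices had already gone offline by then, so XFS could not finish writing its log or persisting other data. The XFS writeback error and metadata I/O error messages show that XFS failed while writing data back. In particular, the log I/O completed with an error in xlog_iodone, so the metadata could not be written. The cause of these errors is that XFS tried to write to the physical storage while the devices were offline or inaccessible, and the file system could not complete the operations.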
