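I. Introduction to smartctl
Smartmontools is a disk health-checking toolset. It works by controlling and managing a drive's SMART (Self-Monitoring, Analysis and Reporting Technology) feature. SMART monitors the drive's head assembly, the platter motor and drive system, the internal circuitry, and the platter surface media. When its analysis suggests that the drive may be developing a problem, SMART warns the user in time to avoid data loss. SMART only takes effect when the motherboard supports it, and it cannot be guaranteed to predict every possible drive failure.
Windows does not ship a full-featured SMART tool, so third-party software is usually installed. Virtual disks in VMware virtual machines do not support SMART. Linux has supported SMART for a long time, and on yum-based distributions the toolset can be installed with a single yum command. smartctl is the executable command that becomes available once Smartmontools is installed. With it you can check whether a disk supports SMART, run SMART self-tests, and so on.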
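As a rough sketch of the "does this disk support SMART" check mentioned above, the helper below reads the text printed by `smartctl -i` and reduces it to a single word. It assumes the ATA-style `SMART support is:` lines that smartctl normally prints for ATA/SATA disks; the function name `smart_support_state` is made up for this illustration, and other device types (NVMe, for example) may not print these lines at all, in which case the result is `unknown`.

```shell
#!/bin/sh
# smart_support_state (hypothetical helper): read `smartctl -i` output on
# stdin and print one of: enabled | disabled | available | unavailable | unknown
# Assumption: the report contains ATA-style lines such as
#   SMART support is: Available - device has SMART capability.
#   SMART support is: Enabled
smart_support_state() {
    awk '
        /^SMART support is:/ {
            # $4 is the first word after "SMART support is:"
            if ($4 == "Unavailable")   state = "unavailable"
            else if ($4 == "Enabled")  state = "enabled"
            else if ($4 == "Disabled") state = "disabled"
            else if ($4 == "Available" && state == "") state = "available"
        }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Typical use (needs root and a real disk, so it is left commented out):
#   smartctl -i /dev/sda | smart_support_state
```

Keeping the parser separate from the smartctl call means it can be tried on saved output without touching a disk.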
II. Usage examples
1. Install the tool
    [root@s186 /]# yum install -y smartmontools

2. Check whether the disk supports SMART

    [root@s186 /]# smartctl -i /dev/sda

In the output, "Enabled" means SMART is turned on, and "Available" means the disk supports SMART.

3. Enable SMART

    [root@s210 ~]# smartctl --smart=on --offlineauto=on --saveauto=on /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    SMART Enabled.
    SMART Attribute Autosave Enabled.
    SMART Automatic Offline Testing Enabled every four hours.

4. Show all SMART information for the disk

    [root@s210 ~]# smartctl -a /dev/sda

5. Check the health status of the disk

    [root@s210 ~]# smartctl -H /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

Look at the value after "result:". PASSED means the disk is in good health. If it shows FAILED instead, it is best to replace the server's disk immediately.

6. Show the vendor-specific SMART attributes and their values

    [root@s210 ~]# smartctl -A /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   142   142   054    Pre-fail  Offline      -       68
      3 Spin_Up_Time            0x0007   122   122   024    Pre-fail  Always       -       185 (Average 189)
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       715
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   115   115   020    Pre-fail  Offline      -       34
      9 Power_On_Hours          0x0012   099   099   000    Old_age   Always       -       12687
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       372
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       830
    193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       830
    194 Temperature_Celsius     0x0002   193   193   000    Old_age   Always       -       31 (Min/Max 7/41)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

7. Show the disk's error history

    [root@s210 ~]# smartctl -l error /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART Error Log Version: 1
    No Errors Logged

8. Run a SMART self-test in the background

    [root@s210 ~]# smartctl --test=long /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
    Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 119 minutes for test to complete.
    Test will complete after Tue Oct 12 17:14:21 2021
    Use smartctl -X to abort test.

9. Run a SMART self-test in the foreground

    (base) [root@s186 /]# smartctl -C -t short /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-957.5.1.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in captive mode".
    Drive command "Execute SMART Short self-test routine immediately in captive mode" successful.
    Testing has begun.
    Please wait 1 minutes for test to complete.
    Test will complete after Tue Oct 12 16:03:19 2021

10. Abort a SMART self-test

    [root@s210 ~]# smartctl -X /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
    Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Abort SMART off-line mode self-test routine".
    Self-testing aborted!

11. Show the SMART self-test log

    (base) [root@s186 /]# smartctl -l selftest /dev/sda

III. Syntax and options
1. Syntax

    smartctl [options] device

2. Options
1) Information display options

    -h, --help, --usage                     Show command help
    -V, --version, --copyright, --license   Print version, copyright, and license information
    -i, --info                              Show identity information for the device
    -g NAME, --get=NAME                     Show a device setting; NAME can be all, aam, apm, dsn, lookahead, security, wcache, rcache, wcreorder, wcache-sct
    -a, --all                               Show all SMART information for the device
    -x, --xall                              Show all information for the device
    --scan                                  Scan for disk devices
    --scan-open                             Scan for disk devices and try to open each one

2) smartctl run-time behavior options

    -j, --json[=[cgiosuv]]                  Print output in JSON format
    -q TYPE, --quietmode=TYPE               Quiet mode; TYPE can be errorsonly, silent, noserial
    -d TYPE, --device=TYPE                  Specify the device type; TYPE can be ata, scsi[+TYPE], nvme[,NSID], sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N], usbprolific, usbsunplus, sntjmicron[,NSID], intelliprop,N[+TYPE], marvell, areca,N/E, 3ware,N, hpt,L/M/N, megaraid,N, aacraid,H,L,ID, cciss,N, auto, test
    -T TYPE, --tolerance=TYPE               Tolerance level; TYPE can be normal, conservative, permissive, verypermissive
    -b TYPE, --badsum=TYPE                  Action to take when a checksum error is found in the SMART data; TYPE can be warn, exit, ignore
    -r TYPE, --report=TYPE                  Report transactions
    -n MODE[,STATUS], --nocheck=MODE[,STATUS]   Skip the check when the device is in the given power mode; MODE can be never, sleep, standby, idle

3) Options that enable or disable SMART features on the device

    -s VALUE, --smart=VALUE                 Enable or disable SMART on the device; VALUE is on or off
    -o VALUE, --offlineauto=VALUE           Enable or disable automatic offline testing; VALUE is on or off
    -S VALUE, --saveauto=VALUE              Enable or disable attribute autosave; VALUE is on or off
    -s NAME[,VALUE], --set=NAME[,VALUE]     Enable, disable, or change the named device setting

4) Options that read and display data

    -H, --health                            Show the device's SMART health status
    -c, --capabilities                      Show the device's SMART capabilities
    -A, --attributes                        Show the vendor-specific SMART attributes and their values
    -f FORMAT, --format=FORMAT              Set the output format for attributes
    -l TYPE, --log=TYPE                     Show the log of the given type; common types are error, selftest, selective, directory, background, scttemp[sts,hist
    -v N,OPTION, --vendorattribute=N,OPTION   Set the display option for vendor attribute N
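The -H and -A reports are plain text, so they are easy to post-process in a script. The sketch below is one possible way to do that for ATA-style output like the samples in this article. The three function names are invented for this example. The "failing" rule (WHEN_FAILED is not "-", or the normalized VALUE is at or below a non-zero THRESH) follows the usual reading of the attribute table, and the ten-column layout is assumed.

```shell
#!/bin/sh
# All helpers read smartctl output on stdin; names are illustrative only.

# smart_health: print the overall health verdict from `smartctl -H` output.
smart_health() {
    awk '/self-assessment test result:/ { print $NF; exit }'
}

# smart_attr_raw NAME: print the first token of RAW_VALUE for attribute NAME
# from `smartctl -A` output (column 10 in the assumed layout).
smart_attr_raw() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ && $2 == name { print $10; exit }'
}

# smart_failing_attrs: print the names of attributes that look failed:
# WHEN_FAILED (column 9) is not "-", or VALUE (column 4) <= THRESH (column 6)
# for a non-zero threshold.
smart_failing_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        if ($9 != "-" || ($6 + 0 > 0 && $4 + 0 <= $6 + 0)) print $2
    }'
}

# Typical use (needs root and a real disk, so it is left commented out):
#   smartctl -H /dev/sda | smart_health
#   smartctl -A /dev/sda | smart_attr_raw Power_On_Hours
#   smartctl -A /dev/sda | smart_failing_attrs
```

For scripted use, the `-j` option listed earlier gives JSON output, which is likely more robust than column parsing when a JSON parser is available.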
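5) Disk self-test options

    -t TEST, --test=TEST    Run a self-test; TEST can be offline, short, long, conveyance, force, vendor,N, select,M-N, pending,N, afterselect,[on|off]
    -C, --captive           Run the test in captive mode, that is, in the foreground
    -t short                Test the disk in the background; takes a short time
    -t long                 Test the disk in the background; takes a long time
    -C -t short             Test the disk in the foreground; takes a short time
    -C -t long              Test the disk in the foreground; takes a long time
    -X, --abort             Abort any background self-test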
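The self-test options above combine in a small number of ways, which the following sketch tries to capture. `selftest_cmd` only prints the smartctl command line for a given test type and mode (it does not run anything), and `selftest_wait_minutes` pulls the estimated duration out of the "Please wait N minutes" line shown in the earlier examples. Both function names are hypothetical, and only the short and long tests are handled.

```shell
#!/bin/sh
# selftest_cmd TYPE MODE DEVICE (hypothetical helper)
#   TYPE: short | long      MODE: background | foreground
# Prints the matching smartctl command line; returns 1 on unknown input.
selftest_cmd() {
    st_type=$1
    st_mode=$2
    st_dev=$3
    case $st_type in
        short|long) ;;
        *) echo "unsupported test type: $st_type" >&2; return 1 ;;
    esac
    case $st_mode in
        background) echo "smartctl -t $st_type $st_dev" ;;
        foreground) echo "smartctl -C -t $st_type $st_dev" ;;  # -C = captive
        *) echo "unsupported mode: $st_mode" >&2; return 1 ;;
    esac
}

# selftest_wait_minutes: read the output of a started self-test on stdin and
# print the number from the "Please wait N minutes ..." line, if present.
selftest_wait_minutes() {
    sed -n 's/.*Please wait \([0-9][0-9]*\) minute.*/\1/p'
}

# Typical use (needs root and a real disk, so it is left commented out):
#   cmd=$(selftest_cmd long background /dev/sda) || exit 1
#   minutes=$($cmd | selftest_wait_minutes)
#   echo "check 'smartctl -l selftest /dev/sda' in about $minutes minutes"
#   smartctl -X /dev/sda    # abort the background test if needed
```

Printing the command instead of running it keeps the helper safe to try on a machine without a SMART-capable disk.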