fio: The Swiss Army Knife of Disk Testing

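FIO (Flexible I/O Tester) is a powerful tool for benchmarking disk I/O performance. It has a great many options, and there is neither a way nor a need to memorize them all. This post walks through one concrete command and its output, which should give you a working understanding of fio.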

The command below simulates how a database service uses a disk: the I/O block size is 8 KiB and the read/write mix is 70:30.

The full fio command:

fio --name=oltp-sim \
    --ioengine=libaio \
    --direct=1 \
    --bs=8k \
    --size=3G \
    --rw=randrw \
    --rwmixread=70 \
    --iodepth=32 \
    --numjobs=4 \
    --group_reporting \
    --runtime=120 \
    --time_based \
    --ramp_time=30

Output:

oltp-sim: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=32
...
fio-3.36
Starting 4 processes
oltp-sim: Laying out IO file (1 file / 3072MiB)
oltp-sim: Laying out IO file (1 file / 3072MiB)
oltp-sim: Laying out IO file (1 file / 3072MiB)
oltp-sim: Laying out IO file (1 file / 3072MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=161MiB/s,w=70.5MiB/s][r=20.6k,w=9026 IOPS][eta 00m:00s]
oltp-sim: (groupid=0, jobs=4): err= 0: pid=54727: Thu Jun 25 09:23:16 2026
  read: IOPS=38.7k, BW=302MiB/s (317MB/s)(35.4GiB/120002msec)
    slat (nsec): min=785, max=3517.1k, avg=5204.50, stdev=9422.74
    clat (usec): min=55, max=201863, avg=2136.30, stdev=6055.59
     lat (usec): min=153, max=201866, avg=2141.50, stdev=6055.62
    clat percentiles (usec):
     |  1.00th=[   685],  5.00th=[   914], 10.00th=[  1045], 20.00th=[  1254],
     | 30.00th=[  1450], 40.00th=[  1696], 50.00th=[  1893], 60.00th=[  2057],
     | 70.00th=[  2212], 80.00th=[  2376], 90.00th=[  2638], 95.00th=[  2868],
     | 99.00th=[  3490], 99.50th=[  4146], 99.90th=[102237], 99.95th=[189793],
     | 99.99th=[200279]
   bw (  KiB/s): min=53589, max=374230, per=100.00%, avg=310733.49, stdev=22506.96, samples=944
   iops        : min= 6696, max=46777, avg=38840.01, stdev=2813.37, samples=944
  write: IOPS=16.6k, BW=130MiB/s (136MB/s)(15.2GiB/120002msec); 0 zone resets
    slat (nsec): min=913, max=3425.1k, avg=5941.29, stdev=11093.02
    clat (usec): min=301, max=201695, avg=2711.91, stdev=6906.19
     lat (usec): min=324, max=201698, avg=2717.85, stdev=6906.22
    clat percentiles (usec):
     |  1.00th=[  1106],  5.00th=[  1434], 10.00th=[  1631], 20.00th=[  1844],
     | 30.00th=[  2040], 40.00th=[  2180], 50.00th=[  2343], 60.00th=[  2474],
     | 70.00th=[  2638], 80.00th=[  2835], 90.00th=[  3130], 95.00th=[  3425],
     | 99.00th=[  4359], 99.50th=[  5342], 99.90th=[103285], 99.95th=[193987],
     | 99.99th=[200279]
   bw (  KiB/s): min=21968, max=163208, per=100.00%, avg=133093.00, stdev=9657.89, samples=944
   iops        : min= 2744, max=20400, avg=16635.00, stdev=1207.25, samples=944
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.13%, 750=1.04%, 1000=4.66%
  lat (msec)   : 2=41.95%, 4=51.32%, 10=0.59%, 20=0.01%, 50=0.11%
  lat (msec)   : 100=0.02%, 250=0.17%
  cpu          : usr=3.95%, sys=10.33%, ctx=2211187, majf=0, minf=149
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=4644062,1989240,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
   READ: bw=302MiB/s (317MB/s), 302MiB/s-302MiB/s (317MB/s-317MB/s), io=35.4GiB (38.0GB), run=120002-120002msec
  WRITE: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=15.2GiB (16.3GB), run=120002-120002msec
Disk stats (read/write):
  sdd: ios=5931652/2543396, sectors=94906432/40694712, merge=0/317, ticks=11765239/6307868, in_queue=18073398, util=96.20%

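Option reference:

| Option | Value | Description |
| --- | --- | --- |
| --name | oltp-sim | Job name; here it indicates an OLTP (online transaction processing) simulation |
| --ioengine | libaio | Linux native asynchronous I/O engine, which lets one job keep many I/Os in flight |
| --direct | 1 | Bypass the operating system's page cache and read/write the disk directly |
| --bs | 8k | Block size of 8 KiB, a common database page size |
| --size | 3G | Each job's test file is 3 GiB (12 GiB in total for 4 jobs) |
| --rw | randrw | Mixed random reads and writes |
| --rwmixread | 70 | 70% reads, 30% writes (a typical OLTP ratio) |
| --iodepth | 32 | I/O queue depth of 32 per job |
| --numjobs | 4 | 4 concurrent processes |
| --group_reporting | - | Report aggregated statistics for all jobs instead of one report per job |
| --runtime | 120 | Run the test for 120 seconds |
| --time_based | - | Run for the full runtime, even if the whole file size has already been covered |
| --ramp_time | 30 | Warm up for 30 seconds before collecting statistics (removes start-up noise) |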
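The same options can also be kept in a fio job file, which is easier to version and rerun than a long command line. The sketch below writes one from the shell. The file location is an arbitrary choice for illustration, and the fio invocation itself is left commented out because it creates four 3 GiB test files in the current directory.

```shell
# Write a job file equivalent to the command line shown earlier.
# The path is an assumption for illustration; any location works.
jobfile="${TMPDIR:-/tmp}/oltp-sim.fio"

cat > "$jobfile" <<'EOF'
[oltp-sim]
ioengine=libaio
direct=1
bs=8k
size=3G
rw=randrw
rwmixread=70
iodepth=32
numjobs=4
group_reporting
runtime=120
time_based
ramp_time=30
EOF

# Uncomment to run; the test files are created in the current directory.
# fio "$jobfile"
```

Option names in a job file are the command-line names without the leading dashes, and flag-style options such as group_reporting and time_based are written bare.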

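Read performance

read: IOPS=38.7k, BW=302MiB/s (317MB/s)(35.4GiB/120002msec)

| Metric | Value | Description |
| --- | --- | --- |
| IOPS | 38.7k | About 38,700 read operations per second |
| Bandwidth (BW) | 302 MiB/s | Read throughput (note that MiB is base-1024 while MB is base-1000) |
| Total read | 35.4 GiB | 35.4 GiB (38.0 GB) read in total over the 120 seconds |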
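As a sanity check, bandwidth should be roughly IOPS times block size. Here is a small sketch using the rounded 38.7k figure from the output and the 8 KiB block size from --bs=8k (the variable names are illustrative):

```shell
iops=38700   # read IOPS as reported by fio (38.7k, rounded)
bs=8192      # block size in bytes (--bs=8k)

# MiB/s uses 1024*1024 bytes, MB/s uses 1000*1000 bytes
bw_mib=$(awk -v iops="$iops" -v bs="$bs" 'BEGIN { printf "%.1f", iops * bs / (1024 * 1024) }')
bw_mb=$(awk -v iops="$iops" -v bs="$bs" 'BEGIN { printf "%.1f", iops * bs / 1000000 }')

echo "read bandwidth: ${bw_mib} MiB/s (${bw_mb} MB/s)"
```

The result, 302.3 MiB/s (317.0 MB/s), agrees with the BW=302MiB/s (317MB/s) that fio printed.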

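Latency metrics

slat (nsec): min=785, max=3517.1k, avg=5204.50, stdev=9422.74
clat (usec): min=55, max=201863, avg=2136.30, stdev=6055.59
 lat (usec): min=153, max=201866, avg=2141.50, stdev=6055.62

| Abbreviation | Full name | Reported unit | Average | Description |
| --- | --- | --- | --- | --- |
| slat | Submission latency | nanoseconds | 5.2 μs | Time fio takes to submit the I/O request to the kernel |
| clat | Completion latency | microseconds | 2136 μs | Time from submission to the kernel until the I/O completes |
| lat | Total latency | microseconds | 2141 μs | Total latency ≈ slat + clat |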
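The relation lat ≈ slat + clat can be checked on the averages, keeping in mind that slat is printed in nanoseconds while clat and lat are in microseconds. A minimal sketch (variable names are illustrative):

```shell
slat_ns=5204.50   # average slat, reported in nanoseconds
clat_us=2136.30   # average clat, reported in microseconds

# Convert slat to microseconds before adding
lat_us=$(awk -v s="$slat_ns" -v c="$clat_us" 'BEGIN { printf "%.2f", s / 1000 + c }')

echo "slat + clat = ${lat_us} usec"
```

This gives 2141.50 usec, the same as the average lat in the output.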

Latency percentiles (the key numbers!)

clat percentiles (usec):
 |  1.00th=[   685],  5.00th=[   914], 10.00th=[  1045], 20.00th=[  1254],
 | 30.00th=[  1450], 40.00th=[  1696], 50.00th=[  1893], 60.00th=[  2057],
 | 70.00th=[  2212], 80.00th=[  2376], 90.00th=[  2638], 95.00th=[  2868],
 | 99.00th=[  3490], 99.50th=[  4146], 99.90th=[102237], 99.95th=[189793],
 | 99.99th=[200279]

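Interpretation:

  • P50 = 1893 μs: 50% of read requests complete within 1.89 ms (the median)
  • P99 = 3490 μs: 99% of read requests complete within 3.49 ms
  • P99.9 = 102 ms: the slowest 0.1% of requests take more than 100 ms (tail latency!)
  • P99.99 = 200 ms: in the extreme cases latency reaches 200 ms

💡 Important: latency stays below 3.5 ms up to P99, but from P99.9 onward the tail is very high (100-200 ms). This may be caused by a disk bottleneck or by garbage collection (GC), and it has a large impact on database performance.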
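One simple way to put a number on the tail is to compare percentiles with each other. The ratios below are only an illustrative measure, not something fio reports:

```shell
p50=1893      # read clat percentiles in microseconds, from the output
p99=3490
p999=102237

p99_over_p50=$(awk -v a="$p99" -v b="$p50" 'BEGIN { printf "%.1f", a / b }')
p999_over_p99=$(awk -v a="$p999" -v b="$p99" 'BEGIN { printf "%.1f", a / b }')

echo "P99 / P50   = ${p99_over_p50}x"
echo "P99.9 / P99 = ${p999_over_p99}x"
```

P99 is about 1.8 times the median, while P99.9 is about 29 times P99, so the jump happens between those two percentiles.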

Bandwidth and IOPS distribution

bw (KiB/s): min=53589, max=374230, avg=310733.49, stdev=22506.96
iops: min=6696, max=46777, avg=38840.01, stdev=2813.37

  • 944 samples; IOPS ranged from 6.7k to 46.8k with a standard deviation of 2813
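To judge how stable the throughput was, the standard deviation can be expressed relative to the mean (the coefficient of variation). A minimal sketch using the read IOPS line:

```shell
iops_avg=38840.01
iops_stdev=2813.37

# Coefficient of variation in percent: stdev / mean * 100
cv_pct=$(awk -v s="$iops_stdev" -v m="$iops_avg" 'BEGIN { printf "%.1f", s / m * 100 }')

echo "IOPS coefficient of variation: ${cv_pct}%"
```

A value of about 7.2% suggests that most samples sit close to the mean and that the 6.7k minimum is an occasional dip rather than the norm.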

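Write performance

write: IOPS=16.6k, BW=130MiB/s (136MB/s)(15.2GiB/120002msec)

| Metric | Value | Description |
| --- | --- | --- |
| IOPS | 16.6k | About 16,600 write operations per second |
| Bandwidth | 130 MiB/s | Write throughput |
| Total written | 15.2 GiB | 15.2 GiB written over the 120 seconds |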

Write latency

slat (nsec): avg=5941.29  (~5.9 μs)
clat (usec): avg=2711.91  (~2.7 ms)
lat (usec):  avg=2717.85  (~2.7 ms)

Average write latency is 2.7 ms, slightly higher than the 2.1 ms for reads, which is in line with expectations.

Write latency percentiles

| 99.00th=[  4359], 99.50th=[  5342], 99.90th=[103285], ...

  • P99 write latency = 4.36 ms (reads: 3.49 ms)
  • P99.9 shows the same 100 ms+ tail latency

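Latency distribution histogram

lat (usec): 100=0.01%, 250=0.01%, 500=0.13%, 750=1.04%, 1000=4.66%
lat (msec): 2=41.95%, 4=51.32%, 10=0.59%, 20=0.01%, 50=0.11%
lat (msec): 100=0.02%, 250=0.17%

Key findings:

  • ~99.1% of requests complete within 4 ms (5.85% within 1 ms + 41.95% between 1 and 2 ms + 51.32% between 2 and 4 ms)
  • ~0.3% of requests take longer than 20 ms, and ~0.19% take longer than 50 ms (tail latency)
  • Overall shape: dense in the middle (about 93% of requests fall between 1 and 4 ms), with a very long tail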
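Each histogram entry is the share of I/Os that fall between the previous bucket boundary and its own, so cumulative shares come from a running sum. A minimal sketch with the bucket values copied from the output:

```shell
cumulative=$(awk 'BEGIN {
    # Upper bound of each bucket in microseconds, and its share in percent
    n = split("100 250 500 750 1000 2000 4000 10000 20000 50000 100000 250000", bound, " ")
    split("0.01 0.01 0.13 1.04 4.66 41.95 51.32 0.59 0.01 0.11 0.02 0.17", share, " ")
    cum = 0
    for (i = 1; i <= n; i++) {
        cum += share[i]
        printf "<= %6d us: %6.2f%%\n", bound[i], cum
    }
}')

echo "$cumulative"
```

The running sum reaches 5.85% at 1 ms, 47.80% at 2 ms and 99.12% at 4 ms. The final total is 100.02% rather than 100% because fio rounds each bucket to two decimals.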

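CPU usage

cpu: usr=3.95%, sys=10.33%, ctx=2211187, majf=0, minf=149

| Metric | Value | Description |
| --- | --- | --- |
| usr | 3.95% | User-mode CPU share is low (fio itself does little computation) |
| sys | 10.33% | Kernel-mode CPU share (I/O processing) |
| ctx | 2,211,187 | Total context switches (about 18,000 per second) |
| majf | 0 | No major page faults (no page had to be fetched from disk) |
| minf | 149 | Minor page faults (memory allocation) |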
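The per-second figure for context switches follows from dividing ctx by the runtime. The sketch below also relates ctx to the number of I/Os issued, on the assumption that the ctx counter covers the same 120-second measurement window as the I/O counts:

```shell
ctx=2211187
runtime_ms=120002
total_ios=$((4644062 + 1989240))   # reads + writes from "issued rwts"

ctx_per_sec=$(awk -v c="$ctx" -v ms="$runtime_ms" 'BEGIN { printf "%.0f", c / (ms / 1000) }')
ctx_per_io=$(awk -v c="$ctx" -v n="$total_ios" 'BEGIN { printf "%.2f", c / n }')

echo "context switches per second: ${ctx_per_sec}"
echo "context switches per I/O:    ${ctx_per_io}"
```

Under that assumption there is roughly one context switch for every three I/Os.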

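I/O depth statistics

IO depths: 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%

fio had 32 I/Os in flight per job for essentially the whole run (32 × 4 jobs = 128 outstanding I/Os in total), so the queue never ran dry and the device was always driven at the configured depth. This shows that the load generator kept the queue full; it does not by itself show that the disk had headroom.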
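Little's law ties queue depth, latency and IOPS together: average concurrency = throughput × average latency. The rough check below assumes that all 4 jobs really keep 32 I/Os in flight, and uses the average of the read and write lat values weighted by I/O count:

```shell
iodepth=32; numjobs=4
reads=4644062;  read_lat_us=2141.50    # issued reads, average read lat
writes=1989240; write_lat_us=2717.85   # issued writes, average write lat

predicted_iops=$(awk -v qd="$iodepth" -v jobs="$numjobs" \
    -v r="$reads" -v rl="$read_lat_us" -v w="$writes" -v wl="$write_lat_us" 'BEGIN {
    in_flight = qd * jobs                              # 128 outstanding I/Os
    avg_lat_s = (r * rl + w * wl) / (r + w) / 1000000  # weighted average, in seconds
    printf "%.0f", in_flight / avg_lat_s               # throughput = concurrency / latency
}')

echo "predicted total IOPS: ${predicted_iops}"
```

The prediction is roughly 55.3k, which matches the measured 38.7k read + 16.6k write IOPS. At a fixed queue depth, latency and IOPS are therefore two views of the same measurement.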

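submit: 0=0.0%, 4=100.0%, ...
complete: 0=0.0%, 4=100.0%, ...

Batch submit/complete: each entry covers the range above the previous entry, so 4=100.0% means every submit call and every completion call handled between 1 and 4 I/Os, not exactly 4.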

I/O totals

issued rwts: total=4644062,1989240,0,0

  • Reads: 4,644,062
  • Writes: 1,989,240
  • Read/write ratio ≈ 70:30 ✓ (reads are 70.01% of all I/Os, matching the rwmixread=70 setting)
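The issued counts can be cross-checked against both the configured mix and the reported IOPS. A minimal sketch, using the 120002 msec runtime from the output:

```shell
reads=4644062
writes=1989240
runtime_ms=120002

read_pct=$(awk -v r="$reads" -v w="$writes" 'BEGIN { printf "%.2f", r / (r + w) * 100 }')
read_iops=$(awk -v r="$reads" -v ms="$runtime_ms" 'BEGIN { printf "%.0f", r / (ms / 1000) }')
write_iops=$(awk -v w="$writes" -v ms="$runtime_ms" 'BEGIN { printf "%.0f", w / (ms / 1000) }')

echo "read share: ${read_pct}%"
echo "read IOPS:  ${read_iops}"
echo "write IOPS: ${write_iops}"
```

The read share comes out at 70.01%, and the derived rates (38700 and 16577) round to the 38.7k and 16.6k that fio printed.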

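Disk-level statistics (Disk stats)

sdd: ios=5931652/2543396, sectors=94906432/40694712, merge=0/317,
     ticks=11765239/6307868, in_queue=18073398, util=96.20%

| Metric | Value | Description |
| --- | --- | --- |
| util | 96.20% | The disk was busy 96.2% of the time, close to saturation! |
| ios | 5.9M / 2.5M | Read/write I/O counts at the disk level (higher than fio's issued counts, likely because the disk counters also cover the 30-second ramp-up) |
| merge | 0 / 317 | No read I/Os were merged; a small number of writes were |
| in_queue | 18M | Total time I/Os spent in the queue (milliseconds) |

⚠️ util=96.2% is the key indicator: the disk is busy almost 100% of the time, which is the most likely explanation for the P99.9 latency spike. Note that for devices that serve requests in parallel, such as SSDs, util on its own can overstate how saturated the device is.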
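The sectors and ticks fields give two more cross-checks: the average request size at the device and the average time per request. The sketch assumes 512-byte sectors and ticks in milliseconds, as in the kernel's /sys/block/<dev>/stat counters:

```shell
read_ios=5931652;  read_sectors=94906432; read_ticks=11765239
write_ios=2543396; write_sectors=40694712; write_ticks=6307868

# Average request size in bytes: sectors per I/O times 512
read_req_bytes=$(awk -v s="$read_sectors" -v n="$read_ios" 'BEGIN { printf "%.0f", s / n * 512 }')
write_req_bytes=$(awk -v s="$write_sectors" -v n="$write_ios" 'BEGIN { printf "%.0f", s / n * 512 }')

# Average time per request in milliseconds: ticks per I/O
read_ms=$(awk -v t="$read_ticks" -v n="$read_ios" 'BEGIN { printf "%.2f", t / n }')
write_ms=$(awk -v t="$write_ticks" -v n="$write_ios" 'BEGIN { printf "%.2f", t / n }')

echo "avg request size: read ${read_req_bytes} B, write ${write_req_bytes} B"
echo "avg time per I/O: read ${read_ms} ms, write ${write_ms} ms"
```

Both request sizes come out at 8192 bytes, confirming that the 8 KiB blocks reached the device essentially unmerged. The per-request times (1.98 ms and 2.48 ms) are in the same range as the clat averages that fio measured.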