How to measure disk performance with fio and IOPing

Whether it’s a server or a work PC, what usually limits performance is disk speed. Even SSDs are not yet comparable in speed to RAM and the CPU.
There are different tools, with or without a graphical interface, written for testing disk speed. Some people also use dd, for example:

dd if=/dev/zero of=test_file bs=64k count=16k conv=fdatasync

However, in our opinion dd is the worst software for benchmarking I/O performance, for several reasons:

- it is a single-threaded, sequential-write test. A web server, for example, does not perform long-running sequential writes, and it uses more than one thread
- it writes a small amount of data, so the result can be influenced by caching or by the RAID controller
- it runs for just a few seconds, which is too short to produce consistent results
- it does not test read speed

All these points lead to one conclusion: it is better to use something else. For disk benchmarking, two kinds of metrics give a complete overview: IOPS (I/O operations per second) and latency. This tutorial explains how to measure IOPS with fio, and disk latency with IOPing, on a RHEL 7 system.

Install fio

First of all, install the EPEL repository:

# wget https://mirrors.n-ix.net/fedora-epel/epel-release-latest-7.noarch.rpm
# yum localinstall epel-release-latest-7.noarch.rpm

Next, install fio with yum:

# yum install fio

Testing IOPS with fio

Random read/write performance

The first test measures random read/write performance. In a terminal, execute the following command:

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

During the test, the terminal window displays output like the following:

test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.2.8
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 1 (f=1): [m(1)] [0.1% done] [447KB/131KB/0KB /s] [111/32/0 iops] [eta 01h:
Jobs: 1 (f=1): [m(1)] [0.1% done] [383KB/147KB/0KB /s] [95/36/0 iops] [eta 01h:4
Jobs: 1 (f=1): [m(1)] [0.1% done] [456KB/184KB/0KB /s] [114/46/0 iops] [eta 01h:
Jobs: 1 (f=1): [m(1)] [0.1% done] [624KB/188KB/0KB /s] [156/47/0 iops] [eta 01h:
Jobs: 1 (f=1): [m(1)] [0.1% done] [443KB/115KB/0KB /s] [110/28/0 iops] [eta 01h:
Jobs: 1 (f=1): [m(1)] [0.1% done] [515KB/95KB/0KB /s] [128/23/0 iops] [eta 01h:4
Jobs: 1 (f=1): [m(1)] [0.1% done] [475KB/163KB/0KB /s] [118/40/0 iops] [eta 01h:
Jobs: 1 (f=1): [m(1)] [0.2% done] [451KB/127KB/0KB /s] [112/31/0 iops]

So, the program creates a 4 GB file (--size=4G) and performs 4 KB reads and writes at random offsets within the file, with three reads for every write (a 75%/25% ratio, as specified with the option --rwmixread=75) and 64 operations in flight at a time. The read/write ratio can be adjusted to simulate various usage scenarios.
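The same options can also be kept in a fio job file instead of a long command line. The following is a minimal sketch, assuming a job file named randrw-job.fio (the file name is our choice, not something fio requires); each line mirrors one option from the command above.

```shell
# Write a job file equivalent to the random read/write command line.
# The name randrw-job.fio is an assumption chosen for this example.
cat > randrw-job.fio <<'EOF'
[global]
randrepeat=1
ioengine=libaio
direct=1
gtod_reduce=1
bs=4k
iodepth=64
size=4G

[test]
filename=random_read_write.fio
readwrite=randrw
rwmixread=75
EOF

# To run it, pass the job file to fio as root, like the other commands here:
#   fio randrw-job.fio
```

Changing rwmixread in the job file is then enough to try a different read/write ratio.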
At the end of the run, fio displays the final results:

test: (groupid=0, jobs=1): err= 0: pid=4760: Thu Mar  2 13:23:28 2017
  read : io=7884.0KB, bw=864925B/s, iops=211, runt=  9334msec
  write: io=2356.0KB, bw=258468B/s, iops=63, runt=  9334msec
  cpu          : usr=0.46%, sys=2.35%, ctx=2289, majf=0, minf=29
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=1971/w=589/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=7884KB, aggrb=844KB/s, minb=844KB/s, maxb=844KB/s, mint=9334msec, maxt=9334msec
  WRITE: io=2356KB, aggrb=252KB/s, minb=252KB/s, maxb=252KB/s, mint=9334msec, maxt=9334msec

Disk stats (read/write):
    dm-2: ios=1971/589, merge=0/0, ticks=454568/120101, in_queue=581406, util=98.44%, aggrios=1788/574, aggrmerge=182/15, aggrticks=425947/119120, aggrin_queue=545252, aggrutil=98.48%
  sda: ios=1788/574, merge=182/15, ticks=425947/119120, in_queue=545252, util=98.48%
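
As a rough sanity check, the reported bandwidth and IOPS should agree with the 4 KB block size, and the issued requests should be close to the requested 75% read mix. The sketch below only redoes that arithmetic with the numbers copied from the result above; the variable names are ours.

```shell
# Values copied from the fio result above; the block size is --bs=4k (4096 bytes).
bs=4096
read_bw=864925     # read bandwidth in bytes per second
write_bw=258468    # write bandwidth in bytes per second

# Bandwidth divided by block size approximates the IOPS figures fio printed.
read_iops=$((read_bw / bs))
write_iops=$((write_bw / bs))

# Share of reads among issued requests (r=1971, w=589), as an integer percentage.
read_pct=$((1971 * 100 / (1971 + 589)))

echo "read: ${read_iops} IOPS, write: ${write_iops} IOPS, reads: ${read_pct}%"
```

The results, 211 and 63 IOPS with about 76% reads, match the iops=211 and iops=63 values in the report and are close to the --rwmixread=75 setting.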

Random read performance

In this case, the command is:

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read.fio --bs=4k --iodepth=64 --size=4G --readwrite=randread

The output is similar to the read/write case, but reports only read statistics.

Random write performance

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randwrite

As above, the output is similar, but reports only write statistics.
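
The three tests differ only in the --readwrite value (plus --rwmixread for the mixed case), so the command lines can be generated from one template. This is a minimal sketch, assuming a hypothetical helper named fio_cmd and test file names derived from the mode rather than the ones used above. It only prints the commands, so nothing is written to disk until one of them is actually run.

```shell
# Hypothetical helper: print the fio command line for one access pattern.
# It echoes the command instead of running it.
fio_cmd() {
    mode="$1"
    extra=""
    # Only the mixed workload needs the read percentage.
    if [ "$mode" = "randrw" ]; then
        extra=" --rwmixread=75"
    fi
    echo "fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=${mode}.fio --bs=4k --iodepth=64 --size=4G --readwrite=${mode}${extra}"
}

# Print the commands for the three tests described in this tutorial.
for mode in randrw randread randwrite; do
    fio_cmd "$mode"
done
```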

Measuring latency with IOPing

As stated in the introduction, the second part of a benchmark is the latency measurement. To accomplish this task, install IOPing, also available in the EPEL repository.

# yum install ioping

Execute it:

# ioping -c 100 .

The -c 100 option sets the number of requests ioping will make. The program also takes as an argument the file, directory, or device to check; in this case, the current working directory (.). The program output is:

4 KiB <<< . (xfs /dev/dm-2): request=1 time=16.3 ms (warmup)
4 KiB <<< . (xfs /dev/dm-2): request=2 time=253.3 us
4 KiB <<< . (xfs /dev/dm-2): request=3 time=284.0 ms
...
4 KiB <<< . (xfs /dev/dm-2): request=96 time=175.6 us (fast)
4 KiB <<< . (xfs /dev/dm-2): request=97 time=258.7 us (fast)
4 KiB <<< . (xfs /dev/dm-2): request=98 time=277.6 us (fast)
4 KiB <<< . (xfs /dev/dm-2): request=99 time=242.3 us (fast)
4 KiB <<< . (xfs /dev/dm-2): request=100 time=36.1 ms (fast)

--- . (xfs /dev/dm-2) ioping statistics ---
99 requests completed in 3.99 s, 396 KiB read, 24 iops, 99.3 KiB/s
generated 100 requests in 1.65 min, 400 KiB, 1 iops, 4.04 KiB/s
min/avg/max/mdev = 163.5 us / 40.3 ms / 760.0 ms / 118.5 ms

The last line shows the disk latency: the minimum, average, and maximum request times, plus their deviation (mdev).
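
When the numbers are needed in a script or a report, the summary line can be split into labeled values. The following is a small sketch that works on the line copied from the output above; it assumes the line keeps exactly this "min/avg/max/mdev = a / b / c / d" layout.

```shell
# Summary line copied from the ioping output above.
summary='min/avg/max/mdev = 163.5 us / 40.3 ms / 760.0 ms / 118.5 ms'

# Split on " = " first, then on " / ", and label the four values.
parsed=$(printf '%s\n' "$summary" | awk -F' = ' '{
    split($2, v, " / ")
    printf "min=%s avg=%s max=%s mdev=%s\n", v[1], v[2], v[3], v[4]
}')
echo "$parsed"
```

Note that the values keep their units (us or ms), so they have to be normalized before they are compared numerically.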
