Performance Tuning
Chris Mason

Benchmarking and Tuning

Common problems
Good benchmark choices
Run analysis
Best practices


Random 4KB Writes

Explaining the results...

Know Your Hardware

Collect performance specs
Verify components individually
Run FIO directly against the devices first
Use FIO, fs_mark, and FFSB for filesystem-based workloads
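
As a rough illustration of the FIO step, the sketch below writes a job file for random 4KB direct writes. The paths, the 60 second runtime, the queue depth of 32, and the RUN_FIO switch are assumptions made for this example, not values from the talk. TARGET defaults to a scratch file; pointing it at a raw device overwrites that device.

```shell
#!/bin/sh
# Sketch: generate a fio job for random 4KB direct writes.
# TARGET and JOBFILE are assumed example paths. TARGET defaults to a
# scratch file so nothing is destroyed; a raw device would be overwritten.
TARGET="${TARGET:-/tmp/fio-scratch.dat}"
JOBFILE="${JOBFILE:-/tmp/randwrite-4k.fio}"

cat > "$JOBFILE" <<EOF
[global]
ioengine=libaio
direct=1
bs=4k
rw=randwrite
iodepth=32
runtime=60
time_based
group_reporting

[randwrite-4k]
filename=$TARGET
size=1g
EOF

# RUN_FIO is a hypothetical opt-in switch so the job only runs on request.
if [ "${RUN_FIO:-0}" = 1 ] && command -v fio >/dev/null 2>&1; then
    fio "$JOBFILE"
else
    echo "job written to $JOBFILE; run it with: fio $JOBFILE"
fi
```

Running the same job first against the device and then against a file on the filesystem gives two numbers to compare, which helps separate hardware limits from filesystem overhead.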

Monitoring the CPUs

More cores means more contention
Work may not be evenly distributed
mpstat can watch individual CPUs
Watch for high IRQ or softIRQ times
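
When mpstat is not at hand, a similar per-CPU view can be approximated from /proc/stat. The sketch below is a minimal example under that assumption; irq_share is a hypothetical helper name, and the numbers are cumulative since boot rather than per interval.

```shell
#!/bin/sh
# Sketch: per-CPU share of time spent in hard IRQ and softirq handling,
# computed from the cumulative counters in /proc/stat.
# irq_share is a hypothetical helper; it takes an optional file path.
irq_share() {
    awk '/^cpu[0-9]+ / {
        total = 0
        # Fields 2..9: user nice system idle iowait irq softirq steal
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0)
            printf "%s irq=%.1f%% softirq=%.1f%%\n", $1, 100 * $7 / total, 100 * $8 / total
    }' "${1:-/proc/stat}"
}

if [ -r /proc/stat ]; then
    irq_share
fi
```

A CPU whose irq or softirq share is far above the others is a hint that interrupt handling is concentrated there. For interval-based numbers, `mpstat -P ALL 1` reports the same categories in its %irq and %soft columns.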

More devices can spread SCSI locks over CPUs

For AIO/DIO, spread AIO contexts over devices and processes

Perf can show locks and other kernel bottlenecks

perf top
perf record -g -a -f sleep 10
perf record -g -C N -f sleep 10
perf report -g

Latencytop can help find non-IO waiting

Process fs_mark (25206) Total: 10.3 msec

[btrfs_balance_delayed_items] 3.1 msec 60.0 %
btrfs_balance_delayed_items [btrfs] btrfs_btree_balance_dirty
[btrfs] btrfs_create [btrfs] vfs_create do_last.isra.40
path_openat do_filp_open do_sys_open sys_open system_call_fastpath


Higher queue depths can increase latencies

Ftrace can provide detailed timing information for each function call

trace-cmd and kernelshark help sift through traces

On spindles, IO patterns matter...

Thank You

Chris Mason
