Wednesday, November 13, 2013

ZFS / RAIDZ Benchmarks - Part 2

Table of Contents


  • Introduction
  • Hardware Setup
  • Description
  • Hiccups
  • Benchmarks
    • Intel NASPT
    • Bonnie++
    • iozone
    • phoronix test suite
  • Conclusions

Introduction


I found myself in need of a NAS solution: something to store all my files and provide a good level of safety and redundancy.

ZFS was always going to be my choice of filesystem; it works well, provides lots of useful features (especially snapshots) and is very reliable.

I looked at the existing professional solutions, but none of them provided the flexibility I was after.
FreeNAS' backer iXsystems has an interesting FreeNAS Mini, but it only allows a maximum of four disks, and their professional solutions were just outside my budget.


It's been a while since I last ran ZFS benchmarks (see the zfs-raidz-benchmarks post I wrote in 2009), so I'm at it again.

Hardware Setup


- Supermicro SC826TQ-500LPB 12-bay chassis
- Supermicro X10SL7-F motherboard
- Intel E3-1220v3 processor
- 32GB RAM (4 x 8GB Kingston DDR3 1600MHz ECC, KVR16E11/8EF)
- 6 x WD Red 4TB



[Photo: from the top]

[Photo: two chassis, 24 bays total]


Description


The chassis comes with 500W platinum-rated redundant power supplies, rated at 94% efficiency at 20% load and 88% at 10% load. Even with 12 disks it won't ever go over 25% load, so this power supply is overkill, but it's the smallest Supermicro offers.

The X10SL7-F has 6 onboard SATA connectors, plus an LSI 2308 SAS2 controller with 8 SAS/SATA ports.

ZFS shouldn't run on top of a hardware RAID controller; that defeats the purpose of ZFS. So the LSI was flashed with the IT firmware, turning the 2308 into a plain HBA.
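
The reflash itself is done with LSI's sas2flash utility from a DOS or EFI shell; a sketch of the usual sequence follows (the firmware file names are examples, use the ones matching this board from Supermicro's download site):

    # check what the controller currently runs
    sas2flash -listall
    # erase the existing (IR) firmware; do not power off before reflashing!
    sas2flash -o -e 6
    # flash the IT firmware, plus the boot ROM if you want to boot off the HBA
    sas2flash -o -f 2308IT.bin -b mptsas2.rom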

The plan was to use RAIDZ2 (the ZFS equivalent of RAID6), which provides redundancy against two simultaneous disk failures. RAIDZ2 with six 4TB disks would give me 16TB (about 14.5TiB) of available space.
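
Creating the pool itself is a one-liner; a minimal sketch, assuming the six drives show up as da0-da5 (FreeBSD device names; the pool name tank is just an example):

    # 6 x 4TB in RAIDZ2: two disks' worth of parity,
    # so usable space is (6 - 2) x 4TB = 16TB
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5
    zpool status tank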

The system could later be extended with six more disks... As this is going to be used as a MythTV storage centre, I estimate that it will reach capacity in just one year (though MythTV handles auto-deleting low-priority recordings perfectly well).
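
For what it's worth, extending the pool later means adding a second RAIDZ2 vdev alongside the first (you can't add disks to an existing RAIDZ vdev); a sketch with hypothetical device names:

    # add a second 6-disk RAIDZ2 vdev; ZFS stripes new writes across both vdevs
    zpool add tank raidz2 da6 da7 da8 da9 da10 da11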

The choice came down to the new Seagate NAS drives or the WD Red. My primary concerns were power consumption and noise: the WD, being 5400 RPM drives, win on power, but the Seagates are a tiny bit quieter. AnandTech's review also found slightly better IOPS on the WD Red; this and the lower power consumption made me go for the WD, the noise difference being minimal.

While I like to fiddle with computers, I'm not as young as I used to be, and as such I wanted something easy to use and configure: so my plan was to use FreeNAS.

FreeNAS is a FreeBSD-based distribution that makes everything configurable through a web interface... It's still not for the absolute noob, and requires a good understanding of the underlying file system: ZFS.

FreeNAS runs off a USB flash drive in read-only mode, and lets you install any of the FreeBSD ports in a jail residing on the ZFS partitions.


Hiccups


My plans became slightly compromised once I put everything together and realised how noisy the setup was. The Supermicro chassis being enterprise-grade, its only concern is keeping everything cool. But damn, that thing is noisy: no way could I ever have this in any room or office.

There's nothing in the motherboard BIOS that lets you change the fan speeds. The IPMI web interface lets you choose a fan speed mode, but the choice ends up being between "Normal", which wouldn't let anyone sleep, and "Heavy", which would for sure wake up the neighbours.

The fans on this motherboard are driven by a Nuvoton NCT6776D. On Linux, the w83627ehf kernel module lets you control most of its PWM outputs, including the fan speeds; unfortunately, I found no equivalent on FreeBSD. So if I were to run FreeBSD, I would have to use an external voltage regulator to slow the fans down: something that doesn't appeal to me.
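
For the record, once w83627ehf is loaded, fan control on Linux comes down to a couple of sysfs writes; a minimal sketch, assuming the Nuvoton chip registered as hwmon2 and the chassis fans sit on the second PWM channel (both vary from machine to machine):

    # load the driver for the Nuvoton super-I/O chip
    modprobe w83627ehf
    # switch pwm2 to manual control (1 = manual; higher values are the
    # chip's automatic modes)
    echo 1 > /sys/class/hwmon/hwmon2/pwm2_enable
    # set the duty cycle: 0-255, so 120 is roughly half speed
    echo 120 > /sys/class/hwmon/hwmon2/pwm2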

This motherboard and chassis also provide an SGPIO interface to control the SAS/SATA backplane and indicate the status of the drives. This is great for identifying which disk is faulty, as you can't always rely on the device name reported by the OS.

However, I connected all my drives to the LSI 2308 controller, and despite my attempts, I couldn't get the front LEDs to show the status of the disks in FreeBSD: something I could easily do under Linux using the sas2ircu utility...
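
For reference, this is how it's done under Linux: sas2ircu addresses a drive by enclosure:slot, which its display command lists (controller 0 and bay 2:5 below are only examples):

    # list every attached drive with its Enclosure# / Slot#
    sas2ircu 0 display
    # blink the locate LED of the drive in enclosure 2, slot 5...
    sas2ircu 0 locate 2:5 ON
    # ...and turn it back off
    sas2ircu 0 locate 2:5 OFF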

I like FreeBSD and have always used it for servers, but its lack of support for hardware gimmicks like this started to annoy me. As part of my involvement in the MythTV project (www.mythtv.org), I had already switched to Linux for all my home servers, and I've grown familiar with it over the years.

A few months ago I would never have considered anything but FreeBSD, since I wanted to use ZFS; however, the ZFS On Linux (ZOL) project recently declared its drivers "ready for production"... So could Linux be the solution?

So FreeBSD or Linux?

I ran various benchmarks; here are the results...


Benchmarks



lz4 compression was enabled on all ZFS partitions.
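
Enabling it is a single property set on the pool's root dataset, which every child dataset then inherits; a minimal sketch, assuming a pool named tank:

    # lz4 is nearly free on a modern CPU, and incompressible blocks are
    # detected early and stored as-is
    zfs set compression=lz4 tank
    # verify every dataset picked it up
    zfs get -r compression tank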

Intel NAS Performance Toolkit (run from a Windows client over Samba on a gigabit link)




All figures in MB/s; the Difference column compares Ubuntu RAIDZ2 to FreeNAS.

Test Name               FreeNAS   Ubuntu RAIDZ2   Difference   Ubuntu md+ext4
HDVideo_1Play            93.402         102.626        9.88%          101.585
HDVideo_2Play            74.331          95.031       27.85%          101.153
HDVideo_4Play            66.395          95.931       44.49%           99.255
HDVideo_1Record         104.369          87.922      -15.76%          208.868
HDVideo_1Play_1Record    63.991          97.156       51.83%           96.807
ContentCreation          10.361          10.433        0.69%           10.734
OfficeProductivity       51.627          56.867       10.15%           11.405
FileCopyToNAS            56.42           50.427      -10.62%           55.226
FileCopyFromNAS          66.868          85.651       28.09%            8.367
DirectoryCopyToNAS        5.692          16.812      195.36%           13.356
DirectoryCopyFromNAS     19.127          27.29        42.68%            0.638
PhotoAlbum               10.403          10.88         4.59%           12.413

Bonnie++
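
The three runs below are bonnie++'s standard throughput and create tests. An invocation along these lines (the dataset path is an example) matches the columns shown: a working set of twice the RAM to defeat caching, and 16 x 1024 files for the create/delete phases:

    # 64GB working set (2x RAM) so the ARC/page cache can't hide the disks;
    # -n 16 means 16*1024 files for the create/read/delete tests
    bonnie++ -d /tank/bench -s 64G -n 16 -u root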

FreeNAS 9.1


Version 1.97          ------Sequential Output------ --Sequential Input- --Random-
                      -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size   K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ports           64G  197997  41 314745 22  65863 46  99515 31  76268  6  12.6   3
Latency             55233us     117ms    4771ms     129ms     760ms     817ms

                      ------Sequential Create------ --------Random Create--------
                      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files    /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   +++++ +++ +++++ +++ +++++ +++ 11537  36 +++++ +++ 22031  71
Latency             16741us      78us     126us     145ms      23us   92942us

Ubuntu - ZOL 0.6.2


Version 1.97          ------Sequential Output------ --Sequential Input- --Random-
                      -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size   K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ubuntu          63G  199991  10 230591 68  65847 74  98981 53  98626  6 445.4  10
Latency             50051us   60474us     326ms   79062us   93147us     133ms

                      ------Sequential Create------ --------Random Create--------
                      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files    /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   +++++ +++ +++++ +++ +++++ +++ 30173  96 +++++ +++ +++++ +++
Latency             20511us     236us     252us   41664us      10us     356us

Ubuntu - MD - EXT4


Version 1.97          ------Sequential Output------ --Sequential Input- --Random-
                      -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size   K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ubuntu          63G  108695  13  74921 11  33601  7  57588 55  75418 15 571.8   7
Latency             18937us     146ms     634ms   23816us   86087us     141ms

                      ------Sequential Create------ --------Random Create--------
                      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files    /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   29852   0 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency               47us     223us     227us      39us      17us      35us

iozone


This NAS will mostly deal with very big files, so let's specifically test those (ext4 performs especially poorly here).
The run was started with iozone -o -c -t 8 -r 128k -s 4G: eight threads, each working on a 4GB file in 128KB records, with synchronous O_SYNC writes (-o) and the cost of close() included in the timings (-c).


Figures are in KB/s, except the Min xfer rows, which are in KB; a dash means the test was not run.

Test                                            FreeNAS      Ubuntu ZOL 0.6.2   Ubuntu md+ext4

Children see throughput for 8 initial writers   38100.27     38065.71           6141.23
Parent sees throughput for 8 initial writers    38096.5      37892.05           6140.56
Min throughput per process                      4762.32      4749.14            767.58
Max throughput per process                      4763.04      4769.23            767.74
Avg throughput per process                      4762.53      4758.21            767.65
Min xfer                                        4193664.00   4176640.00         4193536.00

Children see throughput for 8 rewriters         36189.47     36842.59           5938.99
Parent sees throughput for 8 rewriters          36189.12     36842.26           5938.93
Min throughput per process                      4523.49      4602.7             742.25
Max throughput per process                      4524.1       4609               742.52
Avg throughput per process                      4523.68      4605.32            742.37
Min xfer                                        4193792.00   4188672.00         4192896.00

Children see throughput for 8 readers           4755369.06   4219519.44         541271.02
Parent sees throughput for 8 readers            4743778.25   4219187.22         540861.93
Min throughput per process                      593043.56    527155.69          58198.66
Max throughput per process                      596547.44    527774.56          71080.23
Avg throughput per process                      594421.13    527439.93          67658.88
Min xfer                                        4169728.00   4189696.00         3438592.00

Children see throughput for 8 re-readers        4421317.25   4648511.62         4961596.72
Parent sees throughput for 8 re-readers         4416015.2    4648083.46         4593093.04
Min throughput per process                      539363.12    580726.5           31874.36
Max throughput per process                      558968.81    581585             4619527.5
Avg throughput per process                      552664.66    581063.95          620199.59
Min xfer                                        4048000.00   4188288.00         29184.00

Children see throughput for 8 reverse readers   5082555.84   1929863.77         9773067.07
Parent sees throughput for 8 reverse readers    4955348.81   1898050.51         9441166.11
Min throughput per process                      426778       183929.8           7041.22
Max throughput per process                      991416.19    381972.75          9710407
Avg throughput per process                      635319.48    241232.97          1221633.38
Min xfer                                        1879040.00   2075008.00         3072

Children see throughput for 8 stride readers    561014.62    179665.19          11963150.31
Parent sees throughput for 8 stride readers     559886.81    179420.54          11030888.26
Min throughput per process                      57737.73     19092.56           11788.88
Max throughput per process                      107340.9     36176.44           9238066
Avg throughput per process                      70126.83     22458.15           1495393.79
Min xfer                                        2268288.00   2221312.00         5376

Children see throughput for 8 random readers    209240.16    93627.64           13201790.94
Parent sees throughput for 8 random readers     209234.74    93625.12           12408594.53
Min throughput per process                      25897.38     11702.19           72349.58
Max throughput per process                      27949.81     11704.4            9059793
Avg throughput per process                      26155.02     11703.45           1650223.87
Min xfer                                        3886464.00   4193536.00         34688

Children see throughput for 8 mixed workload    91072.29     24038.27           too slow; cancelled
Parent sees throughput for 8 mixed workload     17608.63     23877.52           -
Min throughput per process                      2305.94      2990.74            -
Max throughput per process                      20461.23     3021.75            -
Avg throughput per process                      11384.04     3004.78            -
Min xfer                                        472704.00    4151296.00         -

Children see throughput for 8 random writers    36929.77     37391.58           -
Parent sees throughput for 8 random writers     36893.05     36944.62           -
Min throughput per process                      4615.37      4643.6             -
Max throughput per process                      4617.82      4704.09            -
Avg throughput per process                      4616.22      4673.95            -
Min xfer                                        4192128.00   4140416.00         -

Children see throughput for 8 pwrite writers    37133.23     36726.14           -
Parent sees throughput for 8 pwrite writers     37131.31     36549.34           -
Min throughput per process                      4641.49      4575.27            -
Max throughput per process                      4641.97      4600.88            -
Avg throughput per process                      4641.65      4590.77            -
Min xfer                                        4193920.00   4171008.00         -

Children see throughput for 8 pread readers     4943880.5    4806370.25         -
Parent sees throughput for 8 pread readers      4942716.33   4805915.05         -
Min throughput per process                      617373.75    595302.62          -
Max throughput per process                      619556.38    603041.88          -
Avg throughput per process                      617985.06    600796.28          -
Min xfer                                        4179968.00   4140544.00         -

phoronix test suite


Results are available here (XML result file here).

A comparison including md+ext4 RAID6 is here.
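
For anyone wanting to reproduce them, the Phoronix Test Suite drives everything from a single command; a sketch, assuming its pts/disk suite (test names may have changed in newer versions):

    # installs the disk suite's dependencies, runs the tests, and offers to
    # upload the results to openbenchmarking.org
    phoronix-test-suite benchmark pts/disk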


Conclusions


Ignoring some of the nonsensical figures in the benchmarks above, which indicate pure cache effects, the ZFS On Linux drivers are doing extremely well, and Ubuntu manages on average to surpass FreeBSD: that was a surprise...

Maybe it's time to port FreeNAS to use Linux as its kernel? That would be a worthy project...