Saturday, March 6, 2010

What's the best pool to build with 3 or 4 disks?

I've been asked many times, in variations, "I just started using EON and I'm new to OpenSolaris. What's the best pool to build with 3 or 4 disks?" I usually answer, it depends! Credit that reflex answer to Prof. Gordon, one of the best Calculus and Differential Equations teachers that walked in my time. May the force be with you, wherever you are!

I'll use Richard Elling's research to explain. Let's say I have 500GB drives, each with IOPS (average small, random, cache-miss read I/O operations per second) = 70.59 and a max media bandwidth of 133 Mbytes/s (includes read and write). What can we build?
RAID Type   Disks   Sets   Storage Space   Performance (IOPS)   Max BW(Mbytes/s)
RAIDZ       4       1      3x500=1500GB    4x70.59/3=94         3x133=399
RAIDZ       3       1      2x500=1000GB    3x70.59/2=106        2x133=266 (1 spare)
STRIPE      4       1      4x500=2000GB    4x70.59=282          4x133=532
STRIPE      3       1      3x500=1500GB    3x70.59=212          3x133=399
MIRROR      2       2      2x500=1000GB    4x70.59=282          4x133=532
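
The arithmetic behind the table can be reproduced with a short Python sketch. The per-drive numbers come from the text above; the function names (`raidz`, `stripe`, `mirror`) are my own, picked for illustration, and the formulas are just the simple model used in the table, not a benchmark of any real pool.

```python
# Per-drive figures quoted in the text.
DISK_GB = 500       # capacity per drive
DISK_IOPS = 70.59   # small, random, cache-miss read IOPS per drive
DISK_MBPS = 133     # max media bandwidth per drive, Mbytes/s


def raidz(disks_per_set, sets=1):
    """Single-parity raidz: one disk's worth of each set goes to parity."""
    data = disks_per_set - 1
    return {
        "space_gb": sets * data * DISK_GB,
        "iops": sets * disks_per_set * DISK_IOPS / data,
        "bw_mbps": sets * data * DISK_MBPS,
    }


def stripe(disks):
    """Plain stripe: every disk adds space, IOPS and bandwidth."""
    return {
        "space_gb": disks * DISK_GB,
        "iops": disks * DISK_IOPS,
        "bw_mbps": disks * DISK_MBPS,
    }


def mirror(disks_per_set, sets=1):
    """Striped mirrors: space of one disk per set, reads served by all disks."""
    total = disks_per_set * sets
    return {
        "space_gb": sets * DISK_GB,
        "iops": total * DISK_IOPS,
        "bw_mbps": total * DISK_MBPS,
    }


layouts = {
    "RAIDZ 4 disks": raidz(4),
    "RAIDZ 3 disks": raidz(3),
    "STRIPE 4 disks": stripe(4),
    "STRIPE 3 disks": stripe(3),
    "MIRROR 2x2 disks": mirror(2, sets=2),
}

for name, r in layouts.items():
    print(f"{name:18} {r['space_gb']:5d}GB {r['iops']:7.0f} IOPS {r['bw_mbps']:5d} Mbytes/s")
```

Passing `sets=2` to `raidz` gives the numbers for a pool grown by one more set.
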
RAIDZ
With 4 disks in the first raidz set, we get higher bandwidth (399 Mbytes/s) vs the 3 disk raidz bandwidth (266 Mbytes/s), but the 3 disk raidz pool has a higher I/O operations per second capability. Note, as "sets" are added to the 3 disk raidz (3 disks each time), its IOPS lead over the 4 disk raidz widens. If you exhaust the usable storage space, each new "Set" costs 4 or 3 drives, respectively, to grow the storage. So the 3 drive raidz has a lower cost per added set. This can be repeated to add more "Sets", that is, more storage and bandwidth, as needed. So this is a very flexible choice. With 1 additional set, the numbers look like this:
RAID Type   Disks   Sets   Storage Space   Performance (IOPS)   Max BW(Mbytes/s)
RAIDZ       4       2      3000GB          188                  798
RAIDZ       3       2      2000GB          212                  532
Both the 4 and 3 disk raidz allow only 1 disk to fail per set, but if all disks have the same probability of failure, the 4 disk raidz pool has a higher probability of a failure than the 3 disk version, simply because it has more disks that can fail.
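
To put rough numbers on that, here is a small Python sketch. It assumes every disk fails independently with the same probability `p` over some window of time, and the value of `P` is purely hypothetical (not a measured failure rate). It ignores spares and resilvering, so treat it as an illustration of the trend only.

```python
def p_any_failure(n, p):
    """Probability that at least one of n disks fails."""
    return 1 - (1 - p) ** n


def p_raidz_set_loss(n, p):
    """Probability that 2 or more of n disks fail, which a
    single-parity raidz set cannot survive."""
    none_fail = (1 - p) ** n
    exactly_one_fails = n * p * (1 - p) ** (n - 1)
    return 1 - (none_fail + exactly_one_fails)


P = 0.05  # hypothetical per-disk failure probability, for illustration only

for n in (3, 4):
    print(f"{n} disk raidz: any failure {p_any_failure(n, P):.4f}, "
          f"set lost {p_raidz_set_loss(n, P):.4f}")
```

Under these assumptions the 4 disk set comes out worse on both counts, whatever value of `p` between 0 and 1 you plug in.
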

STRIPE
Has great bandwidth numbers, usable storage and IOPS, but any disk failure would cause the pool to fail and lose ALL your data. Did I mention that good storage is NOT a substitute for a GOOD backup? This pool is not easily expanded when the usable storage is exhausted and offers no data redundancy.

MIRROR
Has great bandwidth numbers, a higher cost per usable storage and allows failure of up to 2 disks, as long as they are not both in the same mirror set. It has roughly twice the write bandwidth of a single disk and up to 4 times the read performance, as ZFS is capable of reading from all disks in the mirror in parallel. This configuration will most likely provide the best balance of performance and data protection at the expense of disks or usable storage. Expanding or growing this pool when the usable storage is being exhausted is also simple.
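
How many two-disk failures can the 2x2 mirror pool actually take? A quick way to check is to enumerate them. This Python sketch uses made-up disk names (`d0` to `d3`) and the rule that the pool survives as long as every mirror set keeps at least one healthy disk.

```python
from itertools import combinations

# Hypothetical disk names: two 2-way mirror sets, as in the table.
mirrors = [("d0", "d1"), ("d2", "d3")]
disks = [d for m in mirrors for d in m]


def pool_survives(failed):
    """The pool survives if each mirror set still has a healthy disk."""
    return all(any(d not in failed for d in m) for m in mirrors)


pairs = list(combinations(disks, 2))
survivable = [p for p in pairs if pool_survives(set(p))]

print(f"{len(survivable)} of {len(pairs)} two-disk failures are survivable")
for p in pairs:
    print(p, "ok" if pool_survives(set(p)) else "POOL LOST")
```

The pairs that lose the pool are exactly the ones where both failed disks sit in the same mirror set.
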


Hopefully this will help you architect pools that suit your workload, cost dynamics and growth needs.