A number of years back I saw a mail thread about Java performance. One person claimed that Java was slow: if you wanted performant apps then you should write them in a different language. This matched my experience, so I felt a little confirmation bias. However, the reply really made me think; that may have been true 10 years ago (and, indeed, it had been over 10 years since I'd done any real Java), but modern techniques meant that Java was perfectly fast. The kicker: instead of believing rumour and word-of-mouth results from ancient tests, do the tests yourself; measure and get the facts.
My home server disks
My home machine has mutated a few times over the years, and suffers some technical debt as a result.
Currently the disks are connected to a couple of SAS9211-8i controllers. In theory these are PCIe 8x controllers, but my motherboard has a 16x slot and a 4x slot, so one card isn't being used to its full potential. I'm using the controllers in JBOD mode, with Linux mdraid to create the various RAID options (a rough sketch of that sort of mdadm setup follows the list below). I prefer this because it means I'm not tied to specific hardware and could migrate the disks to another setup entirely, and the kernel should autodetect and bring up the RAID arrays automatically.
The setup is currently on a CentOS 6 based system, with the disks distributed over the two cards:
- 16x slot:
  - 2 Crucial CT512MX1 SSDs in a RAID 1
  - 4 Seagate ST2000 in a RAID 10
- 4x slot:
  - 8 Seagate ST4000 in a RAID 6
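The mdadm side of that is roughly as follows. This is only a sketch: the device names and array numbers are placeholders rather than the exact commands used on this machine.

```
# Illustrative only: the device names below are placeholders, not the
# actual drives in this machine.

# SSD mirror (RAID 1), 4x 2TB (RAID 10) and 8x 4TB (RAID 6):
mdadm --create /dev/md0 --level=1  --raid-devices=2 /dev/sda /dev/sdb
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[cdef]
mdadm --create /dev/md2 --level=6  --raid-devices=8 /dev/sd[ghijklmn]

# Record the arrays so they're assembled by name at boot, and check status:
mdadm --detail --scan >> /etc/mdadm.conf
cat /proc/mdstat
```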
Now I’d been taking it as an article of faith that the SSDs would be fastest, with the RAID 10 being better than the RAID 6 for writes, and the RAID 6 being better for reads.
When I built this I ran some hdparm -t tests on the RAID volumes and was surprised how well the RAID 6 performed:
- The two SSDs are in a RAID 1 and get 375 MB/sec
- The 4*2TB disks are in a RAID 10 and get 267 MB/sec
- The 8*4TB disks are in a RAID 6 and get 532 MB/sec
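For reference, those numbers came from something like the following; the md device names are assumptions (they depend on how the arrays were assembled), not a record of the exact devices.

```
# Buffered sequential-read timing against each md device.
# /dev/md0 etc. are placeholders for the actual array devices.
hdparm -t /dev/md0   # RAID 1 (2x SSD)
hdparm -t /dev/md1   # RAID 10 (4x 2TB)
hdparm -t /dev/md2   # RAID 6 (8x 4TB)
```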
This made me wonder how well they'd work in a more targeted test. Since EPEL has bonnie++ in the repository, I chose that.
For each of the RAID volumes I picked an existing ext3 filesystem to run the tests on. This should give some form of "real world" feel, since it wouldn't be on newly created filesystems.
I ran each test twice.
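For anyone wanting to reproduce this, the runs were along these lines; the mount points and file size here are illustrative assumptions rather than a record of exactly what I ran.

```
# bonnie++ comes from EPEL on CentOS 6.
yum install bonnie++

# One run per array; the directories are placeholders and the file size is
# an assumption (bonnie++ wants it to be at least twice the machine's RAM).
# -d: directory to test in, -s: file size in MB, -u: user to run as when root.
bonnie++ -d /raid1/bonnie  -s 32768 -u nobody
bonnie++ -d /raid10/bonnie -s 32768 -u nobody
bonnie++ -d /raid6/bonnie  -s 32768 -u nobody
```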
The results (slightly edited for formatting) make interesting reading:
              ------Sequential Output------   --Sequential Input-  --Random-
            -Per Chr-  --Block--  -Rewrite-  -Per Chr-  --Block--  --Seeks--
Machine     K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP   /sec %CP
Raid6         800  92 108776  14 103820  11   3276  76 606148  23  170.3   8
Raid6         784  91 108436  13 105046  11   3373  76 606068  23  172.2   8
Raid10        753  94 175905  21  91956   9   2700  76 227793  10  409.2  15
Raid10        745  91 170551  20  93408   9   3209  78 229931  10  531.6   7
Raid1 SSD     786  96 182078  21  97995  10   3685  84 236123  10  460.2  17
Raid1 SSD     796  95 173306  20  98546  10   3012  75 233678  11  470.9  16
What seems clear is that the ability to distribute load over 8 disks gives a clear speed advantage when reading data blocks, but at a CPU cost. Sequential character reads don't differ much between the arrays, so there we may be seeing the limits of the kernel, CPU and memory rather than of disk I/O. RAID 6 definitely loses on random seeks, though!
More interesting are the writes. RAID 6 is definitely slower, but surprisingly the SSDs weren't noticeably faster than the spinning 2TB disks. These SSDs are meant to be able to do 500 MB/s, yet I appear to get real-world speeds of around 180 MB/s, only a little faster than the 2TB Seagates (and those values seem reasonable, looking at this chart).
And why are the rewrites tipped the other way?
At this point I'd almost consider moving all my data onto the RAID 6, because I think that would give the most balanced performance! However, performance isn't the only design criterion. There's also data risk (if a 4TB disk failed, would the array survive long enough to rebuild the lost disk?) and data separation: I could turn off the RAID 6 and still keep 95% of my functionality, with only long-term storage unavailable, so I could run on a smaller machine if this motherboard died.
Summary
This "get the facts" approach can apply to many things. It becomes tempting to measure everything and apply changes based on those numbers. But we can see, from this simple disk speed test, that the numbers may not be so clear-cut, and that what you measure can give you different results. If I only measured "block read" speed then I would just put everything on the RAID 6 disks; they're so much quicker! But if I care about writes then maybe the RAID 10 is better. Would I even care about the SSDs? And there are other factors (resiliency, recovery) to take into account. Over-optimising for one factor isn't always the best idea, either!
Technology changes; configurations can have impacts; optimisations (such as JIT bytecode compilers) can add a whole new dimension. What was an article of faith 10 years ago may now hinder you and cause you to build slower, more complicated solutions.
If possible, you should try out the various options early in the design process and pick the one that gives the best results for your use case. You need the facts in order to make the best decision. But don't over-optimise at the cost of other factors. Decisions are not made in a vacuum.