Not All Gigabytes Are Created Equal – A Storage Buyer’s Warning
“We can sell you a solution today that will provide you with 1 trillion IOPS worth of performance at a price of just a penny a gig!” OK, perhaps that’s a bit of an exaggeration, but the unfortunate truth is that these kinds of claims often make their way into the marketplace. As customers work hard to compare and contrast the varying marketing pitches from different companies, it becomes next to impossible to get a real, apples-to-apples look at what’s going on under the hood.
When $/GB Is Not Really $/GB
Price is always at or near the top of mind when it comes to any significant purchase, and storage is no exception. As such, it’s natural to ask for a ballpark overall price of a solution, if for no other reason than to avoid wasting a vendor’s time if the cost is simply out of reach. Very often, when asked about storage pricing, vendors will respond with something along the lines of, “Oh, we can get to $x to $y per gig.” Those who aren’t storage-savvy will go on their merry way with that figure in mind and might actually use it in comparisons against other storage providers.
The price tag that you were quoted may be very, very far from reality and may be all but useless in some comparisons. It’s important to be very aware of exactly what the vendor is saying when talking about cost. If they use the phrase “per usable gigabyte” anywhere, that means that the price being quoted is achieved only after some other action has taken place. For example, the vendor may be quoting you a capacity price that is possible only after applying “normal” data reduction ratios. If the vendor’s customers regularly see a 5:1 data reduction through such features as data deduplication and compression, that $/GB figure might factor in that ratio. As such, the “raw” price per GB might be $5x/GB, but the vendor is quoting $x/GB. The problem: Data reduction does not happen evenly across all customers, so the average value that the vendor uses will not apply to all scenarios and could lead customers to buy too little (or too much) storage capacity for their needs.
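The arithmetic behind that gap is simple enough to sketch. The following is a minimal illustration, not any vendor's pricing model; the function names, the $5.00 raw price, the 50,000 GB data set, and the 2:1 "poor case" ratio are all hypothetical values chosen for the example.

```python
def effective_price_per_gb(raw_price_per_gb, reduction_ratio):
    """Price per GB of stored data once a reduction ratio (5.0 means 5:1) is assumed."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_price_per_gb / reduction_ratio


def raw_gb_needed(logical_gb, reduction_ratio):
    """Raw capacity required to hold logical_gb of data at the given reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return logical_gb / reduction_ratio


# Hypothetical figures: $5.00 per raw GB and 50,000 GB of data to store.
quoted = effective_price_per_gb(5.00, 5.0)  # the vendor's assumed 5:1 average
actual = effective_price_per_gb(5.00, 2.0)  # a workload that only reduces 2:1

print(f"quoted ${quoted:.2f}/GB, actual ${actual:.2f}/GB")
print(f"raw GB needed at 5:1: {raw_gb_needed(50_000, 5.0):,.0f}")
print(f"raw GB needed at 2:1: {raw_gb_needed(50_000, 2.0):,.0f}")
```

With these made-up numbers, a buyer who sized the array on the 5:1 average but only achieved 2:1 would need two and a half times the raw capacity they purchased.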
There are some key questions that buyers must ask storage vendors when considering the capacity side of the storage equation:
- What is the total RAW capacity of the storage device? That is, how much storage is there in the device before applying any kind of overhead or ratio? If the device has ten 1 TB hard disks, that would be a 10 TB storage array.
- What is the typical data reduction seen from customers for different kinds of workloads? For example, if customer A plans to use a new all-flash array for virtual desktop usage, he will see a vastly different data reduction ratio than customer B, who plans to use it for general file storage.
- Does advertised data capacity include any high availability mechanisms that are necessary?
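To see why the raw capacity and high availability questions matter, here is a rough sketch of how protection overhead eats into the ten-disk, 10 TB example above. It assumes the availability mechanism is a common RAID layout and ignores hot spares, formatting overhead, and vendor-specific reserves; the `usable_tb` function and the scheme names are hypothetical, for illustration only.

```python
def usable_tb(disk_count, disk_tb, scheme):
    """Approximate usable capacity for a single RAID group (simplified)."""
    raw = disk_count * disk_tb
    if scheme == "none":
        return raw                          # no protection at all
    if scheme == "raid10":
        return raw / 2                      # every block is mirrored
    if scheme == "raid5":
        return (disk_count - 1) * disk_tb   # one disk's worth of parity
    if scheme == "raid6":
        return (disk_count - 2) * disk_tb   # two disks' worth of parity
    raise ValueError(f"unknown scheme: {scheme}")


# The ten 1 TB disks from the list above, under different layouts.
for scheme in ("none", "raid5", "raid6", "raid10"):
    print(f"{scheme:>6}: {usable_tb(10, 1.0, scheme):.1f} TB usable of 10.0 TB raw")
```

A price quoted against the 10 TB raw figure and a price quoted against the 5 TB mirrored figure describe the same box, which is why it is worth asking which one the vendor means.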
How Vendors Can Help
On the vendor front, some have taken a more proactive approach than others and have provided details for how they arrive at their per-GB pricing. However, I propose that all vendors should adhere to providing certain minimum details so that customers can more easily compare and contrast solutions. Such information should include:
- Raw storage capacity.
- Average deduplicated capacity (not best case).
- After-reduction $/GB or capacity.
- Clear and consistent identification of each of these metrics to make comparisons easier.
My IOPS Is Bigger Than Your IOPS!
Of course, capacity is just one part of the storage equation. In fact, for some buyers, overall capacity might even be a secondary consideration, as is often the case with customers that have a true need for all-flash storage arrays. If anyone thought that games could be played with the $/GB metric, just wait until the sheer lunacy of some performance claims is seen.
I’ve seen these kinds of claims and, while some are real, it’s the rest of the sentence that is really important. “We can do a million IOPS… under these very specific conditions.” In fact, the issue of comparing storage performance metrics is far from new. Back in the mid to late 90s, this was recognized as a significant issue, which led to the creation of the Storage Performance Council, which has the following mission:
The Storage Performance Council (SPC) is a non-profit corporation founded to define, standardize, and promote storage subsystem benchmarks as well as to disseminate objective, verifiable performance data to the computer industry and its customers.
Although the vendors that submit their equipment to be tested are in complete control of the configuration of the hardware, the result is usable in comparisons because the SPC-1 benchmark requires so much transparency from said vendors that comparisons become much easier. On the downside, if vendors don’t feel that their gear would hold up or if they don’t want to spend the money to test and validate, there will be no SPC-based performance comparisons. Further, SPC-1 benchmarks are single-configuration validations, and results could vary wildly if that configuration is changed even just a little bit. Here’s a bit more about the SPC-1 benchmark and its value to customers and the industry.
I’m not going to say that comparing vendor performance claims absent a neutral third party is an easy feat. To make comparisons a bit easier, ask vendors how they arrived at certain benchmarks:
- What was the read/write ratio? Workloads that heavily favor reads will tend to perform better in most circumstances.
- What RAID level is in use? Different RAID levels have very different impacts on storage performance, particularly for writes.
- What is the block size in use? Different block sizes will have a major impact on overall storage performance metrics.
- What kinds of disks are being used? Is this just HDD, just SSD, or a combination of both?
- What kind of caching and acceleration is taking place under the hood?
- Is the measured configuration one that could be considered “real world” in a customer environment?
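Two of the questions above, block size and read/write ratio combined with RAID level, lend themselves to a quick back-of-the-envelope check. The sketch below uses the commonly cited rule-of-thumb RAID write penalties (2 for RAID 10, 4 for RAID 5, 6 for RAID 6); these are approximations that ignore caching and full-stripe writes, and the function names and sample numbers are hypothetical rather than taken from any benchmark.

```python
# Rule-of-thumb back-end I/Os generated per front-end write (an approximation).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def throughput_mib_s(iops, block_kib):
    """Bandwidth implied by an IOPS figure at a given block size."""
    return iops * block_kib / 1024


def frontend_iops(backend_iops, read_fraction, raid_level):
    """Host-visible IOPS a set of disks can sustain for a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    penalty = WRITE_PENALTY[raid_level]
    return backend_iops / (read_fraction + write_fraction * penalty)


# "A million IOPS" at 4 KiB blocks is a very different claim than at 64 KiB.
print(f"1M IOPS @ 4 KiB  = {throughput_mib_s(1_000_000, 4):,.2f} MiB/s")
print(f"1M IOPS @ 64 KiB = {throughput_mib_s(1_000_000, 64):,.2f} MiB/s")

# The same hypothetical 10,000 back-end IOPS under different mixes and layouts.
for level in ("raid10", "raid5", "raid6"):
    all_reads = frontend_iops(10_000, 1.0, level)
    mixed = frontend_iops(10_000, 0.5, level)
    print(f"{level}: {all_reads:,.0f} IOPS at 100% read, {mixed:,.0f} at 50/50")
```

The point is not the specific figures but that a headline IOPS number means little without the block size, read/write mix, and RAID level that produced it.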
If you’re looking at vendors that participate in SPC benchmarks, carefully review the submitted configurations and understand that these results are third-party validated.
The best way that you can measure overall storage performance is to obtain a trial unit and run your own benchmarks where possible.
If you’re in the market for storage, don’t fall victim to misleading marketing! Not all vendors are guilty of this sin and there are many out there that take great pains to make sure that customers know exactly how specific metrics were derived. That said, always look for ways to independently validate vendor claims so that you don’t end up getting burned in the long run.