A number of people have asked me over time, “What’s the process or methodology you use in sizing a NetApp storage array?” There are really two ways I can answer that question. In most cases, I don’t have time to sit down and fully explain how NetApp utilizes disk capacity, so I just resort to showing them NetApp’s Synergy tool or the NetApp sizers that are widely available on NetApp TechNet. Ultimately, I would really love for everyone to understand how NetApp takes third-party disks from Western Digital, Seagate, Hitachi, etc. and arrives at the usable capacity available in an aggregate. All of that being said, capacity is not the only concern here. There’s much more that goes into sizing a NetApp storage array including, but not limited to, IOPS, disk throughput, and the ratio of read to write performance. So, I’ve decided to sit down and write a series of articles detailing how all of this is done. Hopefully you’ll find it useful.
I’ll start with the most obvious component of this here in part 1 of this series: Capacity.
When you buy a hard disk drive from NewEgg, BestBuy, MicroCenter, Fry’s, etc., you’re buying disks that are advertised as 500GB, 1TB, 2TB, 3TB, etc. That being said, we all know that once we take that drive home, we’re not actually going to get the full capacity that is advertised on the box. Why is that? Well, disk drive manufacturers are really good at marketing. 😉 Computers use base-2 numbers for EVERYTHING. Somewhere along the way, disk manufacturers decided to start using base-10 when it came to marketing the capacity of their disk drives. For example, a 1TB SATA drive is not really 1024GB in size. It’s actually 1,000,000,000,000 bytes. When you do the math, 1,000,000,000,000 bytes converts to 931.32GB (1,000,000,000,000 / 1024 / 1024 / 1024). This affects ALL drives no matter whose array you’re using (NetApp, EMC, HP, IBM, etc.).
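The base-10 to base-2 conversion above is just a repeated division by 1024. A quick sketch of the math:

```python
def marketed_to_binary_gb(marketed_bytes):
    """Convert a marketed (base-10) drive capacity in bytes to base-2 GB."""
    return marketed_bytes / 1024 / 1024 / 1024

# A "1TB" SATA drive is marketed as 1,000,000,000,000 bytes.
print(round(marketed_to_binary_gb(1_000_000_000_000), 2))  # 931.32
```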
The second part of this can begin to get a little confusing. It has to do with error-correction codes and their use on enterprise-class disk arrays such as NetApp. NetApp uses two types of disk drives – SAS disks and SATA disks. The manufacturers of these disks typically create SAS disks for business use and SATA disks for both business and home user (consumer) use. Given that fact, they include some additional functionality on SAS disks to allow enterprise-class disk subsystems to continuously validate the consistency of data on disk. They do this through something called a checksum (nothing more than a hash value that they can reference to validate data consistency). That checksum, on SAS disks, is stored in the same sector as the data (oversimplifying here, but this is close). The sector size on SAS disks is large enough to incorporate this (520 bytes per sector) and, as such, you really don’t see a reduction in capacity on those drives after doing the base-10 to base-2 sizing conversion. SATA disks are where it gets a bit fuzzy. Home users don’t necessarily need checksums, so disk manufacturers don’t include that extra space within the sectors to accommodate checksums (512 bytes per sector). Well, what do enterprise-class storage vendors do in order to use checksums on SATA disks? They use up an extra 9th sector for every 8 sectors used! This eats away at SATA usable capacity much faster than SAS capacity. When using SATA disks, you have to take the size you get from the base-10 to base-2 sizing conversion and multiply that by 8/9. Back to that 1TB SATA disk that we were using as an example: after base-10 to base-2 conversion we were left with 931.32GB. After checksum allocation, we’re left with 827.8GB of usable capacity.
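The 8/9 block-checksum penalty for SATA drives is a single multiplication. Continuing the 1TB example:

```python
def sata_post_checksum_gb(binary_gb):
    """Apply the block-checksum penalty for 512-byte-sector SATA drives:
    every 9th sector holds checksums, so only 8 of 9 sectors hold data."""
    return binary_gb * 8 / 9

# The 1TB SATA drive: 931.32 GB after base-2 conversion.
print(round(sata_post_checksum_gb(931.32), 1))  # 827.8
```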
After we account for overhead in marketed capacity and possible overhead for block-level checksums, the last thing to account for is NetApp’s implementation of WAFL. WAFL, NetApp’s proprietary filesystem used on all FAS-based disk arrays, requires some overhead to allow for efficiencies such as dedupe, compression, thin provisioning, and write performance. To do this, WAFL reserves 10% of usable disk space for its own use. Again, using our 1TB SATA drive example, the total remaining usable capacity available after WAFL reservations comes to 745GB.
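Putting all three steps together for the 1TB SATA example, the full chain looks like this:

```python
def usable_after_wafl_gb(marketed_bytes, sata=True):
    """Marketed bytes -> base-2 GB -> (SATA checksum penalty) -> minus 10% WAFL reserve."""
    gb = marketed_bytes / 1024 / 1024 / 1024  # de-marketed (base-2) capacity
    if sata:
        gb *= 8 / 9                           # block-level checksum sectors
    return gb * 0.9                           # WAFL reserves 10%

# The 1TB SATA drive: 931.32 GB -> 827.8 GB -> ~745 GB usable.
print(round(usable_after_wafl_gb(1_000_000_000_000)))  # 745
```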
You might be thinking, “That’s a lot of consumed capacity even before I get to put my data on those disks!” Yes, it is, but the kicker is in the efficiencies created by Data ONTAP and WAFL after-the-fact. Dedupe, compression, thin provisioning, etc. will typically create greater gains in usable capacity than the capacity taken away by the process detailed above.
Here’s a quick table of common drive sizes, de-marketed capacity, post block-level checksum allocation, and post-WAFL reservations:
| Marketed Capacity | De-Marketed Capacity | Post Block-level Checksum | Post WAFL Reservation |
|---|---|---|---|
| 300GB SAS | 279.4 GB | N/A | 251.5 GB |
| 450GB SAS | 419.1 GB | N/A | 377.2 GB |
| 600GB SAS | 558.8 GB | N/A | 502.9 GB |
| 1TB SATA | 931.3 GB | 827.8 GB | 745 GB |
| 2TB SATA | 1,862.6 GB | 1,655.7 GB | 1,490.1 GB |
| 3TB SATA | 2,794 GB | 2,483.5 GB | 2,235.2 GB |
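If you want to sanity-check the table yourself, the whole thing can be reproduced with a few lines (SAS drives skip the checksum penalty because their 520-byte sectors absorb it):

```python
GIB = 1024 ** 3

def usable_gb(marketed_bytes, sata):
    gb = marketed_bytes / GIB  # de-marketed (base-2) capacity
    if sata:
        gb *= 8 / 9            # block-level checksum penalty (SATA only)
    return gb * 0.9            # WAFL 10% reservation

drives = [
    ("300GB SAS", 300e9, False),
    ("450GB SAS", 450e9, False),
    ("600GB SAS", 600e9, False),
    ("1TB SATA", 1e12, True),
    ("2TB SATA", 2e12, True),
    ("3TB SATA", 3e12, True),
]
for name, size, sata in drives:
    print(f"{name}: {usable_gb(size, sata):,.1f} GB usable")
```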
In Part 2 of this series I’ll discuss sizing for performance and the “art-based-upon-science” aspects of trying to determine how an array will react under certain workloads.