Three Dimensions of Storage Sizing & Design – Part 4: Throughput
In this fourth post of the multi-post series Three Dimensions of Storage Sizing & Design, we dive deeper into the dimension Use, and specifically into the application workload characteristic Throughput. Depending on the application workload requirements, you will need to size the storage system for bandwidth. So let us dive deeper into MB/s.
What is Throughput?
Throughput, measured in megabytes per second (MB/s), defines the data transfer rate. It is often called bandwidth, but there is a distinct difference between the two. Bandwidth is the raw capability of a channel to move data. Throughput is how much data actually travels through the channel successfully. Several factors, including latency, the protocol used and packet size, can limit throughput.
For example, take a 4-lane freeway between Source City and Destination City with tollgates placed on the route. Two tollgates are open and each can let one car through per second, so the bandwidth is 2 cars per second; when more tollgates are opened, the bandwidth increases. People can pay with cash, credit card or electronic toll collection (ETC). If everyone uses ETC, a car passes each tollgate every second, so the throughput is also 2. Cash and credit card payments take longer and reduce the number of cars that can pass the tollgates per second to 1.5 or even 1, so the effective throughput is 1 while the bandwidth is still 2.
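The tollgate arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the 2-second payment time for cash are assumptions chosen to reproduce the numbers in the example, not measured values.

```python
def tollgate_throughput(open_gates: int, seconds_per_car: float) -> float:
    """Cars per second that actually pass the open tollgates."""
    if open_gates < 0 or seconds_per_car <= 0:
        raise ValueError("open_gates must be >= 0 and seconds_per_car > 0")
    return open_gates / seconds_per_car


# Bandwidth: the best case, every gate handles one car per second.
bandwidth = tollgate_throughput(open_gates=2, seconds_per_car=1.0)

# Throughput with ETC: payment adds no delay, so it equals the bandwidth.
etc_throughput = tollgate_throughput(open_gates=2, seconds_per_car=1.0)

# Throughput with cash: assumed 2 seconds per car (hypothetical value).
cash_throughput = tollgate_throughput(open_gates=2, seconds_per_car=2.0)

print(bandwidth, etc_throughput, cash_throughput)
```

Opening more gates raises the bandwidth, but the throughput only follows if the payment method keeps up.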
The I/O data path
The maximum throughput depends on the complete I/O data path the data travels through. It is like an hourglass, where the choke point determines the maximum throughput for the whole path.
To size and design storage for throughput, you have to look at the whole I/O path and determine where the choke points are. The choke point with the smallest bandwidth determines the maximum throughput for the whole path. Let's take a look at a highly simplified I/O data path.
In this example there are roughly two categories, compute devices and communication connections.
(A, B, C) – Compute devices, which have memory and CPU power on board to process incoming and outgoing data. The network equipment for the client network and the storage network also belongs to this category.
(1, 2, 3, 4, 5) – Communication connections between two devices.
In the I/O data path, the limiting factor for bandwidth is almost always (roughly 99% of the time) the communication connection. CPU and memory can process data much faster than the communication connections can deliver it.
The Central Processing Unit (CPU), in a compute device, interprets and executes instructions. How fast a CPU works depends partly on the clock speed, which is the speed at which the CPU executes instructions. The faster the clock, the more instructions the CPU can execute per second.
Clock speed is measured in hertz, usually gigahertz (GHz). For example, a 1.80 GHz CPU has a clock that ticks 1.8 billion (1,800,000,000) times each second. However, the CPU is bound by the memory in use and by the maximum memory bandwidth it can address. In the technical specifications of a CPU you will find a metric called Max Memory Bandwidth.
For example, an Intel Xeon Processor E5-2630L v4 has a Max Memory Bandwidth of 68.3 GB/s, whereas an average PC3-12800 DDR3 SDRAM memory module has a maximum bandwidth of 12.8 GB/s. Compare these numbers with a 10 Gbps Ethernet network connection, which has a maximum bandwidth of 1.25 GB/s, and you will see that the bottleneck is the communication channel.
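Finding the choke point comes down to taking the minimum bandwidth along the path. The sketch below uses the three figures from the comparison above; the function name and the way the path is modelled as a simple dictionary are assumptions for illustration only.

```python
from typing import Mapping, Tuple


def find_bottleneck(path_gb_per_s: Mapping[str, float]) -> Tuple[str, float]:
    """Return the component with the lowest bandwidth and that bandwidth in GB/s."""
    if not path_gb_per_s:
        raise ValueError("the path needs at least one component")
    name = min(path_gb_per_s, key=lambda component: path_gb_per_s[component])
    return name, path_gb_per_s[name]


# Bandwidth per component in GB/s, taken from the example in the text.
path = {
    "CPU max memory bandwidth (Xeon E5-2630L v4)": 68.3,
    "PC3-12800 DDR3 memory module": 12.8,
    "10 Gbps Ethernet link": 10 / 8,  # 10 gigabits per second = 1.25 GB/s
}

component, limit = find_bottleneck(path)
print(f"Choke point: {component} at {limit} GB/s")
```

A real I/O path has more hops than this, but the principle stays the same: the smallest number sets the ceiling for the whole path.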
Bytes or bits per second?
Data transfer rate or throughput is a speed, usually measured in bits or bytes per second. Mixing the two up is often the root cause of miscalculations and wrong expectations. For CPU and memory you will see a capital B, which means bytes per second. For network connections you will see, for example, 1 Gb/s with a lowercase b, which means gigabits per second. So 1 Gb/s = 1000 megabits per second = 1000/8 = 125 MB/s or 0.125 GB/s.
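The bits-to-bytes conversion is easy to get wrong by hand, so a small helper can be useful. This is a minimal sketch with a made-up function name. It assumes decimal prefixes (1 Gb = 1000 Mb) and the raw line rate, and it ignores protocol overhead.

```python
def gbit_to_mbyte_per_s(gigabits_per_second: float) -> float:
    """Convert a link speed in Gb/s (bits) to MB/s (bytes), decimal prefixes."""
    megabits_per_second = gigabits_per_second * 1000
    return megabits_per_second / 8  # 8 bits in a byte


print(gbit_to_mbyte_per_s(1))   # 1 Gb/s link
print(gbit_to_mbyte_per_s(10))  # 10 Gb/s link
```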
Numbers to work with
For design and sizing the following numbers around SAS and Network connections can be handy to know.
The throughput you can get from disks in a JBOD depends on the protection level (RAID) and on how the disks are accessed (random or sequential). If you look up the manufacturer's specifications, use 50% of the stated figure as the sustainable throughput. Disclaimer: this is a rule of thumb for sizing and calculation purposes only. Caching, latency, queue depth, protection levels and new architectures will influence the numbers and can even increase the throughput you can achieve.
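The 50% rule of thumb can be written down as a tiny calculation. The sketch below rests on several assumptions: the function name is made up, the 200 MB/s specification and the 12 disks are hypothetical example values, and the aggregate simply scales linearly with the number of disks without modelling RAID overhead or access pattern.

```python
def sustainable_throughput_mb_s(
    spec_mb_s: float, disks: int = 1, factor: float = 0.5
) -> float:
    """Rule-of-thumb sustainable throughput in MB/s for a set of disks.

    spec_mb_s: per-disk throughput from the manufacturer's specification.
    factor:    fraction of the specification assumed to be sustainable.
    """
    if spec_mb_s < 0 or disks < 0 or not 0 <= factor <= 1:
        raise ValueError("spec_mb_s and disks must be >= 0, factor within 0..1")
    return spec_mb_s * disks * factor


# Hypothetical shelf: 12 disks, each specified at 200 MB/s.
estimate = sustainable_throughput_mb_s(spec_mb_s=200, disks=12)
print(f"Estimated sustainable throughput: {estimate} MB/s")
```

Treat the result as a starting point for sizing, and compare it with the bandwidth of the connection the shelf is attached to.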
Bandwidth is the maximum amount of data that can travel through a ‘channel’. Throughput is how much data actually travels through the ‘channel’ successfully. Throughput can be limited by many different things, including latency and the protocol you are using.
The bandwidth is determined by the properties of the link itself. Latency is a function of how long it takes the data to get sent all the way from the start point to the end, the processing time at the destination and the time to send back a response.
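To get a feeling for how latency holds throughput below bandwidth, consider a deliberately simple model: a single request is outstanding at any time, and each request costs one round trip plus the time to push the block over the link. The model, the function name and the example values (64 KiB blocks, 1 ms round trip) are assumptions for illustration. Real protocols keep multiple requests in flight and will do better.

```python
def effective_throughput_bytes_s(
    block_bytes: int, bandwidth_bytes_s: float, round_trip_s: float
) -> float:
    """Throughput in bytes/s when only one request is in flight at a time."""
    if block_bytes <= 0 or bandwidth_bytes_s <= 0 or round_trip_s < 0:
        raise ValueError("block size and bandwidth must be > 0, latency >= 0")
    transfer_time_s = block_bytes / bandwidth_bytes_s
    return block_bytes / (round_trip_s + transfer_time_s)


link_bytes_s = 1.25e9  # 10 Gbps Ethernet expressed in bytes per second
block = 64 * 1024      # 64 KiB per request (hypothetical)

no_latency = effective_throughput_bytes_s(block, link_bytes_s, 0.0)
with_latency = effective_throughput_bytes_s(block, link_bytes_s, 0.001)

print(f"Without latency: {no_latency / 1e6:.1f} MB/s")
print(f"With 1 ms round trip: {with_latency / 1e6:.1f} MB/s")
```

In this model the link bandwidth does not change, yet a single millisecond of round-trip time cuts the achievable throughput to a small fraction of it.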
In the next part of Three Dimensions of Storage Sizing & Design we will dive deeper into the workload characteristic Response, also called latency (ms).
Interesting resources if you want to dive deeper and get to know the background:
- How a CPU works – blog post
- Everything you need to know about memory architectures – blog post
- List of device bit rates for many devices and communication channels – Wikipedia
- Memory bandwidth – Wikipedia
- Factors Affecting Processor Speed – Course
- Serial Attached SCSI – Wikipedia
Other articles in the series Three Dimensions of Storage Sizing & Design:
- Three Dimensions of Storage Sizing & Design – Part 1: Overview
- Three Dimensions of Storage Sizing & Design – Part 2: Workloads
- Three Dimensions of Storage Sizing & Design – Part 3: Speed
- Three Dimensions of Storage Sizing & Design – Part 4: Throughput