A practitioner's way to judge utilization numbers

Cary Millsap has written a paper, Thinking Clearly About Performance, in which he presents a collection of fundamental principles of performance problem solving. The paper covers many of the key concepts one needs to understand to be successful at performance monitoring and tuning. In my opinion, the sections about queueing delay and the M/M/m model contain valuable information that every performance practitioner should be aware of.

The M/M/m model describes mathematically how a perfect system with m identical service channels (e.g. CPUs) behaves. The model is interesting because it captures one of the key performance pitfalls: the utilization at which performance starts to degrade.

Let’s look at a specific example and see how the model describes a single disk drive. The following graph shows the data for a SAS drive with a 3.4 ms access time and a SATA drive with an 8.5 ms access time. The model assumes a random access pattern and shows the total latency (wait/queueing time + access time) at different loads.


You can see that as demand increases, the latency grows faster and faster. It is probably obvious that you don’t want to run at such a high latency when you have an application with high performance requirements. What makes the model so useful is that the shape of the curve depends only on the access time given in the disk specification.
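The curve for a single drive can be sketched with the single-channel case of the model (M/M/1), where the total latency is the access time divided by the remaining capacity. This is a minimal sketch, assuming an open queue with the drive parameters from the text; the function name is mine, not from the paper.

```python
# Sketch: total latency of a single disk modeled as an M/M/1 queue.
# R = S / (1 - rho), where S is the service (access) time and
# rho = iops * S is the utilization. Only valid while rho < 1.

def mm1_latency_ms(access_time_ms, iops):
    """Total latency (queueing + access) for a single service channel."""
    rho = iops * access_time_ms / 1000.0  # utilization as a fraction of capacity
    if rho >= 1.0:
        raise ValueError("arrival rate exceeds capacity; the queue grows without bound")
    return access_time_ms / (1.0 - rho)

# SAS drive, 3.4 ms access time, driven at 150 IOPS (about 51% utilization):
print(round(mm1_latency_ms(3.4, 150), 2))  # ~6.94 ms, roughly double the access time
```

At 150 IOPS the SAS drive already spends about as much time queueing as it does serving the request, which is exactly the region of the curve discussed above.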

In his paper, Cary Millsap does the math the other way round: he gives the utilization at which the trouble starts, commonly called the knee of the curve. There is no rigorous definition of the knee in the paper, and I have seen discussions with differing opinions on how to define it precisely. From a practical point of view, however, that may not really be necessary. Let’s just take the numbers as a rule of thumb for where performance degradation is likely to happen.

Cary Millsap has calculated the following values:

Service channels    Knee utilization
  1                 50%
  2                 57%
  4                 66%
  8                 74%
 16                 81%
 32                 86%
 64                 89%
128                 92%

A single disk drive obviously has only one service channel, so according to the table the utilization should stay below 50%. A single channel with a 3.4 ms access time can serve at most about 294 requests per second (1000/3.4), so in my example 50% utilization maps to roughly 150 IOPS for the SAS drive and 60 IOPS for the SATA drive.

Models with multiple service channels apply to CPUs, for example. From the table we can therefore conclude that a server with 4 cores hits this limit at about 66% utilization.

Just remember that the numbers should be treated as a rule of thumb: the M/M/m model describes a perfect system, while real hardware and software are seldom perfect. The real-world numbers are therefore probably lower than the ones you find in the table. Nevertheless, I consider them helpful when you want to know how much utilization a system can handle without performance degradation.