Little Intuition

2021-10-08

This is the 60th anniversary of the proof of Little’s Law, a widely useful rule for queuing systems that links throughput, response (cycle) time, and number of requests (work items). Unfortunately, although software engineers often operate queues or pipelines (modern computer systems are rife with queues, implicit and explicit), they aren’t exposed to the mathematical foundations and often fail to build intuition around how queues act operationally. In this post, I aim to describe the Law, applications within software systems, and build an intuition in its use.

The equation

Little’s Law has two equivalent formulations: \(L = \lambda W\) and \(WIP = TH \times CT\). The first formulation is that the average number of items within the system (\(L\)) is equal to the average arrival rate of items (\(\lambda\)) times the average waiting time (\(W\)) of an item. The second formulation is the average number of work in progress (WIP) items is equal to the average throughput (TH) times the average cycle time (CT). The two formulations are equivalent, with the first focusing on “inputs” to the system with the second focusing on “outputs”.

Importantly, the waiting time or cycle time includes the time spent processing in addition to the time in queue. Similarly, \(L\) and WIP include items sitting in queues as well as items being actively worked. All three values are expressed as averages; the law by itself won’t provide insight into ranges of those values.

Some intuition

First, logically, none of the three variables can be negative.

Secondly, cycle time will have some minimal value, \(CT_0\), that represents the time to move an item through the pipeline with no schedule delay. However, cycle time can grow almost arbitrarily large if the amount of work exceeds the system’s capacity.

Thirdly, throughput will have some maximal value, \(TH_\Omega\), that represents the maximum throughput manageable by the pipeline. (If you are modeling the system via the arrival rate, there might not be a maximum value although systems will need to reject requests at some point.)

For the system to perform some work (for throughput to be positive), some work needs to be released into the system. If too much work is released, cycle time will grow and the system will bounce around its maximum throughput. However, it is not necessary (nor likely) for throughput to climb linearly and for cycle time to only increase once throughput is saturated.

For instance, if a system has a minimal cycle time of 2 seconds and a maximum throughput of 4 units/second, then an ideal WIP, TH, and CT may act as:

WIP	TH	CT	Notes
0	0	2	\(CT_0\)
1	1/2	2
2	1	2
3	2/3	2
4	2	2
6	3	2
8	4	2	\(TH_\Omega\) reached; end of flat \(CT_0\)
16	4	4	CT grows
32	4	8

However, a system could also behave like this:

WIP	TH	CT	Notes
0	0	2
1	1/2	2
2	2/3	3	CT grows, \(TH_\Omega\) not yet reached
3	1	3
4	1	4
6	6/5	5
8	4/5	10
16	4/3	12
32	2	16

A divergence from an ideal progression can indicate losses due to queuing overhead or failures but can also be attributed to natural inefficiencies and cost of variability.

Because these numbers are averages, outliers can greatly influence the values. The law of small numbers and large numbers apply to observations. We plan to write more about this later, but the variability in a system (e.g. cycle times) greatly impacts the operator’s ability to control the system within some service level objective.

Applications

Some applications that requiring solving for or controlling each of the three variables:

L: If you need to estimate sizing requirements, like the size of the queue or how much disk will be required, you will be computing L via arrival rate and wait time.
CT: If you want to control response time/cycle time, you will estimate this value based on a count of accepted work and system throughput.
TH: If you want to control throughput, you will be using work release strategies to improve batching and trying to improve cycle time.

Little Intuition

The equation

Some intuition

Applications

Further Reading