Observation
The same k6 script produced significantly different results on two machines: an 8 GB laptop and a 32 GB laptop. The discrepancy pointed to the load generator itself being the bottleneck, not the system under test.
Why this happens
Each k6 VU is a single-threaded JavaScript runtime with its own event loop. When the host runs out of CPU or memory, the event loops can’t process callbacks efficiently — this shows up as artificial latency in results, making the target system look slower than it is.
k6 can handle 30,000–40,000 VUs on a well-resourced machine, but on a constrained host even a few hundred VUs can saturate it.
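One way to find where a given host saturates is a slow VU ramp while watching the generator's own CPU and memory. A minimal sketch of the `options` object, with assumed stage durations and targets (and `https://test.k6.io` style traffic implied by the rest of the script, which is omitted here):

```javascript
// Sketch (assumed numbers): ramp VUs gradually so generator saturation shows
// up as a visible step change rather than a cliff at full load.
// In a real k6 script this object is exported: `export const options = { ... }`.
const options = {
  stages: [
    { duration: '2m', target: 50 },   // gentle warm-up
    { duration: '5m', target: 200 },  // keep ramping while watching generator CPU
    { duration: '2m', target: 0 },    // ramp down
  ],
};
```

If latency percentiles start climbing at the same moment generator CPU crosses the thresholds below, the numbers after that point describe the laptop, not the target.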
Warning
Skewed results from a saturated load generator are silent — the test completes and reports numbers, but those numbers reflect the generator’s limits, not the target system’s behaviour.
Signals that the generator is the bottleneck
- Results differ significantly between machines with different RAM/CPU.
- CPU utilisation on the test machine hits ~100% during the test.
- Latency percentiles are suspiciously uniform or plateau in a way that mirrors CPU saturation, not server behaviour.
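A related, more automatic signal applies when using arrival-rate executors: k6 emits a `dropped_iterations` counter when it cannot start iterations on schedule, which is a direct sign the generator (not the target) is the limit. A sketch of a threshold that makes this fail the run loudly instead of silently skewing percentiles:

```javascript
// Sketch: fail the run if the generator drops scheduled iterations.
// `dropped_iterations` is emitted by arrival-rate executors when no VU is
// free to start an iteration on time.
// In a real k6 script: `export const options = { ... }`.
const options = {
  thresholds: {
    // non-zero exit code if even one iteration is dropped
    dropped_iterations: ['count<1'],
  },
};
```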
Guidelines
| Metric | Safe threshold |
|---|---|
| Load generator CPU | < 80% |
| Load generator memory | < 90% |
- Scale gradually — start with a low VU count and ramp up, watching generator metrics before reading target metrics.
- Prefer arrival-rate executors (`constant-arrival-rate`, `ramping-arrival-rate`) over VU-based executors for high-throughput tests — they give more predictable resource usage.
- Pre-allocate VUs with `preAllocatedVUs` rather than relying on `maxVUs` to scale mid-test, which can cause sudden resource spikes.
- Run cloud tests on a consistent spec — don’t compare results across machines with different RAM unless you’ve verified the generator was not the bottleneck on both.
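The executor and pre-allocation guidelines combine into one scenario shape. A minimal sketch with assumed rate, duration, and VU counts:

```javascript
// Sketch (assumed numbers): constant-arrival-rate with every VU allocated
// up front, so the generator's resource usage is flat for the whole run.
// In a real k6 script: `export const options = { ... }`.
const options = {
  scenarios: {
    steady_load: {
      executor: 'constant-arrival-rate',
      rate: 100,              // iterations started per timeUnit
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 200,   // allocated before the test starts
      maxVUs: 200,            // equal to preAllocatedVUs: no mid-test scaling spikes
    },
  },
};
```

Setting `maxVUs` equal to `preAllocatedVUs` trades headroom for predictability: if the rate can't be sustained, the run reports `dropped_iterations` rather than allocating VUs mid-test.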
See also
- k6-large-tests — OS tuning, hardware thresholds, script optimisations, and distributed execution
- k6-metrics-and-config — metric interpretation, connection reuse, tagged sub-metrics, validation workflow