Capacity of a single instance
A single k6 process uses all available CPU cores. With proper tuning, one machine can handle:
- 30,000–40,000 simultaneous VUs
- Up to 300,000 RPS
Unless you need more than roughly 100,000–300,000 RPS, a single instance is usually sufficient. Note the theoretical limit of ~65,535 concurrent connections from one source IP to a single destination IP:port pair (one per ephemeral port).
OS fine-tuning (Linux)
Run as root before starting the test. Resets on reboot.
```
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_timestamps=1
ulimit -n 250000
```

Note

`ulimit` applies only to the current shell session. `sysctl` changes apply system-wide until the network service restarts or the machine reboots. See the k6 “Fine tuning OS” article for making these permanent and for macOS instructions.
Hardware thresholds
| Resource | Safe threshold | What happens if exceeded |
|---|---|---|
| CPU | < 80% | Results show artificially high latency |
| Memory | < 90% | System starts swapping, results become unreliable |
| Network | < NIC limit | Test is bandwidth-bound, not SUT-bound |
Memory per VU estimates:
- Simple tests: ~1–5 MB/VU (1,000 VUs ≈ 1–5 GB)
- File upload tests: tens of MB/VU
Warning
Disable swap — if the system runs out of RAM and starts swapping, performance becomes erratic in different parts of the test, invalidating results. Plan RAM capacity upfront instead.
Monitoring the load generator
Open separate terminals for k6, CPU/memory, and network while the test runs.

Recommended tools:
- CPU + memory: `htop` or `nmon`
- Network: `iftop` or `nmon`
k6 options to reduce resource usage
Discard response bodies
```javascript
export const options = {
  discardResponseBodies: true,
};
```

k6 loads response bodies into memory by default. Discard them unless your checks need the body. Override per-request with `Params.responseType`.
Skip local summary when streaming to cloud
```
k6 cloud run scripts/website.js \
  --local-execution \
  --vus=20000 \
  --duration=10m \
  --no-thresholds \
  --no-summary
```

Without these flags, thresholds and the summary are computed both locally and in the cloud, duplicating work.
Script optimisations
- Avoid deeply nested loops.
- Minimise custom metrics (Trend, Counter, Gauge, Rate) — each value is recorded separately.
- Minimise checks and groups at very high VU counts — each result is recorded individually.
- Use URL grouping to avoid a new time-series object per unique URL.
- Use `SharedArray` or an external store (Redis) to share data across VUs instead of copying large objects into each VU.
- Remove `abortOnFail` from thresholds — k6 must evaluate it continuously.
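The `SharedArray` and URL-grouping points above can be sketched in one script. The data file, endpoint, and tag name are placeholder assumptions:

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';

// Loaded and parsed once, then shared read-only across all VUs,
// instead of each VU holding its own copy of the array.
const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json')); // placeholder data file
});

export default function () {
  const user = users[Math.floor(Math.random() * users.length)];
  // A single "name" tag for every /posts/{id} URL, so k6 keeps one
  // time series instead of one per unique URL.
  http.get(`https://example.com/posts/${user.id}`, {
    tags: { name: 'PostsItemURL' },
  });
}
```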
Resilient error handling
At large scale the SUT may return empty or malformed responses. A check that reads `r.body.length` directly will throw if `r.body` is null or undefined.
```javascript
// fragile
check(res, { 'body is 11026 bytes': (r) => r.body.length === 11026 });

// resilient
check(res, { 'body is 11026 bytes': (r) => r.body && r.body.length === 11026 });
```

Note
At large scale, some errors are always expected. 100 failures in 50M requests (0.00002%) is a good result — decide your acceptable error rate before the test.
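The acceptable error rate can be encoded up front as a threshold on the built-in `http_req_failed` metric. A sketch; the 1% budget is an assumed example, not a recommendation:

```javascript
export const options = {
  thresholds: {
    // Fail the test run only if more than 1% of requests error out.
    http_req_failed: ['rate<0.01'],
  },
};
```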
Common errors
| Error | Cause |
|---|---|
| `read: connection reset by peer` | SUT or load balancer rejected the connection — target is overloaded |
| `context deadline exceeded` | SUT didn’t respond within the 60s default request timeout |
| `dial tcp ... i/o timeout` | TCP connection couldn’t be established |
| `socket: too many open files` | Hit the OS file descriptor limit — increase `ulimit -n` |
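When `context deadline exceeded` is expected rather than a defect (e.g. a known-slow endpoint), the per-request timeout can be raised via `Params.timeout`. A sketch, assuming a k6 runtime; the URL and the 120s value are placeholders:

```javascript
import http from 'k6/http';

export default function () {
  // Raise the 60s default request timeout for this slow endpoint only.
  http.get('https://example.com/slow-report', { timeout: '120s' });
}
```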
Distributed execution
Split load across multiple machines using `--execution-segment`:

```
# Two machines
k6 run --execution-segment "0:1/2" --execution-segment-sequence "0,1/2,1" script.js
k6 run --execution-segment "1/2:1" --execution-segment-sequence "0,1/2,1" script.js

# Three machines
k6 run --execution-segment "0:1/3" --execution-segment-sequence "0,1/3,2/3,1" script.js
k6 run --execution-segment "1/3:2/3" --execution-segment-sequence "0,1/3,2/3,1" script.js
k6 run --execution-segment "2/3:1" --execution-segment-sequence "0,1/3,2/3,1" script.js
```

Warning

Distributed mode is not fully supported in OSS k6 — there is no built-in coordinator instance. Each k6 instance evaluates thresholds and reports metrics independently. Use `--no-thresholds` on each instance and aggregate metrics yourself, or use the k6 Kubernetes operator.
See also
- k6-vu-saturation — event loop saturation and load generator as the bottleneck
- k6-os-tuning — ulimit, port range, TIME_WAIT, and memory tuning for large tests
- k6-metrics-and-config — metric interpretation, connection reuse, tagged sub-metrics, validation workflow