Before running
Run from a remote machine, not your laptop. Client CPU contention inflates http_req_tls_handshaking — can show 3s p99 when actual TLS cost is 58ms.
Stabilise the environment first. Check k9s for pods with high restart counts before starting. A pod restarting mid-test makes results uninterpretable.
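If you want a one-off command instead of watching k9s, a kubectl sketch for spotting restart-prone pods (the `uat` namespace is an assumption — substitute your own):

```shell
# Hypothetical: list pods in the target namespace sorted by restart count,
# so crash-looping pods surface at the bottom of the output before you start
kubectl get pods -n uat \
  --sort-by='.status.containerStatuses[0].restartCount'
```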
Connection reuse
```javascript
export const options = {
  noConnectionReuse: true,
};
```

By default, k6 reuses HTTP keep-alive connections across iterations within the same VU. This creates a race condition at stage boundaries:
- The server decides an idle connection has timed out and sends a TCP FIN to close it.
- While that FIN is still in-flight (or sitting unprocessed in the OS receive buffer), the VU sends a new request on the same socket.
- The server receives the request on an already-closed socket and responds with TCP RST.
- k6 sees a “connection reset by peer” or EOF — which looks like a server error but is just a connection lifecycle mismatch.
The root cause is a timeout mismatch: as long as the client timeout ≥ server timeout, the server can close first and trigger this race. Setting noConnectionReuse: true avoids it by opening a fresh connection per iteration, at the cost of slightly higher per-request overhead (TCP + TLS handshake).
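If the per-iteration handshake cost is too high, k6 also offers a middle ground: noVUConnectionReuse closes connections between iterations (where the idle-timeout race occurs) but still reuses them within a single iteration. A sketch:

```javascript
// Sketch: connections are closed between iterations — the idle gap where
// the server-side timeout race bites — but requests within one iteration
// still share a keep-alive connection, so most handshake cost is avoided.
export const options = {
  noVUConnectionReuse: true,
};
```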
TLS in UAT
```javascript
export const options = {
  insecureSkipTLSVerify: true,
};
```

UAT environments might have inconsistent certs across pods. Without this, you get spurious TLS errors that look like server failures.
Tagging for per-endpoint metrics
k6 automatically applies a scenario system tag to every metric, but scenario tags don’t create per-endpoint sub-metric breakdowns — they just let you filter by scenario after the fact. To get http_req_waiting{test:X} scoped to a specific request, you must tag at the request level:
```javascript
http.get(url, { tags: { test: 'my-endpoint' } });
```

Scenario-level tags (set in options.scenarios[*].tags) propagate to all requests in that scenario but still require a matching threshold to surface as a sub-metric in the summary. Request-level tags are more precise and are the correct unit for per-endpoint breakdown.
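For completeness, a sketch of the scenario-level form (the scenario name, executor, and durations here are placeholders):

```javascript
// Sketch: scenario-level tags propagate to every request this scenario
// makes, equivalent to tagging each http.get() call by hand — but a
// threshold on the tag is still needed for the sub-metric to surface.
export const options = {
  scenarios: {
    checkout: {
      executor: 'constant-vus', // placeholder executor
      vus: 10,
      duration: '1m',
      tags: { test: 'checkout' },
    },
  },
};
```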
Exposing sub-metrics in handleSummary
In the open-source k6 CLI, sub-metrics are only computed and included in the handleSummary data object when a threshold is defined for that tag combination. Without a threshold, k6 doesn’t materialise the sub-metric at all — it won’t appear in the summary even if requests were tagged correctly.
Use a dummy threshold that always passes to force the sub-metric to appear without enforcing any pass/fail condition:
```javascript
export const options = {
  thresholds: {
    'http_req_waiting{test:my-endpoint}': ['p(99)>=0'], // always true, never fails
  },
};
```

Reading metrics
The request lifecycle in order: blocked → connecting → tls_handshaking → sending → waiting → receiving
| Metric | Reliability | Notes |
|---|---|---|
| http_req_waiting (TTFB) | High | Measures server processing time only — excludes TLS, TCP, and connection overhead, which are tracked separately |
| http_req_tls_handshaking | Low (from laptop) | CPU-intensive crypto — inflated by client CPU under load; can show 3s p99 when real cost is 58ms |
| http_req_blocked | Medium | Time waiting for a free TCP connection slot — high values mean connection pool exhaustion, not a server problem |
| http_req_connecting | Medium | Actual TCP handshake duration — high values point to network latency or server backlog |
Tip

TTFB is the metric to optimise — compare http_req_waiting against a single curl TTFB to confirm whether degradation is server-side or test-infrastructure noise.
High VU count at low RPS = slow server
If you see 70+ VUs accumulating while RPS stays at ~5, the server is slow — VUs are piling up waiting for responses. This is not a client-side issue.
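This follows from Little's law (concurrency ≈ throughput × latency); a quick sanity check in plain JavaScript using the numbers above:

```javascript
// Little's law: requests in flight ≈ arrival rate × time in system.
// 70 VUs each blocked on a response while only ~5 req/s complete implies
// each request is taking roughly 70 / 5 = 14 seconds of server time.
const vus = 70;
const rps = 5;
const impliedLatencySeconds = vus / rps;
console.log(impliedLatencySeconds); // 14
```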
Validating results
Establish a curl baseline before every test run:
```shell
curl -o /dev/null -s -w \
  "dns: %{time_namelookup}s\ntls: %{time_appconnect}s\nttfb: %{time_starttransfer}s\ntotal: %{time_total}s\n" \
  https://your-endpoint
```

Compare k6’s http_req_waiting p50 against this single-request TTFB. If they match, degradation is server-side. Note that curl’s time_starttransfer includes DNS, TCP, and TLS setup, while http_req_waiting excludes them — subtract time_appconnect for a closer comparison.
Cross-reference with Jaeger traces to identify which internal service is the bottleneck — k6 tells you the endpoint is slow; Jaeger tells you which downstream call is to blame.
EOF errors
EOF errors (reported as request failed: EOF or status code 0) have several distinct causes. The pattern of when they occur is the main diagnostic signal:
| Pattern | Likely cause | How to confirm |
|---|---|---|
| All VUs at the exact same second | Infrastructure restart — pod, load balancer, or cache flushed existing connections | Check k9s for pod restart count spike at that timestamp |
| Scattered throughout the test | Server overloaded — TCP backlog full, server actively closing connections | Cross-reference with server CPU/memory; check if error rate tracks VU ramp |
| Only at stage transitions | Stale keep-alive connection race (see Connection reuse) | Set noConnectionReuse: true and rerun |
| Only during ramp-up | k6 hitting OS file descriptor limits — can’t open new TCP sockets | Check with ulimit -n; see k6-os-tuning |
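For the file-descriptor row, a quick check before a high-VU run (65536 is an arbitrary example value — size it to your peak concurrent connections):

```shell
# Show the current per-process open-file limit; every open TCP socket
# (one per in-flight connection, more with noConnectionReuse) counts against it.
ulimit -n
# Raise it for the current shell if it's low — capped by the hard limit (ulimit -Hn):
#   ulimit -n 65536
```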
Note
A k8s rolling deployment mid-test produces the same pattern as an infrastructure restart — all connections drop at once as the old pod is terminated. Avoid deploying during a test run.
See also
- k6-large-tests — capacity, OS tuning, script optimisations, distributed execution
- k6-vu-saturation — load generator as the bottleneck
- jaeger — tracing for identifying internal bottlenecks