Before running

Run from a remote machine, not your laptop. Client CPU contention inflates http_req_tls_handshaking: it can report a 3s p99 when the actual TLS cost is 58ms.
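
If the load generator is still CPU-bound after moving off a laptop, one lever worth trying (assuming your checks don't need response bodies) is to discard bodies so each VU does less per-request work:

export const options = {
  // Assumption: no check in this script inspects response bodies.
  discardResponseBodies: true,
};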

Stabilise the environment first. Check k9s for pods with high restart counts before starting. A pod restarting mid-test makes results uninterpretable.

Connection reuse

export const options = {
  noConnectionReuse: true,
};

By default, k6 reuses HTTP keep-alive connections across iterations within the same VU. This creates a race condition at stage boundaries:

  1. The server decides an idle connection has timed out and sends a TCP FIN to close it.
  2. While that FIN is still in-flight (or sitting unprocessed in the OS receive buffer), the VU sends a new request on the same socket.
  3. The server receives the request on an already-closed socket and responds with TCP RST.
  4. k6 sees a “connection reset by peer” or EOF — which looks like a server error but is just a connection lifecycle mismatch.

The root cause is a timeout mismatch: as long as the client's keep-alive idle timeout is greater than or equal to the server's, the server can close first and trigger this race. Setting noConnectionReuse: true avoids it by disabling keep-alive entirely, so every request opens a fresh connection, at the cost of higher per-request overhead (a TCP and TLS handshake each time).
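
A gentler option, if the per-request handshake cost is too high, is noVUConnectionReuse, which keeps connections alive within an iteration but closes them between iterations (a sketch, assuming a reasonably recent k6 version):

export const options = {
  // Reuse connections within an iteration, but start each iteration on a
  // fresh socket instead of a possibly stale keep-alive connection.
  noVUConnectionReuse: true,
};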

TLS in UAT

export const options = {
  insecureSkipTLSVerify: true,
};

UAT environments might have inconsistent certs across pods. Without this, you get spurious TLS errors that look like server failures.

Tagging for per-endpoint metrics

k6 automatically applies a scenario system tag to every metric, but scenario tags don’t create per-endpoint sub-metric breakdowns — they just let you filter by scenario after the fact. To get http_req_waiting{test:X} scoped to a specific request, you must tag at the request level:

http.get(url, { tags: { test: 'my-endpoint' } });

Scenario-level tags (set in options.scenarios[*].tags) propagate to all requests in that scenario but still require a matching threshold to surface as a sub-metric in the summary. Request-level tags are more precise and are the correct unit for per-endpoint breakdown.
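
For reference, a minimal sketch of scenario-level tagging (the scenario name, executor settings, and tag value below are illustrative):

export const options = {
  scenarios: {
    checkout: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      // Propagates to every request made by this scenario's VUs.
      tags: { test: 'my-endpoint' },
    },
  },
};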

Exposing sub-metrics in handleSummary

In the open-source k6 CLI, sub-metrics are only computed and included in the handleSummary data object when a threshold is defined for that tag combination. Without a threshold, k6 doesn’t materialise the sub-metric at all — it won’t appear in the summary even if requests were tagged correctly.

Use a dummy threshold that always passes to force the sub-metric to appear without enforcing any pass/fail condition:

export const options = {
  thresholds: {
    'http_req_waiting{test:my-endpoint}': ['p(99)>=0'], // always true, never fails
  },
};
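
With the threshold in place, the sub-metric appears under data.metrics in handleSummary, keyed by the same tag expression. A rough sketch of reading it back (p(99) may also need adding to summaryTrendStats for it to be computed):

export function handleSummary(data) {
  // Key matches the threshold's metric{tag:value} expression above.
  const sub = data.metrics['http_req_waiting{test:my-endpoint}'];
  const p99 = sub && sub.values['p(99)'];
  return {
    stdout: `my-endpoint waiting p(99): ${p99} ms\n`,
    'summary.json': JSON.stringify(data, null, 2),
  };
}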

Reading metrics

The request lifecycle in order: blocked → connecting → tls_handshaking → sending → waiting → receiving

| Metric | Reliability | Notes |
| --- | --- | --- |
| http_req_waiting (TTFB) | High | Measures server processing time only; excludes TLS, TCP, and connection overhead, which are tracked separately |
| http_req_tls_handshaking | Low (from a laptop) | CPU-intensive crypto, inflated by client CPU under load; can show a 3s p99 when the real cost is 58ms |
| http_req_blocked | Medium | Time waiting for a free TCP connection slot; high values mean connection pool exhaustion, not a server problem |
| http_req_connecting | Medium | Actual TCP handshake duration; high values point to network latency or server backlog |
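
The same phases are exposed per response via res.timings, which is handy for spot-checking a single request from inside the script (the URL is a placeholder):

import http from 'k6/http';

export default function () {
  const res = http.get('https://your-endpoint');
  // Per-request breakdown mirroring the lifecycle above; values are in ms.
  console.log(JSON.stringify({
    blocked: res.timings.blocked,
    connecting: res.timings.connecting,
    tls_handshaking: res.timings.tls_handshaking,
    sending: res.timings.sending,
    waiting: res.timings.waiting, // TTFB
    receiving: res.timings.receiving,
  }));
}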

Tip

TTFB is the metric to optimise — compare http_req_waiting against a single curl TTFB to confirm whether degradation is server-side or test-infrastructure noise.

High VU count at low RPS = slow server

If you see 70+ VUs accumulating while RPS stays at ~5, the server is slow — VUs are piling up waiting for responses. This is not a client-side issue.
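
This pattern is easiest to read with an arrival-rate executor, where k6 spins up extra VUs to hold the target request rate; a minimal sketch (rate, duration, and VU caps are illustrative):

export const options = {
  scenarios: {
    steady_rate: {
      executor: 'constant-arrival-rate',
      rate: 5,              // target iterations per timeUnit (~5 RPS here)
      timeUnit: '1s',
      duration: '5m',
      preAllocatedVUs: 10,
      maxVUs: 100,          // active VUs climbing toward this cap signals a slow server
    },
  },
};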

Validating results

Establish a curl baseline before every test run:

curl -o /dev/null -s -w \
  "dns: %{time_namelookup}s\ntls: %{time_appconnect}s\nttfb: %{time_starttransfer}s\ntotal: %{time_total}s\n" \
  https://your-endpoint

Compare k6’s http_req_waiting p50 against this single-request TTFB. If they match, the degradation k6 reports is server-side rather than test-infrastructure noise.

Cross-reference with Jaeger traces to identify which internal service is the bottleneck: k6 tells you the endpoint is slow; Jaeger tells you which downstream call is to blame.

EOF errors

EOF errors (reported as request failed: EOF or status code 0) have several distinct causes. The pattern of when they occur is the main diagnostic signal:

| Pattern | Likely cause | How to confirm |
| --- | --- | --- |
| All VUs at the exact same second | Infrastructure restart: a pod, load balancer, or cache flushed existing connections | Check k9s for a pod restart count spike at that timestamp |
| Scattered throughout the test | Server overloaded: TCP backlog full, server actively closing connections | Cross-reference with server CPU/memory; check whether the error rate tracks the VU ramp |
| Only at stage transitions | Stale keep-alive connection race (see Connection reuse) | Set noConnectionReuse: true and rerun |
| Only during ramp-up | k6 hitting OS file descriptor limits and unable to open new TCP sockets | Check with ulimit -n; see k6-os-tuning |

Note

A k8s rolling deployment mid-test produces the same pattern as an infrastructure restart — all connections drop at once as the old pod is terminated. Avoid deploying during a test run.

See also

  • k6-large-tests — capacity, OS tuning, script optimisations, distributed execution
  • k6-vu-saturation — load generator as the bottleneck
  • jaeger — tracing for identifying internal bottlenecks