Requests vs limits
Requests: the amount of a resource reserved for the container. The scheduler uses requests to decide which node to place the pod on, and only picks a node with at least that much unreserved capacity.
Limits: the maximum the container is allowed to consume. Enforcement differs by resource type:
- CPU: throttled (slowed down, not killed)
- Memory: OOMKilled (the container is killed, then restarted according to the pod’s restartPolicy)
What happens when requests are omitted
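When a container sets limits but omits requests, Kubernetes defaults each missing request to the matching limit, unless an admission-time default such as a LimitRange supplies one first. A pod with only limits defined therefore still “reserves” the full limit amount from the scheduler’s perspective.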
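As a small illustration of that defaulting, the fragment below sets only limits. The values are arbitrary, and the comments describe the requests Kubernetes would fill in, assuming no LimitRange or other admission-time default applies in the namespace.

```yaml
resources:
  limits:
    cpu: 500m
    memory: 256Mi
  # No requests block is written here. Assuming no admission-time default
  # applies, Kubernetes copies the limits, so the effective requests become:
  #   requests:
  #     cpu: 500m
  #     memory: 256Mi
```

From the scheduler’s point of view, this container reserves 500m CPU and 256Mi of memory on whichever node it lands on.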
Bursting pattern
Set requests lower than limits to allow the pod to burst during spikes while reserving less capacity on the node:
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 400m
    memory: 512Mi

This is useful for workloads that are mostly idle but occasionally spike: the pod only “reserves” 100m CPU on the node but can use up to 400m when available.
Tip
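QoS class: when every container’s requests equal its limits for both CPU and memory, the pod gets Guaranteed QoS (last to be evicted under node pressure). When at least one request or limit is set but that condition is not met (for example, requests lower than limits), it gets Burstable. When no limits or requests are set at all, it gets BestEffort (evicted first).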
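The three classes can be sketched with one minimal single-container pod each. The pod names, container name, and resource values here are hypothetical, the image is only a placeholder, and the sketch assumes no LimitRange in the namespace injects defaults (which would change the class of the last pod).

```yaml
# Guaranteed: requests equal limits for both CPU and memory.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed   # hypothetical name
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9   # placeholder image
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: 200m
          memory: 256Mi
---
# Burstable: requests are set but are lower than limits.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable   # hypothetical name
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9   # placeholder image
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          cpu: 400m
          memory: 512Mi
---
# BestEffort: no requests or limits at all.
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort   # hypothetical name
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9   # placeholder image
```

Once a pod is created, the class Kubernetes assigned can be read from its `status.qosClass` field, for example with `kubectl get pod qos-burstable -o jsonpath='{.status.qosClass}'`.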
Warning
Memory OOMKill: unlike CPU throttling, exceeding the memory limit gets the container killed immediately. Leave headroom above observed peak usage when setting memory limits, and monitor actual usage before tightening them.