Container resource management is the practice of defining, allocating, and controlling compute resources (CPU, memory, storage) for containers running on orchestration platforms like Kubernetes and standalone runtimes like Docker. Proper resource management prevents noisy-neighbor issues, makes performance predictable through scheduling guarantees, and maximizes cluster utilization while avoiding out-of-memory kills and CPU throttling. At its core, resource management relies on two primitives: requests (the amount reserved for a container and used for scheduling decisions) and limits (hard caps enforced at runtime). Misaligning the two causes either wasted resources or application instability. A critical mental model: Kubernetes schedules based on requests but enforces limits, so nodes are commonly overcommitted, and a pod's QoS class (Guaranteed, Burstable, BestEffort) determines which pods are evicted first under resource pressure. For production workloads, a common starting point is to set requests near observed P50 usage and limits at P95 plus headroom, monitor actual consumption continuously, and use autoscaling mechanisms (HPA, VPA, Cluster Autoscaler) to adapt to demand while maintaining cost efficiency.
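To make the QoS mental model concrete, here is a minimal Python sketch of how a pod's QoS class follows from its containers' CPU and memory requests and limits. The function name `qos_class` and the plain-dict input shape are illustrative assumptions, not a Kubernetes API. Quantities are compared as literal strings, so the sketch assumes they are already normalized (for example, `500m` and `0.5` would be treated as different values here).

```python
def qos_class(containers):
    """Classify a pod as Guaranteed, Burstable, or BestEffort.

    `containers` is a hypothetical list of dicts shaped like
    {"requests": {"cpu": "500m", "memory": "256Mi"},
     "limits":   {"cpu": "500m", "memory": "256Mi"}}.
    """
    any_set = False      # does any container set any CPU/memory value?
    guaranteed = True    # do all containers have request == limit for both?

    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if request is not None or limit is not None:
                any_set = True
            # Guaranteed needs a limit on every resource, equal to the request.
            # Simplification: string comparison, values assumed normalized.
            if limit is None or request != limit:
                guaranteed = False

    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Under these assumptions, a pod whose only container sets just `requests.cpu` comes out as Burstable, while a pod with no requests or limits at all is BestEffort and is the first candidate for eviction under node pressure.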
What This Cheat Sheet Covers
This topic spans 28 focused tables and 172 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Resource Units and Specifications
| Unit | Example | Description |
|---|---|---|
| Millicores (m) | 500m = 0.5 CPU cores | • 1000 millicores equal one full CPU core • Fractional requests are allowed down to 1m granularity. |
| Whole cores | 2 = 2 full CPU cores | • Each CPU unit represents one physical core or one virtual core, depending on node type • Equivalent to 2000m. |
| Mebibytes (Mi) | 256Mi = 268,435,456 bytes | • Binary unit using base-2 (1 MiB = 1024² bytes) • Standard for container memory limits in Kubernetes. |
| Gibibytes (Gi) | 4Gi = 4,294,967,296 bytes | • Binary unit using base-2 (1 GiB = 1024³ bytes) • Preferred over GB for consistency with OS-level memory accounting. |
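
The unit arithmetic in Table 1 can be checked with a short Python sketch. The helper names `cpu_to_millicores` and `memory_to_bytes` are hypothetical, and the sketch handles only the units listed above plus the other binary suffixes (Ki, Ti). Decimal suffixes such as M or G and exponent notation are deliberately out of scope.

```python
from decimal import Decimal

# Binary (base-2) suffixes and their byte multipliers.
_BINARY_SUFFIXES = {
    "Ki": 1024,
    "Mi": 1024**2,
    "Gi": 1024**3,
    "Ti": 1024**4,
}


def cpu_to_millicores(value: str) -> int:
    """Convert a CPU quantity such as '500m', '2', or '0.5' to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    # Whole or fractional cores: 1 core = 1000 millicores.
    return int(Decimal(value) * 1000)


def memory_to_bytes(value: str) -> int:
    """Convert a memory quantity such as '256Mi' or '4Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if value.endswith(suffix):
            return int(Decimal(value[: -len(suffix)]) * factor)
    # No recognized suffix: treat the value as a plain byte count.
    return int(value)
```

For example, `cpu_to_millicores("2")` gives 2000 and `memory_to_bytes("256Mi")` gives 268435456, matching the table rows.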