Software Resilience Patterns Cheat Sheet

Updated 2026-03-18

Software resilience patterns are architectural strategies for building fault-tolerant, self-healing systems that keep functioning despite failures, network issues, or overload. In distributed systems, where failures are inevitable rather than exceptional, resilience engineering shifts the focus from preventing failures to handling them gracefully. These patterns, from circuit breakers that prevent cascading failures to chaos engineering that deliberately injects faults, form the foundation of modern production systems at scale. Knowing not just what each pattern does but when and why to apply it is what turns a fragile system into a robust, production-ready architecture.

What This Cheat Sheet Covers

This cheat sheet spans 15 focused tables and 112 indexed concepts. The outline below lists every table, from foundational concepts through advanced details.

Table 1: Circuit Breaker States & Behavior
Table 2: Retry & Backoff Strategies
Table 3: Isolation & Resource Management
Table 4: Timeout & Fallback Patterns
Table 5: Health Monitoring & Probes
Table 6: Distributed Transaction Patterns
Table 7: Traffic Management & Load Control
Table 8: Rate Limiting Algorithms
Table 9: Service Mesh & Proxy Patterns
Table 10: Caching for Resilience
Table 11: Data Partitioning & Sharding
Table 12: Failure Detection & Prevention
Table 13: Dead Letter Handling
Table 14: Chaos Engineering
Table 15: Resilience Library Implementations

Table 1: Circuit Breaker States & Behavior

State	Example	Description
Closed (Normal)	`circuitBreaker.state = CLOSED` `request → downstream service`	• Requests flow normally to the downstream service • failure counter tracks errors against a threshold (e.g., 5 failures in 10 seconds) before opening.
Open (Failing)	`circuitBreaker.state = OPEN` `request → immediate FailFastException`	• All requests fail immediately without calling the service • protects downstream by preventing further load • transitions to half-open after a timeout period (e.g., 60 seconds).
Half-Open (Testing)	`circuitBreaker.state = HALF_OPEN` `limited test requests → service`	• Allows a limited number of test requests (e.g., 3) to check if the service recovered • success → transitions to closed • failure → transitions back to open.
Failure Threshold	`failureThreshold = 5` `errorPercentage = 50%`	• Trigger condition for opening the circuit • can be absolute count (5 failures) or percentage (50% error rate) within a sliding time window.
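
The state transitions in Table 1 can be sketched as a small state machine. The example below is a minimal, single-threaded illustration rather than any particular library's implementation. The names `CircuitBreaker`, `CircuitOpenError`, `window_seconds`, `reset_timeout`, and `half_open_trials` are hypothetical, and it assumes the count-based threshold (N failures inside a sliding window) rather than the percentage-based one. A production breaker would also need locking and a cap on concurrent trial requests.

```python
import time
from collections import deque
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    # All parameter names are illustrative; defaults mirror the table's examples.
    def __init__(self, failure_threshold=5, window_seconds=10.0,
                 reset_timeout=60.0, half_open_trials=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_seconds = window_seconds
        self.reset_timeout = reset_timeout
        self.half_open_trials = half_open_trials
        self.clock = clock  # injectable so tests can control time
        self.state = State.CLOSED
        self._failures = deque()  # timestamps of recent failures
        self._opened_at = 0.0
        self._trial_successes = 0

    def call(self, func, *args, **kwargs):
        now = self.clock()
        if self.state is State.OPEN:
            if now - self._opened_at < self.reset_timeout:
                # Fail fast without touching the downstream service.
                raise CircuitOpenError("circuit is open; failing fast")
            # Timeout elapsed: start letting trial requests through.
            self.state = State.HALF_OPEN
            self._trial_successes = 0
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure(now)
            raise
        self._on_success()
        return result

    def _on_failure(self, now):
        if self.state is State.HALF_OPEN:
            # Any failed trial request reopens the circuit.
            self._open(now)
            return
        self._failures.append(now)
        # Discard failures that fell out of the sliding window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        if len(self._failures) >= self.failure_threshold:
            self._open(now)

    def _on_success(self):
        if self.state is State.HALF_OPEN:
            self._trial_successes += 1
            if self._trial_successes >= self.half_open_trials:
                # Enough successful trials: resume normal traffic.
                self.state = State.CLOSED
                self._failures.clear()

    def _open(self, now):
        self.state = State.OPEN
        self._opened_at = now
        self._failures.clear()
```

A caller would wrap each downstream request, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a placeholder for the real client call, and catch `CircuitOpenError` to serve a fallback. Passing a fake `clock` makes the open-to-half-open timeout testable without sleeping.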
