Cloud Disaster Recovery (DR) combines cloud infrastructure capabilities with structured resilience planning to ensure business continuity when primary systems fail. Unlike traditional DR, which requires duplicate physical data centers, cloud DR uses geographic distribution, automated orchestration, and elastic scaling to protect workloads across regions. The core challenge is balancing recovery speed against operational cost: organizations must weigh trade-offs among infrastructure readiness (hot vs. cold sites), replication patterns (synchronous vs. asynchronous), and compliance requirements while still meeting acceptable Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO). Modern cloud DR extends beyond backup restoration to include failover automation, data consistency verification, and business impact analysis that prioritizes which systems recover first.
What This Cheat Sheet Covers
This cheat sheet spans 14 focused tables and 116 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: Core Recovery Objectives
| Metric | Example | Description |
|---|---|---|
| Recovery Time Objective (RTO) | RTO = 2 hours (max acceptable downtime) | • Maximum tolerable duration a system can remain unavailable after a disaster before business impact becomes unacceptable • Measures forward from the incident to when operations must resume. |
| Recovery Point Objective (RPO) | RPO = 15 minutes (last backup: 8:45 AM) | • Maximum acceptable data loss window, measured in time • Defines how far back you recover, representing the age of the most recent usable backup • Measures backward from the incident to the last recovery point. |
| Maximum Tolerable Downtime (MTD) | MTD = 72 hours (beyond: permanent closure risk) | • Absolute maximum time a business function can be unavailable before the organization faces irreversible consequences such as insolvency or mission failure • MTD ≥ RTO always. |
| Work Recovery Time (WRT) | WRT = 30 minutes (data restore + validation time) | • Duration required after system restoration to bring recovered data back to a consistent, usable state and verify integrity before resuming operations. |
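
The relationships in Table 1 can be checked with simple time arithmetic. The sketch below is illustrative only: the class and function names, the incident and restore timestamps, and the calendar date are hypothetical, and the `RTO + WRT <= MTD` check is a commonly used convention assumed here (the table itself only states MTD ≥ RTO).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the four metrics in Table 1."""
    rto: timedelta  # max tolerable time until systems are restored
    rpo: timedelta  # max tolerable data-loss window
    mtd: timedelta  # absolute max outage before irreversible harm
    wrt: timedelta  # time to validate data after systems are restored

    def is_consistent(self) -> bool:
        # Assumed convention: restore time plus validation time must fit
        # inside MTD. This implies the table's weaker rule, MTD >= RTO.
        return self.rto + self.wrt <= self.mtd


def data_loss_window(incident: datetime, last_recovery_point: datetime) -> timedelta:
    # RPO is measured backward from the incident to the last recovery point.
    return incident - last_recovery_point


def meets_rpo(obj: RecoveryObjectives, incident: datetime,
              last_recovery_point: datetime) -> bool:
    return data_loss_window(incident, last_recovery_point) <= obj.rpo


def meets_rto(obj: RecoveryObjectives, incident: datetime,
              restored: datetime) -> bool:
    # RTO is measured forward from the incident to resumed operations.
    return restored - incident <= obj.rto


# Example values taken from Table 1.
objectives = RecoveryObjectives(
    rto=timedelta(hours=2),
    rpo=timedelta(minutes=15),
    mtd=timedelta(hours=72),
    wrt=timedelta(minutes=30),
)

# Hypothetical timeline: backup at 8:45 AM, incident at 9:00 AM,
# systems restored at 10:30 AM on an arbitrary date.
last_backup = datetime(2024, 1, 1, 8, 45)
incident = datetime(2024, 1, 1, 9, 0)
restored = datetime(2024, 1, 1, 10, 30)

print("Objectives consistent:", objectives.is_consistent())
print("Data loss window:", data_loss_window(incident, last_backup))
print("RPO met:", meets_rpo(objectives, incident, last_backup))
print("RTO met:", meets_rto(objectives, incident, restored))
```

With these assumed timestamps, the data-loss window is exactly the 15-minute RPO and the 90-minute outage falls inside the 2-hour RTO. A backup taken even a minute earlier would fail the RPO check.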