## **1. Objective**

Ensure business continuity and data protection by implementing an effective DR strategy for the customer, leveraging EFS replication, RDS PITR, and different failover methods.

## **2. DR Scenarios & Recovery Options**

| **Service Tier** | **Method** | **RDS Recovery** | **EFS Recovery** | **Failover Steps** | **Estimated Downtime (RTO)** | **RPO** | **Cost Impact** |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DR Basic Service | **Cold Backup-Restore** | Snapshot (6h) | Backup Restore (6h) | 1. Restore RDS from snapshot (6h)<br>2. Restore EFS from snapshot (6h)<br>3. Recover EKS (4h) | **24 hours** | 4 hours | **Base Cost** |
| DR Premium Service | **EFS Replica Only (RDS PITR)** | PITR (6h) | EFS Replica + Restore (0.2h) | 1. RDS recovery from PITR (6h)<br>2. Stop EFS sync (0.2h)<br>3. Full EKS recovery | **6 hours** | 15 min | **+30% Cost** |

---

## **3. Downtime Estimation & RTO Considerations**

- **EFS Replica Only (RDS PITR)**
  - **6-hour RTO**, significantly reducing downtime compared to cold restore.
  - **15-minute RPO** ensures minimal data loss.

---

## **4. DR Execution Plan**

### **4.1 Pre-DR Readiness Checks**

- Ensure **EFS replication** is active and functioning correctly.
- Verify **RDS PITR backups** and retention policies.
- Pre-configure **EKS deployment templates (Velero)** for rapid recovery.

### **4.2 Disaster Recovery Trigger**

- DR activation is **initiated upon a major failure event** in the primary environment.
- Decision criteria include **regional failure, prolonged service outage, or severe data corruption**.

### **4.3 Execution Steps**

#### **EFS Replica Only (RDS PITR)**

1. **Recover RDS** from PITR (**6 hours**).
2. **Stop EFS replication sync** (**0.2 hours**).
3. **Recover EKS cluster** and validate application (**immediate**).

### **4.4 Post-Failover Validation**

- Confirm **data consistency** between DR and primary environments.
- Validate **application services and connectivity**.
- Communicate DR activation and service restoration to stakeholders.

---

## **5. DR Testing & Cost Estimation**

- **Annual DR validation test** is required, adding an **estimated 2 months of AWS costs**.
- **EFS Replica Only (RDS PITR):**
  - **$20.8K/month**
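
The first two readiness checks in section 4.1 (EFS replication health and RDS PITR status) could be automated along the following lines. This is a minimal sketch, not part of the plan itself: the function names, the `RPO_TARGET` constant (taken from the 15-minute RPO of the Premium tier), and the `min_retention_days` default are illustrative assumptions. The clients are expected to behave like boto3 EFS and RDS clients, whose `describe_replication_configurations` and `describe_db_instances` calls return the fields read here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 15-minute RPO of the Premium tier is used as the
# maximum acceptable lag for both EFS replication and RDS PITR.
RPO_TARGET = timedelta(minutes=15)


def efs_replication_issues(efs_client, file_system_id, now=None):
    """Return a list of EFS replication problems (empty list means healthy)."""
    now = now or datetime.now(timezone.utc)
    resp = efs_client.describe_replication_configurations(
        FileSystemId=file_system_id
    )
    replications = resp.get("Replications", [])
    issues = []
    # Note: the real API may raise an error instead of returning an empty
    # list when no configuration exists; that case is not handled here.
    if not replications:
        issues.append(f"{file_system_id}: no replication configuration found")
    for replication in replications:
        for dest in replication.get("Destinations", []):
            dest_id = dest.get("FileSystemId", "unknown")
            status = dest.get("Status")
            if status != "ENABLED":
                issues.append(f"{dest_id}: replication status is {status}")
            last_sync = dest.get("LastReplicatedTimestamp")
            if last_sync is None:
                issues.append(f"{dest_id}: no completed sync recorded")
            elif now - last_sync > RPO_TARGET:
                issues.append(f"{dest_id}: last sync is older than the RPO target")
    return issues


def rds_pitr_issues(rds_client, db_instance_id, min_retention_days=7, now=None):
    """Return a list of RDS PITR problems (empty list means healthy).

    min_retention_days is a hypothetical policy value, not from the plan.
    """
    now = now or datetime.now(timezone.utc)
    resp = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)
    db = resp["DBInstances"][0]
    issues = []
    retention = db.get("BackupRetentionPeriod", 0)
    if retention < min_retention_days:
        issues.append(
            f"{db_instance_id}: backup retention is {retention} day(s), "
            f"below the required {min_retention_days}"
        )
    latest = db.get("LatestRestorableTime")
    if latest is None:
        issues.append(f"{db_instance_id}: no restorable time reported")
    elif now - latest > RPO_TARGET:
        issues.append(f"{db_instance_id}: latest restorable time exceeds the RPO target")
    return issues
```

With boto3 installed and credentials configured, the clients would typically be created with `boto3.client("efs")` and `boto3.client("rds")` and passed in together with the real file system and DB instance identifiers. Passing the clients as arguments keeps the checks easy to exercise against stubbed responses before a DR test.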