Disaster Recovery
The policies, tools, and procedures for restoring IT systems and data following a disruptive event such as a cyberattack, hardware failure, or natural disaster.
What Is Disaster Recovery?
Disaster recovery (DR) is the IT-focused subset of business continuity — the processes and technologies used to restore systems, applications, and data after a disruptive event. A disaster recovery plan (DRP) defines how and how quickly an organisation can restore normal IT operations after ransomware, hardware failure, data centre outage, or another major incident.
Key DR Metrics
Recovery Time Objective (RTO): The maximum acceptable time between a disaster and the restoration of normal operations. Defines how long you can afford to be down.
Recovery Point Objective (RPO): The maximum acceptable amount of data loss, measured in time. If your RPO is 4 hours, backups must capture data at least every 4 hours.
These objectives drive every DR decision: the more aggressive your RTO and RPO, the more investment in redundancy and backup infrastructure is required.
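As a rough illustration of how the two metrics translate into checks, the sketch below compares a backup schedule against an RPO and a measured outage against an RTO. The function names (meets_rpo, meets_rto, data_loss_window) are hypothetical helpers invented for this example, and it assumes the worst-case data loss equals the full interval between backups.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a disaster strikes just before the next backup,
    so the data lost equals the whole backup interval."""
    return backup_interval <= rpo


def meets_rto(outage_start: datetime, service_restored: datetime,
              rto: timedelta) -> bool:
    """True if the measured downtime stayed within the RTO."""
    return service_restored - outage_start <= rto


def data_loss_window(last_backup: datetime, outage_start: datetime) -> timedelta:
    """Data written after the last good backup is what an incident loses."""
    return outage_start - last_backup


# Example: a 4-hour RPO with backups every 6 hours is not met.
rpo = timedelta(hours=4)
print(meets_rpo(timedelta(hours=6), rpo))   # False
print(meets_rpo(timedelta(hours=4), rpo))   # True
```

Running the same checks against the timings recorded in a real test or incident shows whether the objectives are realistic or only aspirational.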
Disaster Recovery Strategies
Backup and restore: Rebuild systems and restore data from backups after the incident. Simplest and cheapest, but with the longest RTO. Suitable for non-critical systems.
Pilot light: Keep only the core components of your environment (typically replicated data) running in the cloud, with the rest provisioned and scaled up when needed.
Warm standby: A scaled-down but fully functional version of your environment running continuously, ready to scale up and take over quickly.
Hot standby / Active-active: A fully operational parallel environment. Near-instant failover. Highest cost and complexity, appropriate for mission-critical systems.
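The trade-off between these strategies can be sketched as a simple lookup from a target RTO to the cheapest strategy likely to meet it. The thresholds in STRATEGY_TIERS are illustrative assumptions only, not industry standards, and suggest_strategy is a hypothetical helper. Real cut-offs depend on your workload, budget, and how fast your environment can actually be provisioned.

```python
from datetime import timedelta

# ASSUMPTION: illustrative thresholds, ordered from most to least aggressive.
# Replace them with figures measured in your own failover tests.
STRATEGY_TIERS = [
    (timedelta(minutes=5), "hot standby / active-active"),
    (timedelta(hours=1), "warm standby"),
    (timedelta(hours=8), "pilot light"),
]


def suggest_strategy(rto: timedelta) -> str:
    """Return the first (most expensive) tier whose ceiling covers the RTO,
    falling back to backup and restore for relaxed objectives."""
    for max_rto, strategy in STRATEGY_TIERS:
        if rto <= max_rto:
            return strategy
    return "backup and restore"


print(suggest_strategy(timedelta(minutes=30)))  # warm standby
print(suggest_strategy(timedelta(days=2)))      # backup and restore
```

A table like this is only a starting point for discussion. The RPO and the cost of downtime for each system should also feed into the final choice.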
The 3-2-1 Backup Rule
The foundational backup strategy for DR:
- 3 copies of your data
- 2 different storage media types
- 1 offsite or cloud copy
The offsite copy is critical for ransomware recovery, because attackers frequently target network-connected backup systems. Keep it offline or immutable so that malware on the production network cannot reach it.
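The rule is easy to check mechanically against an inventory of your copies. The sketch below is one minimal way to do that. The BackupCopy record and the check_3_2_1 and satisfies_3_2_1 helpers are hypothetical names for this example, and it assumes the inventory lists every copy of the data, including the production copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str      # e.g. "disk", "tape", "object-storage"
    offsite: bool   # True if stored away from the primary site


def check_3_2_1(copies: list[BackupCopy]) -> dict[str, bool]:
    """Evaluate each part of the 3-2-1 rule separately."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    return all(check_3_2_1(copies).values())


inventory = [
    BackupCopy("production", "disk", offsite=False),
    BackupCopy("local NAS", "disk", offsite=False),
    BackupCopy("cloud bucket", "object-storage", offsite=True),
]
print(check_3_2_1(inventory))
# {'three_copies': True, 'two_media_types': True, 'one_offsite': True}
```

Reporting the three conditions separately makes it clear which part of the rule fails when an inventory falls short.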
Testing Your DR Plan
An untested DR plan is a hypothesis. Test regularly:
- Tabletop exercise: Walk through a scenario verbally with key staff
- Backup restoration test: Restore from backup to verify data integrity
- Full failover test: Actually fail over to your DR environment
Many SMBs discover, the first time they test, that their backups are incomplete or that restoration takes far longer than expected. Test before you need it.
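A backup restoration test can be partly automated by comparing checksums of the restored files against a manifest recorded from the source data. The sketch below shows one possible approach using SHA-256. The helper names (sha256_of, build_manifest, verify_restore) are hypothetical, and it assumes both the source and the restored data are reachable as ordinary directories.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(source_manifest: dict[str, str],
                   restored_root: Path) -> dict[str, list[str]]:
    """Report files missing from the restore and files whose content differs."""
    restored = build_manifest(restored_root)
    missing = sorted(set(source_manifest) - set(restored))
    corrupted = sorted(
        name for name, digest in source_manifest.items()
        if name in restored and restored[name] != digest
    )
    return {"missing": missing, "corrupted": corrupted}
```

In use, you would call build_manifest on the source data when the backup is taken, store the result alongside it, and run verify_restore after each test restore. An empty report indicates that every file came back intact, though it says nothing about how long the restore took, which still needs to be timed against your RTO.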