Data replication for disaster recovery
Let's explore best practices for restoring your data when the worst happens.
Some may wax nostalgic for the old days when, in the event of a disaster, business owners simply had to grab their Rolodex and run. Today's businesses operate within a complex web of digital data, which must remain intact and accessible after a disaster if operations are to continue.
That is where data replication comes in as a critical part of your larger IT disaster recovery efforts. By copying applications and data from a database on one server to another server located outside of the business (ideally in another geographic region), typically in real time or near real time, a business can stay up and running through everything from hurricanes to hackers.
What is data replication?
Data replication isn't as basic as copying and moving all of your data all of the time. Replicating all of an organization's data is prohibitively expensive, so a key component of a data replication strategy is to make sure that essential applications, processes and data are highest on the priority list. For example, email, CRM and financial systems are core applications that typically can't be down for more than a few hours.
Nor is the data replication itself the only part of a replication strategy to take seriously. In addition to identifying your must-have data, you need to know how you will access that data. When disasters occur, you might think, “It's replicated. No problem!” But how will you get to that data in a failover state? And prior to that, how will you ensure that your strategy will work when you need it?
Do you need an MPLS circuit at your DR site, or VPN tunnels? Larger organizations typically have a private network that connects their offices around the country. If they use that network to talk to their server infrastructure in the production data center, the disaster recovery data center will need to be connected to that same MPLS network for their users to communicate with it. If an organization needs a more cost-effective way of talking to the DR data center, VPN tunnels over the Internet are another option.
Best practices for data replication and recovery
Here are three important best practices to keep in mind to make sure your data gets back up and running efficiently and effectively:
1. Tier data sets in order of importance
Placing your applications in the correct order of importance helps to optimize your budget. Due to high costs, data replication is typically reserved only for essential applications and processes. Defining recovery time objectives (RTOs) and recovery point objectives (RPOs) is essential: you need to know both how long you can go without your most critical applications and how much data loss your business can handle.
For example, you need to have your Exchange email server running all the time, even during a disaster. So it should be Tier 0, or a mission-critical application. Tier 1 applications might include your billing or order entry system, so you can take orders even when your production environment is down.
You also need to make sure everyone understands where these files, in all tiers, live so that they can be backed up. Are they in database files that need to go on a server? Are they properly set up?
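To make the tiering idea concrete, here is a minimal Python sketch that ranks applications by their recovery objectives. The application names, the RTO and RPO values, and the 15-minute and 4-hour cut-offs are all illustrative assumptions, not recommendations; your own business requirements should set the real numbers.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class App:
    name: str
    rto: timedelta  # recovery time objective: longest tolerable outage
    rpo: timedelta  # recovery point objective: largest tolerable window of lost data


def assign_tier(app: App) -> int:
    """Map an application to a tier using hypothetical RTO cut-offs."""
    if app.rto <= timedelta(minutes=15):
        return 0  # mission-critical
    if app.rto <= timedelta(hours=4):
        return 1
    return 2  # may be better served by cheaper backups than by replication


# Hypothetical inventory; the values are placeholders for illustration only.
apps = [
    App("file-archive", rto=timedelta(days=2), rpo=timedelta(days=1)),
    App("exchange-email", rto=timedelta(minutes=15), rpo=timedelta(minutes=5)),
    App("order-entry", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
]

# Sort by tier first, then by the tighter RPO within a tier.
ranked = sorted(apps, key=lambda a: (assign_tier(a), a.rpo))
for app in ranked:
    print(f"Tier {assign_tier(app)}: {app.name}")
```

Keeping the objectives next to each application name in one inventory like this also gives everyone a single place to see what belongs in each tier.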
2. Determine the optimal sequence for bringing resources back up
In the event of a disaster, replicated data must be brought back up in a carefully determined sequence and at a controlled pace. Certain applications are dependent on others to start. If you replicated 50 different servers, you can't simply start them all up at once.
Instead, bringing resources back up in a cloud recovery scenario is almost like a dance (a slow waltz is a good example) as applications come back up step by step. For instance, domain controllers have to come up first, because authentication has to be available early. Then an Exchange email server might come up next. Finally, ancillary systems can get in line.
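One way to work out that sequence is to write down what each system depends on and let a topological sort produce the startup "waves." The sketch below uses Python's standard graphlib module (Python 3.9 or later); the system names and their dependencies are hypothetical and stand in for your own inventory.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists what must be running first.
depends_on = {
    "domain-controller": set(),
    "exchange-email": {"domain-controller"},
    "order-entry": {"domain-controller"},
    "reporting": {"order-entry", "exchange-email"},
}

sorter = TopologicalSorter(depends_on)
sorter.prepare()  # raises graphlib.CycleError if dependencies are circular

waves = []
while sorter.is_active():
    # Everything whose dependencies are already up can start in this wave.
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Grouping systems into waves rather than a single flat list shows which systems can safely start in parallel and which must wait for an earlier step to finish.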
3. Test your data replication and DR plan
An often-ignored aspect of DR is testing your plan. If your strategy includes replicating entire VMs, you should test your data center disaster recovery plan to make sure you have addressed infrastructure changes that may have occurred throughout the year. You could have 50 systems and replicate 20 of them, but when you fire everything up after declaring a disaster, what if you forgot to replicate one core system that all of the other 20 systems depend on?
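A forgotten dependency of this kind can be caught before a disaster with a simple check against your dependency inventory. The following Python sketch walks each replicated system's dependencies, direct and indirect, and reports anything that is needed but not replicated. The function name, system names and dependency map are hypothetical examples.

```python
def unreplicated_dependencies(replicated, depends_on):
    """Return every system that a replicated system needs, directly or
    indirectly, but that is not itself in the replicated set."""
    missing = set()
    seen = set()
    stack = list(replicated)
    while stack:
        system = stack.pop()
        if system in seen:
            continue
        seen.add(system)
        if system not in replicated:
            missing.add(system)
        # Follow this system's own dependencies as well.
        stack.extend(depends_on.get(system, ()))
    return missing


# Hypothetical inventory for illustration only.
replicated = {"exchange-email", "order-entry"}
depends_on = {
    "exchange-email": {"domain-controller"},
    "order-entry": {"domain-controller", "sql-cluster"},
    "sql-cluster": {"san-storage"},
}

gaps = unreplicated_dependencies(replicated, depends_on)
print("Needed but not replicated:", sorted(gaps))
```

A check like this is only as good as the dependency map behind it, so it complements a live failover test rather than replacing one.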
To be confident that your data replication will hold up, you need to go beyond testing the validity of the data. Test the order of operations to ensure all systems communicate properly. Access the replicated files often to make sure they are not corrupted.
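As one example of a corruption check, the Python sketch below compares SHA-256 checksums of files in a source directory against their replicated copies. It assumes both sides are reachable as ordinary directories and that the source is not changing during the comparison (for example, because you are comparing against a snapshot). The function names are hypothetical, and real replication products usually provide their own verification tools.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir: Path, replica_dir: Path) -> list[str]:
    """List relative paths that are missing or differ on the replica."""
    mismatches = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        replica_file = replica_dir / relative
        if (not replica_file.is_file()
                or sha256_of(source_file) != sha256_of(replica_file)):
            mismatches.append(relative.as_posix())
    return mismatches
```

Calling find_mismatches with the two directory paths returns an empty list when every source file has an identical copy on the replica. Matching checksums only show that the copy is faithful; opening the files in the application that uses them is still worthwhile.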
A cloud recovery solution such as Flexential Disaster Recovery as a Service (DRaaS) can help you identify your Recovery Time Objective and Recovery Point Objective, as well as make sure all of your data replication and DR plans are properly and regularly tested.