Disaster recovery is something most enterprises are forced to address as a compliance, security and performance issue, and a comprehensive disaster recovery plan is usually laid down to meet that operational requirement. However, how many enterprises implement regular disaster recovery testing? Is it really safe to assume that a plan made potentially months ago is still viable today?
Disaster recovery testing is an important operations procedure, one that needs to be approached with the same due diligence as developing the original disaster recovery (DR) plan. Below are some of the most compelling reasons for establishing a comprehensive disaster recovery testing regime.
Identifying Legacy Infrastructure
Operations teams spend much of their time keeping critical business infrastructure operating smoothly. Performance tweaks are made, hardware changes are executed, and software patches are applied as a matter of course. Any number of small fixes might have been implemented since the disaster recovery environment was configured.
There is a danger that the cumulative effect of applying multiple hardware and software fixes to the live environment may render the disaster recovery environment obsolete. These fixes are almost always applied to keep critical business applications running smoothly. If they are not mirrored in the DR environment, those applications may perform poorly, or fail altogether, should a disaster occur.
Regular disaster recovery testing therefore allows the business to test core applications on what is, in effect, legacy infrastructure. It gives the operations team a chance to highlight any problems and apply the relevant fixes.
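One way to surface this kind of drift between tests is to compare an inventory of component versions from the live environment against the same inventory from the DR environment. The following is a minimal sketch, not a definitive implementation: the function name find_drift, the component names and the version strings are all hypothetical, and it assumes each environment can already export a simple component-to-version mapping.

```python
from typing import Dict, List


def find_drift(live: Dict[str, str], dr: Dict[str, str]) -> List[str]:
    """Return readable differences between two component -> version maps."""
    findings = []
    # Walk every component known to either environment, in a stable order.
    for name in sorted(set(live) | set(dr)):
        live_version = live.get(name)
        dr_version = dr.get(name)
        if live_version == dr_version:
            continue
        if dr_version is None:
            findings.append(f"{name}: missing from DR (live has {live_version})")
        elif live_version is None:
            findings.append(f"{name}: only in DR ({dr_version})")
        else:
            findings.append(f"{name}: live {live_version} != DR {dr_version}")
    return findings


# Hypothetical inventories, for illustration only.
live_inventory = {"app-server": "4.2.1", "db-engine": "13.6", "tls-library": "3.0.9"}
dr_inventory = {"app-server": "4.1.0", "db-engine": "13.6"}

for finding in find_drift(live_inventory, dr_inventory):
    print(finding)
```

A report like this does not replace a full test, but it gives the operations team a list of fixes that were applied to live and never mirrored in DR.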
A Chance to Develop Expertise
Most disaster recovery plans incorporate a wide range of processes, to be actioned by a potentially diverse team. While many of these processes may be tested individually by their owners, this does not prove the overall effectiveness of the end-to-end plan.
Only by undertaking a full disaster recovery test can the true effectiveness of the plan be proven. This testing needs to start from the point at which the management team would initiate the disaster recovery plan.
This gives all of the stakeholders involved in the end-to-end disaster recovery process a chance to develop expertise in how to execute it. Individual people or departments may be sure they have their disaster recovery process mastered. But how often have they tested it during a full emergency?
Application testing and QA teams typically do not test code commits against the disaster recovery environment. The DevOps cycle usually tests only the application itself and its functionality. The application is tested only in the integration environment before being pushed to live. As a result, it is never tested on the environment it would actually be running on in the event of a catastrophic disaster.
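One way to close this gap is to run the same set of smoke checks against the DR environment that already runs against integration. The sketch below is illustrative only: the Environment class, the two checks and every setting value are hypothetical, and a real check would exercise the deployed application rather than inspect a settings dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Environment:
    """Hypothetical description of a target environment."""
    name: str
    settings: Dict[str, str] = field(default_factory=dict)


Check = Callable[[Environment], bool]


def has_database_url(env: Environment) -> bool:
    return bool(env.settings.get("database_url"))


def has_queue_host(env: Environment) -> bool:
    return bool(env.settings.get("queue_host"))


def run_checks(env: Environment, checks: List[Check]) -> List[str]:
    """Return the names of the checks that failed for this environment."""
    return [check.__name__ for check in checks if not check(env)]


CHECKS: List[Check] = [has_database_url, has_queue_host]

# Illustrative values only; the DR environment is missing one setting.
integration = Environment(
    "integration",
    {"database_url": "postgres://int-db/app", "queue_host": "int-queue"},
)
dr = Environment("dr", {"database_url": "postgres://dr-db/app"})

for env in (integration, dr):
    failed = run_checks(env, CHECKS)
    status = "ok" if not failed else "FAILED: " + ", ".join(failed)
    print(f"{env.name}: {status}")
```

Because the check list is shared, any check added for integration is automatically exercised against DR the next time the suite runs there.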
Managing Post-Disaster Changes
How many change management teams consider the effects of every business process change upon the disaster recovery environment? Probably very few, because the overhead of doing so could potentially double the workload.
Regular disaster recovery testing makes it possible to highlight any problems created by operational process changes before the disaster recovery plan has to be put into action.
Nobody wants to consider what would happen should the disaster recovery environment fail when it is needed. The only way to ensure that the company has the best chance of surviving a catastrophic technology failure is to test the disaster recovery plan and environment regularly.