Disaster Recovery


Disaster recovery is part of business continuity, and deals with the immediate impact of an event. Recovering from a server outage, a security breach, or a hurricane falls into this category. Disaster recovery usually has several discrete steps in the planning stages, though those steps blur quickly during implementation because the situation during a crisis almost never goes exactly according to plan. Disaster recovery involves stopping the effects of the disaster as quickly as possible and addressing the immediate aftermath. This might include shutting down systems that have been breached, evaluating which systems are impacted by a flood or earthquake, and determining the best way to proceed. Where to set up temporary systems, how to procure replacement systems or parts, and how to set up security in a new location are all questions that relate to both disaster recovery and business continuity, but they are primarily focused on continuing business operations.

Disaster recovery (DR) involves a set of policies and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster. Disaster recovery focuses on the IT or technology systems supporting critical business functions, as opposed to business continuity, which involves keeping all essential aspects of a business functioning despite significant disruptive events. Disaster recovery is therefore a subset of business continuity.

From the IT perspective, recovery will usually mean establishing support for the processing and communications functions considered critical by the business community, and then establishing support for ancillary systems. From the business perspective, recovery will mean being able to execute the business functions that are at the core of the business, and then being able to execute ancillary functions.

Key Factors

Is the goal to have all systems up and running within a day? A week? Is it enough to bring up only a few key systems within the first week, while taking longer to restore others? This factor is often expressed as either the “Recovery Time Objective” (RTO) or the “Service Delivery Objective” (SDO). It refers to the amount of time that can elapse from the failure to the time when the systems or services are available for use. Most often this factor varies by system or service; for example, a company’s order processing system may have an SDO of 24 hours, while the company’s intranet has an SDO of 1 week.

What is the difference between RTO and SDO? It is possible to recover one or more “systems” in a short period of time and still be unable to provide the full functionality needed to restore service, because of an unplanned dependency (often unknown until the DRP is tested) or a recovery failure. Hence, while having a good RTO is important, understanding the SDO and planning around that goal is even more important. A single failure of a key component that other systems depend on can cause many other systems to fail. Because of this, it is important to understand the dependencies within your environment.
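The effect of dependencies on service delivery can be illustrated with a minimal sketch. The system names, recovery times, and dependency map below are hypothetical, and the sketch assumes that all recoveries start at the same moment and run in parallel. Under those assumptions, a service is usable only once it and everything it depends on has been recovered, so its delivery time is the largest recovery time anywhere in its dependency chain.

```python
# Hypothetical per-system recovery time estimates, in hours.
recovery_hours = {
    "database": 12,
    "auth": 30,
    "order_processing": 8,
    "intranet": 4,
}

# Hypothetical dependency map: each system lists what it needs in order to work.
depends_on = {
    "database": [],
    "auth": [],
    "order_processing": ["database", "auth"],
    "intranet": ["auth"],
}


def service_delivery_hours(system, recovery_hours, depends_on):
    """Hours until `system` is usable, assuming parallel recovery of all systems."""
    cache = {}

    def visit(name, path):
        # A system that (indirectly) depends on itself can never be delivered.
        if name in path:
            raise ValueError(f"circular dependency involving {name!r}")
        if name not in cache:
            dependency_times = [
                visit(dep, path | {name}) for dep in depends_on.get(name, [])
            ]
            # Usable only when the system and all of its dependencies are back.
            cache[name] = max([recovery_hours[name], *dependency_times])
        return cache[name]

    return visit(system, frozenset())


for name in recovery_hours:
    print(name, recovery_hours[name], service_delivery_hours(name, recovery_hours, depends_on))
```

In this made-up example, order processing can be restored in 8 hours on its own, but it is not usable for 30 hours because it waits on the authentication system. That gap is the kind of surprise that testing the DRP is meant to expose.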

How much data can you afford to lose? This is expressed as the “Recovery Point Objective” or RPO. Depending on the environment, the loss of any data could have a significant impact. A rule of thumb is that the lower the RPO, the higher the overall cost of maintaining the environment for recovery.
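The rule of thumb can be made concrete with a small sketch. It assumes a simplified model in which data is protected only by periodic, evenly spaced backups, so the worst-case loss equals the interval between backups. The function names are illustrative, not part of any standard tool.

```python
import math
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """True if the worst-case data loss (one full backup interval) fits within the RPO."""
    return backup_interval <= rpo


def backups_per_day(rpo):
    """Minimum number of evenly spaced backups per day needed to stay within the RPO."""
    return math.ceil(timedelta(days=1) / rpo)


# A nightly backup cannot satisfy a 4-hour RPO.
print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))   # False

# Tightening the RPO increases the number of backups to run and store.
print(backups_per_day(timedelta(hours=4)))     # 6
print(backups_per_day(timedelta(minutes=15)))  # 96
```

Moving from a 4-hour RPO to a 15-minute RPO raises the backup count from 6 to 96 per day in this model, which is one way the cost of a lower RPO shows up in practice.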

Disaster Recovery Plan

A disaster recovery plan (DRP) is a documented process or set of procedures to recover and protect a business’s IT infrastructure in the event of a disaster. Such a plan, ordinarily documented in written form, specifies the procedures an organization is to follow in the event of a disaster. It is “a comprehensive statement of consistent actions to be taken before, during and after a disaster.” The disaster could be natural, environmental or man-made. Man-made disasters could be intentional (for example, an act of terrorism) or unintentional (that is, accidental, such as the failure of a man-made dam).

Given organizations’ increasing dependency on information technology to run their operations, a disaster recovery plan, sometimes erroneously called a Continuity of Operations Plan (COOP), is increasingly associated with the recovery of information technology data, assets, and facilities.
