Disaster Recovery Plan (DRP) in Business

Fire, flood, earthquake and accidental deletion of data are all events that can have disastrous consequences for data. Such disasters can prevent the network from operating normally, which in turn can hamper the organization’s business. These disasters can be classified into man-made disasters and environmental disasters. Man-made disasters are caused by humans, intentionally or unintentionally. For example, a user may accidentally delete data, or viruses and other malicious programs may damage it, causing data loss and downtime. Environmental disasters cannot be prevented, but their impact can be reduced if appropriate precautions are taken. Environmental disasters include fire, flood, earthquake, tornado and hurricane.

Disaster recovery deals with the recovery of data that has been damaged by destructive events. The time required to recover from a disaster depends on the disaster recovery plan implemented by the organization. A good disaster recovery plan can keep the disruption to the organization to a minimum.

Disaster Recovery Plan/Business Continuity Plan

A Disaster Recovery Plan (DRP) helps to identify threats to an existing business, such as terrorism, fire, earthquake and flood. It also provides guidance on how to deal with such events when they occur. Disasters are unpredictable; hence, planning for the worst is important for any business. A DRP is often discussed together with a Business Continuity Plan (BCP), and the two terms are sometimes used interchangeably. The difference between them is one of focus. A Business Continuity Plan focuses on keeping the organization's operations running, whereas a Disaster Recovery Plan focuses on recovering and rebuilding the organization after a disaster has occurred. It includes the steps that are necessary to recover from a disaster.

The first step in creating a Disaster Recovery Plan is to identify all functions of the organization. Each function falls into one of the following categories, based on its criticality and importance:

Critical – This function is most important for business operations.
Necessary – This function is required, but the organization can manage without it for a short period.
Desired – This function is not required but will enhance the organization’s ability to conduct its mission efficiently.
Optional – This function does not affect the operation of the organization even if it is absent.

The next step is to determine the following information for each of the identified functions:

Under which category of organizational functions does the function fall?
Who is responsible for the operation of the function?
What does an individual need to perform the function?
When should the function be accomplished relative to other functions?
Where will the function be performed?
How is the function performed?

The document containing all the relevant information about the organization’s critical functions is called the Business Impact Assessment (BIA). This document has to be approved by the organization’s management. The Disaster Recovery Plan includes the processes and procedures required to restore the organization’s data so that the organization is functioning again, and it ensures the continuity of its operations.
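
The questions above map naturally onto one record per function. The following is a minimal sketch of how a BIA entry could be represented, not a standard format: the class names, field names and sample functions are all hypothetical, and the ordering considers only the category, not dependencies between functions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Criticality(IntEnum):
    """The four categories from the text, most important first."""
    CRITICAL = 1
    NECESSARY = 2
    DESIRED = 3
    OPTIONAL = 4


@dataclass
class BusinessFunction:
    """One hypothetical BIA entry; each field answers one question."""
    name: str
    category: Criticality   # under which category does the function fall?
    owner: str              # who is responsible for its operation?
    resources: list[str]    # what is needed to perform it?
    depends_on: list[str]   # when: functions that must be in place first
    location: str           # where is it performed?
    procedure: str          # how is it performed?


def recovery_order(functions: list[BusinessFunction]) -> list[str]:
    """Return function names sorted by category, most critical first."""
    return [f.name for f in sorted(functions, key=lambda f: f.category)]


# Illustrative entries only; these functions and owners are invented.
functions = [
    BusinessFunction("Staff newsletter", Criticality.OPTIONAL, "HR",
                     [], [], "Head office", "Email template"),
    BusinessFunction("Order processing", Criticality.CRITICAL, "Sales",
                     ["order database"], [], "Head office", "Runbook A"),
    BusinessFunction("Payroll", Criticality.NECESSARY, "Finance",
                     ["payroll system"], [], "Head office", "Runbook B"),
]
print(recovery_order(functions))
```

Sorting by category gives a first draft of the order in which functions would be restored; a real plan would also take the dependencies recorded in each entry into account.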

Data Backup

Backing up data is an essential disaster recovery method that must be implemented in any organization. Performing regular backups, and testing them by performing regular restores, is the basic requirement for a good recovery plan. If critical data is lost or corrupted due to a disaster or a failure in hardware or storage media, the data cannot be recovered unless a backup is available. Hence, backup is an essential part of a BCP. In addition, a backup of all the organization’s data is critical in case of a security failure, in which an unauthorized individual gains access to important data. Backups include not only data but also the application programs that process the data, the operating system, and the utilities that the hardware platform requires to run the applications. These items are backed up at different frequencies.

While performing a data backup the following points must be considered:

Who is responsible for data backup?
Where will the backups be stored?
How long should a backup last?

Types of Data Backups

The process of creating a backup copy of the data and software must take into account the size of the resulting backup and the time required to perform it. Both considerations affect how frequently backups can be taken. The backup type can be determined only after the administrator has decided which information needs to be backed up.

There are three main types of backup:

Full Backup – In a full backup, all data is backed up onto the storage media. It requires a large amount of time to create and a large amount of space to store. However, the restoration process is quite fast.
Incremental Backup – An incremental backup backs up only the data that has changed since the last full or incremental backup. The number of files to be backed up is therefore much smaller than in a full backup. This reduces the backup time but increases the restoration time: during restoration, the system has to be restored from the last full backup, followed by each incremental backup in turn.
Differential Backup – In a differential backup, only the data that has changed since the last full backup is backed up. The time required to create a differential backup is much less than for a full backup. A differential backup does not mark files as having been backed up, so each one contains every change made since the last full backup. Restoration is faster than with incremental backups because the system needs only the latest full backup and the latest differential backup.
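
The difference between the three types comes down to which point in time a file's last modification is compared against. The following is a minimal sketch under simplifying assumptions: files are represented as a hypothetical mapping from name to last-modified time, times are plain numbers, and the function name and parameters are invented for illustration rather than taken from any backup tool.

```python
def files_to_back_up(files, backup_type, last_full, last_full_or_incremental):
    """Return the names of the files a backup of the given type would copy.

    files: dict mapping file name -> last-modified time (any comparable value)
    last_full: time of the most recent full backup
    last_full_or_incremental: time of the most recent full or incremental backup
    """
    if backup_type == "full":
        # A full backup copies everything, changed or not.
        return sorted(files)
    if backup_type == "incremental":
        # Changes since the last full OR incremental backup.
        cutoff = last_full_or_incremental
    elif backup_type == "differential":
        # Changes since the last FULL backup only.
        cutoff = last_full
    else:
        raise ValueError(f"unknown backup type: {backup_type!r}")
    return sorted(name for name, modified in files.items() if modified > cutoff)


# Hypothetical timeline: full backup at time 0, an earlier backup at time 3.
files = {"a.txt": -1, "b.txt": 2, "c.txt": 4}
print(files_to_back_up(files, "full", 0, 3))
print(files_to_back_up(files, "incremental", 0, 3))
print(files_to_back_up(files, "differential", 0, 3))
```

Here b.txt changed before the backup at time 3, so an incremental backup skips it, while a differential backup copies it again because it changed after the full backup.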

In case of a disaster, the data can be recovered from the backup media on which the backups are stored. For example, suppose a full backup is taken on Friday, incremental backups are taken on Monday and Tuesday, and the system crashes on Wednesday. Restoring the system then requires all three backup media: Friday’s full backup and both incremental backups, from Monday and Tuesday.

If, instead, differential backups are taken on Monday and Tuesday, only two media are required to restore the data on Wednesday: the full backup taken on Friday and the last differential backup, taken on Tuesday.
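
The Friday to Wednesday example can be expressed as a small function that works out which media a restore needs. This is a sketch only: the function name and the labels are hypothetical, and it assumes a schedule that follows each full backup with either incremental or differential backups, not a mixture of both.

```python
def restore_chain(backups):
    """Return the labels of the media needed for a restore, in restore order.

    backups: list of (label, kind) tuples in chronological order, where
    kind is "full", "incremental" or "differential".
    """
    full_positions = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available to restore from")
    start = full_positions[-1]          # the most recent full backup
    chain = [backups[start][0]]
    later = backups[start + 1:]
    incrementals = [label for label, kind in later if kind == "incremental"]
    differentials = [label for label, kind in later if kind == "differential"]
    if incrementals and differentials:
        raise ValueError("mixed schedules are outside the scope of this sketch")
    if differentials:
        # Only the latest differential is needed on top of the full backup.
        return chain + [differentials[-1]]
    # Every incremental since the full backup is needed, oldest first.
    return chain + incrementals


incremental_week = [("Fri-full", "full"),
                    ("Mon-inc", "incremental"),
                    ("Tue-inc", "incremental")]
differential_week = [("Fri-full", "full"),
                     ("Mon-diff", "differential"),
                     ("Tue-diff", "differential")]
print(restore_chain(incremental_week))    # three media
print(restore_chain(differential_week))   # two media
```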

Rotation Schemes

It is important to keep at least one set of backup tapes offsite, so that not all media are stored in a single location. When backups are rotated between different sets of tapes, data is not always backed up to the same tapes, and a previous set is always available in another location.

The most common rotation scheme is Grandfather-Father-Son (GFS) rotation. The scheme uses three sets of tapes, one each for monthly, weekly and daily backups. By rotating the tapes, the GFS scheme stores and protects the data with a minimum number of tapes. The tape set for each week of the month is rotated back into service and reused.

In a GFS backup schedule, a full backup is performed every week, and incremental or differential backups are taken daily. At the end of the week, the daily and weekly backups are stored offsite and a new set of tapes is used for the next week. With GFS backup, at least one full backup is performed before the rotation begins. The scheme then proceeds as follows:

Daily backups, either incremental or differential, are written to the son tapes. Son tapes can be reused each week.
Weekly full backups are written to the father tapes. A father tape set is required for each week of the month except the last, whose full backup is the monthly backup.
Monthly full backups are written to the grandfather tape set. These tapes are not reused and are stored offsite.
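
The rotation above can be sketched as a function that decides which tape set a given day's backup belongs to. The text does not say on which day the full backups run, so this sketch assumes they run on Fridays and that the last Friday of each month is the monthly backup; the function name is hypothetical.

```python
import datetime

FRIDAY = 4  # datetime.date.weekday() counts Monday as 0


def gfs_tape_set(day: datetime.date) -> str:
    """Return "son", "father" or "grandfather" for the backup taken on `day`.

    Assumption: full backups run on Fridays, and the last Friday of the
    month is the monthly (grandfather) backup.
    """
    if day.weekday() != FRIDAY:
        return "son"            # daily incremental or differential backup
    next_friday = day + datetime.timedelta(days=7)
    if next_friday.month != day.month:
        return "grandfather"    # last full backup of the month
    return "father"             # ordinary weekly full backup


for d in (datetime.date(2024, 1, 8),    # a Monday
          datetime.date(2024, 1, 5),    # first Friday of January 2024
          datetime.date(2024, 1, 26)):  # last Friday of January 2024
    print(d.isoformat(), gfs_tape_set(d))
```

Because the last Friday of the month goes to the grandfather set, the father tapes are needed only for the remaining weeks, which matches the rule stated above.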

Storage of Backups

An important element to factor into the cost of the backup strategy is the expense of storing the backups. A simple backup strategy can store all backups together for quick and easy recovery actions.

If a disaster occurs, backups that are all stored at the same location could be lost together. To avoid this, it is recommended to store the most recent copies locally and older copies at separate locations.

Depending on the level of security desired and on threats such as flood or fire, offsite data storage is a further, more recent option. The offsite data must be processed somewhere, and the location where this happens is called a backup site. A backup site is either operated by the organization itself or contracted from a company that specializes in disaster recovery services.

There are three types of backup site:

Cold Site – A cold site provides only the physical space for recovery operations; the organization using the space provides its own hardware and software.
Hot Site – A hot site is a fully operational offsite data processing facility, equipped with both system hardware and software. After a disaster, the organization can relocate to the hot site with minimum downtime.
Warm Site – A warm site is a compromise between a hot site and a cold site. It is partially equipped with system hardware and software, communication equipment and a power supply.

Utilities

Services such as electricity, telephone, Internet communication and wireless service are essential to keep the business running. However, these services can be interrupted during a disaster. Since computers and network devices require power to operate, organizations must be equipped with Uninterruptible Power Supply (UPS) devices. These devices provide backup power for a short period of time. For extended power failures, organizations must be equipped with a backup emergency generator. These generators contain fail-over switches that enable them to start automatically if the power supply in the organization fails.

High Availability and Fault Tolerance

High availability and fault tolerance are measures that keep the business operating in the event of a system failure. One of the objectives of security is the availability of data, and of the processing of that data, whenever an authorized user requires it. Data availability can be provided through a Redundant Array of Independent Disks (RAID). RAID levels that store redundant data enable a server to continue operating without losing any data during a hard-disk failure. High availability is more than data redundancy; it requires that both data and services be available throughout. Server clustering also increases availability by ensuring that if a server becomes unavailable due to failure or planned downtime, another server in the cluster takes over the workload.
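
To illustrate how redundancy lets an array survive a disk failure, the following sketch uses XOR parity, the idea behind parity-based RAID levels. It is a toy model under stated assumptions, not a RAID implementation: each "disk" is a short byte string of equal length, the block contents are made up, and the text above does not specify which RAID level is meant.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 4-byte block.
data = [b"ACCT", b"2024", b"DATA"]

# The parity block is the XOR of all data blocks and is kept on a fourth disk.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR-ing the surviving data blocks with
# the parity block reproduces the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The rebuild works because XOR-ing a value with itself cancels out, so everything except the missing block disappears from the result. This protects against the loss of a single disk only.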