

Taveesak Saengthong
Q: Why is business continuity planning critical in today's business context?
A: Fast and continuous access to information is critical in today's non-stop business environment where just-in-time delivery, global supply chains and round-the-clock customer service are the order of the day. When the systems that store and process this information are disrupted, the consequences can be serious.
The Meta Group estimates lost revenue from downtime at an average of US$1 million an hour, while Contingency Planning Research says losses go as high as US$6.45 million an hour for retail brokerages.
In recent years, the issue of data protection has also taken on a legal dimension as more and more regulations are enacted to ensure information integrity. United States regulations on data protection now apply to healthcare (HIPAA), financial services (SEC 17a-4), corporate accountability (Sarbanes-Oxley Act), life sciences (21 CFR Part 11), and government (DoD 5015.2-STD). These impact not only businesses in the US, but also the vast majority of companies around the world that have dealings with the world's largest economy.
Juxtaposed against these business imperatives is the reality that system disruption cannot be totally eliminated. Its causes are varied: they range from human error, system malfunction and hacking incidents to disasters such as the 9/11 attacks and the devastating Asian tsunami of 2004.
Less frequently mentioned, but no less of a concern in today's 24x7 business, is the need to manage the planned downtime required for routine maintenance activities.
The challenge for businesses, therefore, is to ensure that they are able to restart operations as quickly as possible after a disruption. In recent years, these concerns have pushed business continuity planning to the top of the corporate agenda.
Q: How do you develop a business continuity strategy?
A: Business continuity planning is about developing and applying procedures and technologies to ensure that critical information systems remain available or can be brought back into service quickly when unplanned and planned outages occur.
The development of a business continuity strategy starts with an assessment of risks and the probability of their occurring in an enterprise.
According to David Honour, editor of Continuity Central, four things need to be considered when evaluating the threats to a business:
• The causes of the risk
• The likelihood that it will occur
• The potential hazard or impact if the risk does occur
• The length of time a risk event may affect the business
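As a rough illustration of how these four factors might be combined, the sketch below ranks a handful of risks by an expected annual loss. The scoring formula (likelihood x hourly impact x duration), the `Risk` class and all of the figures are assumptions made for the example; they are not taken from Honour or from any particular methodology.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    cause: str              # what triggers the event
    likelihood: float       # assumed probability of occurring in a year (0.0 to 1.0)
    impact_per_hour: float  # assumed loss per hour of disruption, in US$
    duration_hours: float   # how long the event is expected to affect the business

    def expected_annual_loss(self) -> float:
        # Hypothetical score: probability x hourly impact x duration.
        return self.likelihood * self.impact_per_hour * self.duration_hours


def rank_risks(risks):
    """Return the risks ordered from highest to lowest expected annual loss."""
    return sorted(risks, key=lambda r: r.expected_annual_loss(), reverse=True)


# Illustrative figures only.
risks = [
    Risk("human error", 0.50, 100_000, 2),
    Risk("system malfunction", 0.20, 100_000, 8),
    Risk("site disaster", 0.01, 100_000, 72),
]

for risk in rank_risks(risks):
    print(f"{risk.cause}: US${risk.expected_annual_loss():,.0f}")
```

With these made-up inputs, a frequent but short outage can outrank a rare disaster, which is why likelihood and duration need to be weighed alongside impact.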
Once there is a clearer picture of the risks involved, a business impact analysis will help to determine which applications require the most protection based on value of the data and the business impact of downtime. The degree of resiliency required can then be determined based on the recovery-time objective (RTO) and the recovery-point objective (RPO).
RTO defines the time frame in which specific business operations must be restored, while RPO defines the point in time to which the data has to be restored in order to resume processing transactions. The specific RTO and RPO determine which data replication and recovery option a business needs and how much it will cost.
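The two objectives can be checked against the timeline of an outage. The following is a minimal sketch, assuming hypothetical timestamps and a hypothetical helper named `meets_objectives`: the data lost is the gap between the last recoverable copy and the failure, and the downtime is the gap between the failure and the moment service is restored.

```python
from datetime import datetime, timedelta


def meets_objectives(last_copy, failure, restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single outage.

    last_copy: time of the most recent recoverable copy of the data
    failure:   time the disruption occurred
    restored:  time business operations resumed
    """
    data_loss_window = failure - last_copy  # work that cannot be recovered
    downtime = restored - failure           # time operations were unavailable
    return data_loss_window <= rpo, downtime <= rto


# Illustrative scenario: nightly copy at 02:00, failure at 09:30, service back at 13:30.
last_copy = datetime(2005, 1, 10, 2, 0)
failure = datetime(2005, 1, 10, 9, 30)
restored = datetime(2005, 1, 10, 13, 30)

rpo_met, rto_met = meets_objectives(
    last_copy, failure, restored,
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
)
print(f"RPO met: {rpo_met}, RTO met: {rto_met}")
```

In this scenario the four-hour RTO is just met, but the one-hour RPO is missed because seven and a half hours of updates were never copied. A nightly copy cannot satisfy an RPO shorter than the interval between copies.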
Q: What are some of the business continuity options available? What should you consider when evaluating these options?
A: Business continuity solutions usually combine on-site solutions with implementations that replicate data to a remote site.
On-site solutions allow non-disruptive backup, eliminate single points of failure and provide automation for business continuity procedures to ease the process of replication and backup and to speed data restoration when such a task is required.
For replication to a remote site, the traditional approach has been to save data to tape and send the tapes off for safe storage in a remote vault.
However, as IDC analysts Richard Villars and Bill North point out in their white paper Business continuity: More than just waiting for disaster, "In today's increasingly complex IT environment, where multiple applications and data sets are interdependent, data synchronization creates a new challenge for tape-based disaster recovery strategies. IT managers must be able to create simultaneous point-in-time copies for multiple data sets, thereby creating I/O-consistent copies."
One way to overcome the limitations of tape is to implement local mirroring to eliminate the backup window and lessen the backup's impact on server performance, and then replicate the backup data to an offsite facility.
Data replication can be asynchronous or synchronous. Synchronous replication ensures that a remote copy of the data identical to the primary copy is created at the same time the primary copy is updated. An incomplete operation is rolled back at both locations, ensuring that the remote copy is always an exact mirror image of the primary.
One benefit of synchronous replication is that data can be recovered quickly. Operations at the remote, mirrored site can begin immediately where the primary site stopped, should operations at the primary site be disrupted. However, there is a distance limitation due to latency: each storage operation has to be confirmed before the system proceeds with the next. Over longer distances, say, more than 35 km (20 miles), the resulting propagation delay may not be acceptable for busy systems.
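The distance limit can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that a signal travels through optical fibre at roughly 200,000 km per second (about two-thirds of the speed of light in a vacuum) and that each write needs one round trip to be confirmed. Both are simplifying assumptions: real links add equipment and protocol overhead, and some protocols need more than one round trip per write.

```python
# Assumption: signal speed in optical fibre, roughly two-thirds of c.
FIBRE_KM_PER_SEC = 200_000.0


def round_trip_ms(distance_km: float, round_trips: int = 1) -> float:
    """Propagation delay in milliseconds added to each confirmed write."""
    one_round_trip_sec = 2 * distance_km / FIBRE_KM_PER_SEC
    return one_round_trip_sec * 1000 * round_trips


for km in (10, 35, 100, 1000):
    delay = round_trip_ms(km)
    print(f"{km:>5} km: {delay:.2f} ms added per write")
```

Under these assumptions the propagation delay alone is about 0.35 ms per write at 35 km and about 10 ms at 1,000 km. Because every write waits for confirmation, the delay accumulates across a busy system's writes, which is why synchronous replication is normally confined to nearby sites.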
Asynchronous replication provides a way to mirror sites across any distance, and usually involves IP networks. With asynchronous replication, the application writes the data to primary storage and continues with the next operation. The remote write is decoupled from the primary write and, unlike synchronous replication, requires no confirmation step before the application proceeds.
The benefit of asynchronous replication, therefore, is that it can span any distance without impact from propagation delay. However, the slight time lag between data being stored at the primary and remote sites can result in an I/O inconsistency between the two locations and possible loss of data.
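The difference between the two modes can be shown with a toy in-memory model. This is only a sketch: the class and method names are hypothetical, Python lists stand in for storage arrays, and no real replication product works this simply.

```python
from collections import deque


class SyncReplica:
    """Toy model: a write completes only when both copies hold it."""

    def __init__(self):
        self.primary = []
        self.remote = []

    def write(self, record, remote_ok=True):
        self.primary.append(record)
        if not remote_ok:
            # The remote write failed, so roll back the primary as well.
            self.primary.pop()
            return False
        self.remote.append(record)
        return True


class AsyncReplica:
    """Toy model: a write completes locally and is shipped to the remote later."""

    def __init__(self):
        self.primary = []
        self.remote = []
        self.pending = deque()  # written locally, not yet at the remote site

    def write(self, record):
        self.primary.append(record)
        self.pending.append(record)
        return True

    def ship(self, count=1):
        # Transfer up to `count` pending records to the remote site, in order.
        for _ in range(min(count, len(self.pending))):
            self.remote.append(self.pending.popleft())

    def lost_if_primary_fails(self):
        # Records that exist only at the primary at this moment.
        return list(self.pending)


async_pair = AsyncReplica()
for txn in range(5):
    async_pair.write(txn)
async_pair.ship(3)
print("Remote copy:", async_pair.remote)
print("At risk if the primary fails now:", async_pair.lost_if_primary_fails())
```

In the synchronous model the two copies never diverge. In the asynchronous model, whatever is still in the pending queue when the primary fails is the data that would be lost.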
A third option is a three data centre (3DC) model. Synchronous replication is employed between the primary data centre and a nearby hot site, and then the data is replicated to a geographically remote site using asynchronous replication.
A 3DC configuration provides the best combination of protection against disaster with minimal data loss and downtime, but at a significantly higher cost.
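The trade-offs described above can be summarized as a simple decision rule. The sketch below is an assumption-laden illustration rather than a sizing tool: the function name and its parameters are hypothetical, the 35 km limit is the rule-of-thumb figure quoted earlier, and the idea that tape vaulting suits data that may be a day old is an assumption made for the example.

```python
# Rule-of-thumb distance limit for synchronous replication, as quoted in the text.
SYNC_DISTANCE_LIMIT_KM = 35


def suggest_option(zero_data_loss: bool,
                   site_distance_km: float,
                   tolerates_day_old_data: bool = False) -> str:
    """Suggest a remote-copy approach from coarse recovery requirements."""
    if zero_data_loss:
        if site_distance_km <= SYNC_DISTANCE_LIMIT_KM:
            return "synchronous replication"
        # No data loss plus a distant site: combine both modes.
        return "3DC (synchronous to a nearby hot site, asynchronous to the remote site)"
    if tolerates_day_old_data:
        return "tape vaulting"
    return "asynchronous replication"


print(suggest_option(zero_data_loss=True, site_distance_km=20))
print(suggest_option(zero_data_loss=True, site_distance_km=800))
print(suggest_option(zero_data_loss=False, site_distance_km=800))
```

A real evaluation would also weigh cost, bandwidth and recovery time, which this rule deliberately leaves out.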
The choice is a complex one but ultimately, adopting a sound business continuity solution will produce clear business benefits.
Q: What are the benefits of implementing a business continuity solution?
A: Valuable information assets can be protected from site disasters, viruses, human error, and system failures by providing alternate remote and local copies of data and enabling rapid restart and recovery.
Today, advanced software tools simplify disaster recovery by minimizing the labour-intensive nature of backup and recovery procedures and eliminating inefficient and resource-intensive tape-based replication/retrieval processes.
The technology allows for rapid restart and recovery to be achieved over long distances with minimal impact on application response time. Non-disruptive point-in-time copy capabilities in software, for example, allow business applications to continue reading and updating data while a replica of that data is split off and used for other purposes. This means data can be accessed for testing, data mining and other operations without any impact on production systems.
The remote replication site can also take over primary operations on a planned basis to allow for periodic maintenance of the primary systems.
Within the wider IT context, the consolidation of resources and the standardization of IT practices on fewer storage platforms as part of a business continuity solution reduces the burden on IT staff and paves the way for greater efficiencies in storage utilization.
"Rather than backing up server-attached storage from a multitude of departmental email servers, an approach that requires widely dispersed tape backup systems and administrative support, IT storage managers can leverage a SAN (storage area network)," say Villars and North.
They note that current efforts to consolidate servers and storage that are often widely dispersed and managed in uncoordinated ways set the stage for simpler and better data protection schemes.
Ultimately, businesses have to weigh cost considerations against a careful assessment of risk and a clear understanding of recovery objectives in order to arrive at the business continuity solution that is best for them.