Sunday, September 05, 2010
Contact Info.

Karthik Kumaraguru
Cincinnati, OH 45244


Call: +1 (513) 428-9428


Email: admin[ AT] kkarthik [DOT] info



Disaster Recovery Planning

To support our internal IT governance, last week I set out to learn how other organizations handle disaster recovery: How extensive are their disaster recovery plans? In what time frame do they aim to restore all critical systems? What are their ideas and approaches? I spoke with many people, corresponded by email and phone with senior executives and managers among my friends, and made the following observations.


Most organizations have some disaster recovery or business continuity plan in place. However, the detail and depth of such plans depend on the size and industry of the organization. Almost all sources pointed to the importance of managing disaster recovery and business continuity for mission-critical infrastructure. While manufacturing companies see their manufacturing infrastructure as critical, most service-sector organizations consider their data-center locations and IT infrastructure mission-critical.


Large, long-established IT infrastructure tends to have better DR plans than newer or recently updated infrastructure. This suggests that many organizations have no policy of continuously updating their DR plans and instead update them in bursts. For example, a vice-president from a large insurance company said, “We have fairly comprehensive plans for our mainframe, however, next to nothing for our Open Systems environment, which is continuing to grow. We haven’t established a timeframe for recovery because our business units are unable to prioritize their need for systems. Worst case, we believe we can recover everything in a month, and to date, the business has stated that should be fine, however, there are some things – e-mail, electronic document management system, claims system, etc. that would be required within 24-48 hours.”


This observation is further supported by the fact that even the most publicized disaster recovery plans, displayed on organizations’ web sites (some intranet, some internet), were somewhat outdated. Every one of them was dated 2001 or 2002.



Many statistics point to the importance of disaster recovery planning. According to different studies:



  • 0.2% of all data-centers face some disaster every year.
  • Downtime costs the average U.S. business $210,000 per hour.
  • U.S. businesses with more than 1,000 employees lose about 2% of their annual revenue to network downtime.
  • Up to 50% of all system failures can be attributed to environmental or physical failures.
  • 93% of companies that lose data center access for 10 days file for bankruptcy within a year. Half of that 93% file immediately.
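To put the hourly figure in perspective, here is a minimal sketch that multiplies it out for a few outage lengths. It assumes the quoted $210,000 average applies uniformly for every hour of the outage, which is a simplification; the function name `downtime_cost` is only illustrative.

```python
# Illustrative arithmetic only: the hourly figure is the average quoted in
# the list above, and real costs vary widely by business.
AVERAGE_COST_PER_HOUR = 210_000  # US dollars per hour of downtime


def downtime_cost(hours, cost_per_hour=AVERAGE_COST_PER_HOUR):
    """Return the estimated cost in dollars of an outage lasting `hours`."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * cost_per_hour


for hours in (1, 24, 48):
    print(f"{hours:>2} h outage: ${downtime_cost(hours):,}")
```

At that average rate, a 24-48 hour outage works out to roughly $5 million to $10 million.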


The timeframe in which mission critical processes would be resurrected really depends on what organizations perceive as important and urgent. In other words, these answers were business-specific.


A senior project leader from a large regional bank says, “[In my company,] different systems have different back up and recovery processes and time frames based on their relevance to core business, ranging from 24 hours to 2 weeks.”



IT-centric organizations have robust redundancy, business continuity, and disaster recovery plans. One senior program manager from a large IT solutions company says, “We host all our systems in a Class-A data center with independent, redundant power, I/C, environmental control. All devices have redundant clustered backups with auto-fail over. The data center also includes power conditioning, battery backup and generator backup for longer outages. These data centers have survived 9-11, the east coast power outage (not one server went down) and other “disasters”. We have a team of folks on 24-hour call to address any other issues, and all devices are under their respective manufacturers’ extended maintenance options.”



To determine what needs to be recovered in what time frame, almost all DR advocates recommend conducting a business impact analysis. For example, Rich Schiesser from HarrisKern says, “Even the most thorough of disaster recovery plans will not be able to cost justify the expense of including every business process and application in the recovery. An inventory and prioritization of critical business processes should be taken representing the entire company. Processes that need to resume within 24 hours preventing serious business impact, such as loss of revenue or major impact to customers are rated A as a priority. Those processes that need to resume within 72 hours are rated B, and greater than 72 hours are rated C. These identifications and prioritizations will be used to propose business continuity strategies.” (Rich Schiesser, “Disaster Recovery”, HarrisKern)
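The A/B/C rating in the quote above can be sketched in a few lines of code. This is a minimal illustration, not Schiesser's own tooling: the function name `priority_rating` and the sample inventory are hypothetical, and the 24- and 72-hour boundaries are treated as inclusive.

```python
def priority_rating(hours_to_resume):
    """Rate a business process A, B, or C by how soon it must resume."""
    if hours_to_resume <= 0:
        raise ValueError("hours_to_resume must be positive")
    if hours_to_resume <= 24:
        return "A"  # must resume within 24 hours
    if hours_to_resume <= 72:
        return "B"  # must resume within 72 hours
    return "C"      # can wait longer than 72 hours


# Hypothetical inventory: process name -> hours within which it must resume.
inventory = {
    "order entry": 12,
    "payroll": 72,
    "quarterly reporting": 240,
}

ratings = {name: priority_rating(hours) for name, hours in inventory.items()}
print(ratings)
```

In a real business impact analysis the hours would come from the business units, not from IT.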



In the DR planning projects in which I was involved, we likewise always began the development of the plan with a “Business recovery functions and needs analysis”. In that process, we determined all our daily functions, identified the resources needed for those functions, and prioritized the recovery of functions and thus the recovery of resources.
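One way to picture how function priorities carry over to resources is the sketch below. It assumes each resource inherits the priority of the most urgent function that needs it; that rule, the function name `resource_recovery_order`, and the sample data are my illustration, not a record of what our projects used.

```python
def resource_recovery_order(functions):
    """Order resources by the most urgent function that depends on them.

    `functions` maps a function name to (priority, [resources]),
    where priority 1 is the most urgent.
    """
    best = {}
    for priority, resources in functions.values():
        for resource in resources:
            best[resource] = min(priority, best.get(resource, priority))
    # Sort by inherited priority, then by name for a stable order.
    return sorted(best, key=lambda r: (best[r], r))


# Hypothetical daily functions and the resources they need.
functions = {
    "process payroll": (2, ["payroll database", "office network"]),
    "answer customer calls": (1, ["phone system", "office network"]),
    "publish newsletter": (3, ["desktop publishing software"]),
}

print(resource_recovery_order(functions))
```

Here the office network is recovered first even though payroll is only priority 2, because customer calls also depend on it.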



Here is a step-by-step outline of a sample DR plan strategy.




  1. Understand Organizational Mission
  2. Understand Disaster

    1. Environmental (Physical)
    2. Equipment
    3. People
    4. Social
    5. Market place

  3. Understand the need to recover

    1. People/Skills
    2. Data/Information
    3. Assets/Materials

  4. Determine all your roles
  5. Determine what functions fall under each role
  6. Determine what resources are needed to accomplish those functions

    1. Knowledge/People?
    2. Other Functions?
    3. Hardware/Communication?
    4. Software/Data?
    5. Services?
    6. Equipment/Tools?
    7. Environmental?
    8. Administrative?

  7. Determine Recovery Lead Team and Execution team for each Function group (Have backup teams for everything)
  8. Document all the following

    1. Recovery teams
    2. Recovery backup teams
    3. Contact info
    4. Vendors
    5. Suppliers
    6. Alternate headquarters

In the event of disaster:


----------------------------------


For each of the following time slots:




    1. 0-24 hours
    2. 2-3 days
    3. 4-7 days
    4. 8-15 days
    5. 16-30 days
    6. 2-3 months
    7. 4-6 months
    8. Never


  1. List the functions to rebuild in that time slot, in priority order
  2. List the resources needed to accomplish those functions
  3. List vendors who could supply those items in that time frame

    1. List contact names, numbers, account numbers, etc.
    2. Establish a relationship with them so they will rise to that emergency and time frame
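The time-slot exercise above can be sketched as a small grouping routine. This is a rough illustration under stated assumptions: months are approximated as 30 days, a target of `None` stands for "Never", and all function, resource, and vendor names are hypothetical.

```python
# Time slots from the list above as (label, upper bound in days).
# Assumption: a month is approximated as 30 days.
TIME_SLOTS = [
    ("0-24 hours", 1),
    ("2-3 days", 3),
    ("4-7 days", 7),
    ("8-15 days", 15),
    ("16-30 days", 30),
    ("2-3 months", 90),
    ("4-6 months", 180),
]
NEVER = "Never"


def time_slot(target_days):
    """Return the slot label for a recovery target in days (None = never rebuilt)."""
    if target_days is None:
        return NEVER
    for label, upper_bound in TIME_SLOTS:
        if target_days <= upper_bound:
            return label
    raise ValueError("target is beyond the last time slot (6 months)")


def plan_by_slot(functions):
    """Group functions by time slot, in priority order (1 = most urgent)."""
    plan = {}
    for func in sorted(functions, key=lambda f: f["priority"]):
        entry = plan.setdefault(
            time_slot(func["target_days"]),
            {"functions": [], "resources": [], "vendors": []},
        )
        entry["functions"].append(func["name"])
        for key in ("resources", "vendors"):
            for item in func[key]:
                if item not in entry[key]:  # avoid listing an item twice
                    entry[key].append(item)
    return plan


# Hypothetical functions, resources, and vendors, for illustration only.
functions = [
    {"name": "billing", "priority": 2, "target_days": 1,
     "resources": ["billing server", "network"], "vendors": ["Vendor A"]},
    {"name": "customer support", "priority": 1, "target_days": 1,
     "resources": ["phones", "network"], "vendors": ["Vendor B"]},
    {"name": "internal newsletter", "priority": 3, "target_days": None,
     "resources": [], "vendors": []},
]

for slot, entry in plan_by_slot(functions).items():
    print(slot, entry)
```

Vendor contact names, numbers, and account numbers would hang off each vendor entry in a fuller version.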




  1. Prepare the execution plan
  2. Prepare Task List/Check list for Recovery teams.
  3. Prepare a Disaster Recovery Backup/Maintenance/Knowledge management plan and religiously execute the plan.
  4. Review/Update these plans every 3 months.
  5. Always document incidents and responses.

Remember:


You can never foresee all types of disasters, as they are infinite and unique. But a firm plan will definitely help to overcome any disaster with considerable ease.




All content strictly copyrighted. Please ask before you copy/use the content on these pages.