Managing risk in the world of Cloud computing

December 6, 2010 By Hoofer
Grazed from ComputerWorld.  Author:  Kevin T. McDonald.

In embarking on a Cloud Computing project, it is important to assess the risks, come up with strategies to mitigate the risks and communicate any that aren’t sufficiently covered.

In practical terms this involves a mix of project management and business continuity best practices to arrive at a) the overall risk of the project, and b) what can be done to mitigate or lower the risk to acceptable levels.
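One common way to make "overall risk" and "acceptable levels" concrete is a simple risk register that scores each risk as likelihood times impact, before and after mitigation. The sketch below is illustrative only. The 1-to-5 scales, the threshold of 6, the `Risk` class and the sample entries are all assumptions, not something the article prescribes.

```python
from dataclasses import dataclass

# Assumed scales: likelihood and impact are each rated 1 (low) to 5 (high).
ACCEPTABLE_SCORE = 6  # hypothetical threshold; each organisation sets its own


@dataclass
class Risk:
    name: str
    likelihood: int            # before mitigation
    impact: int                # before mitigation
    mitigated_likelihood: int  # after mitigation
    mitigated_impact: int      # after mitigation

    @property
    def inherent_score(self) -> int:
        return self.likelihood * self.impact

    @property
    def residual_score(self) -> int:
        return self.mitigated_likelihood * self.mitigated_impact


def needs_escalation(risks, threshold=ACCEPTABLE_SCORE):
    """Return the risks whose residual score is still above the threshold."""
    return [r for r in risks if r.residual_score > threshold]


# Sample entries are invented for illustration.
register = [
    Risk("Extended local power outage", 2, 5, 2, 2),
    Risk("Internet link failure", 3, 4, 3, 1),
    Risk("Cloud provider outage", 2, 5, 2, 4),
]

for risk in needs_escalation(register):
    print(f"Communicate: {risk.name} (residual score {risk.residual_score})")
```

Risks that remain above the threshold after mitigation are the ones to communicate as not sufficiently covered.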

Risk assessments take into account fire, flood and other disruptions, including intentional and unintentional ones caused by people. There are multiple pathways that can disrupt the people, processes and technology that drive an organisation’s effectiveness.

Other outside dependencies like power and light, gas and water, postal services, inbound and outbound logistics (shipping), data and telecommunications are all likely to be providing inputs and managing outputs independent of your control and oversight.

Organisational impact: what would happen if?

What would happen to the organisation, customers, brand and staff if power was cut for the local area for an extended period?  What would happen if diesel or natural gas delivery was disrupted?  Are the generators capable of running on multiple fuels?  What is the minimum workspace required to perform the most essential tasks?  If that workspace was not available, what would happen?

Mitigation strategies: what can we do to lessen impact?

Mitigation strategies stack the deck in your favour to minimise the impact of events. If you depend upon the Internet to communicate, installing a satellite link and/or a cellular data link as a backup might prevent an outage. If voice is critical, alternatives such as voice over IP or cell over IP can provide communications in a crisis.
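The backup-link idea can be sketched as an ordered failover list: try the primary link first, then fall back to the next one that is working. The link names and the health checks below are hypothetical stand-ins. A real check would probe the actual connection.

```python
from typing import Callable, List, Optional, Tuple

# Each entry is (link name, health check). The lambdas used further down are
# placeholders (assumptions) standing in for real connectivity probes.
Link = Tuple[str, Callable[[], bool]]


def pick_link(links: List[Link]) -> Optional[str]:
    """Return the first link, in priority order, whose health check passes."""
    for name, is_up in links:
        if is_up():
            return name
    return None  # every link is down


links: List[Link] = [
    ("primary internet", lambda: False),  # simulate an outage on the main link
    ("satellite", lambda: True),
    ("cellular data", lambda: True),
]

print(pick_link(links))
```

Listing links in priority order keeps the decision simple enough to run, and to rehearse, under pressure.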

Continuity plans: keep going if the worst happens

A continuity plan is simply the formalisation of the steps you must take to continue operations in the face of a disruption. Once the ideas start flowing on how to keep disruptions from occurring, management will buy in to the alternative strategies and fund practice drills for the response strategies that kick in after an event has occurred.

Testing the continuity plan

Most organisations start testing by requesting comments on the written plan, then move to a structured walkthrough or a group edit, and then progress to a tabletop exercise.

In a tabletop exercise, each player represents a particular business role, such as the accounting manager or IT manager. The exercise referee announces the type of disruption. The players then walk through, in a timed round, what they would do about the disruption.

The exercise helps ensure that the plan has no gaps in coverage and the staff understand the plan well enough to execute under the pressure of real events. These are exercises. There is no right or wrong answer.

Figure 11: Business continuity planning

The only wrong answer is answering no to the question “Did we test this?” when we actually have to undergo the real thing. This level of planning helps identify areas of risk and also contributes to formally devising plans to reduce the risk to an acceptable level.