Is 99.999% reliability good enough?

According to Reuven Cohen in his recent post, Cloud Failure: The Myth of Nines, the whole concept of reliability may be meaningless.

“In the case of a physical failure such as Flexiscale’s recent one, the hardware downtime might be small, but the time to restore from a backup might be considerably longer. A minor cloud failure could cause a cascading series of software failures causing further application outage of hours or even days for those who depended on the availability of the given cloud. Meaning your cloud may achieve five nines, but your application hosted on it doesn’t.”

I agree. When dealing with a system of systems, like the cloud, SLAs for individual components and functions are meaningless on their own. The cloud architect must brush up on Bayesian probability theory, plan for failure, and ensure that no matter what happens, users can complete whatever workflow they request.
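To put rough numbers on this, here is a minimal Python sketch of how component availabilities combine when an application needs every component in a chain to be up. It assumes failures are independent, which real clouds often violate (cascading failures make things worse), and it ignores restore time entirely. The function names and the ten-component chain are my own illustration, not anything from Cohen's post.

```python
# Sketch only: assumes independent failures and ignores restore-from-backup time.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(component_availabilities) -> float:
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in component_availabilities:
        result *= availability
    return result


five_nines = 0.99999
# Hypothetical application that depends on ten five-nines components in series.
composite = serial_availability([five_nines] * 10)

print(f"One component: {downtime_minutes_per_year(five_nines):.1f} min/year")
print(f"Ten in series: {composite:.6f} availability, "
      f"{downtime_minutes_per_year(composite):.1f} min/year")
```

A single five-nines component is down a little over five minutes a year. Chain ten of them together and the application is closer to four nines, roughly ten times the downtime, even before any cascading failure or slow restore is counted.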

“One of the major benefits to using cloud computing is that you can make these types of failover assumptions well before they happen using an emerging global toolset of cloud components. It’s not a matter of if, but a matter of when. When you take into consideration that application components will fail, then you can build an application that features “failure as service”. One that is always available, one with Zero Nines.”
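As a rough illustration of planning for failure at the application level, here is a small Python sketch that tries a list of providers in order and returns the first successful result. Everything in it is hypothetical: the provider functions, the AllProvidersFailed exception, and the request string are stand-ins I made up, not part of any real cloud toolkit.

```python
# Sketch only: providers are plain callables standing in for real cloud services.
class AllProvidersFailed(Exception):
    """Raised when every provider in the list has failed."""


def run_with_failover(providers, request):
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # assume any exception means "try the next one"
            errors.append(exc)
    raise AllProvidersFailed(errors)


def flaky_primary(request):
    # Hypothetical primary cloud that is currently down.
    raise ConnectionError("primary cloud unavailable")


def healthy_secondary(request):
    # Hypothetical secondary cloud that can serve the same workflow.
    return f"handled {request} on secondary"


result = run_with_failover([flaky_primary, healthy_secondary], "order-42")
print(result)
```

The user's workflow completes even though the primary failed. A real design would also need timeouts, health checks, and data that is actually available on the secondary, none of which this sketch attempts.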

Follow me at https://Twitter.com/Kevin_Jackson

G C Network