Outages

Cloud backup could have prevented Delta's meltdown

Grazed from InfoWorld. Author: David Linthicum.

I hope you weren't flying Delta this week. If you were, you know that Delta's systems experienced an outage, and of course passengers bore the brunt of it in hundreds of canceled and delayed flights. When Delta performs a postmortem on this outage, it will likely find that the outage was caused by a common occurrence: network failure.

However, Delta could not recover or switch to backup systems. As my InfoWorld colleague Andrew C. Oliver wrote, Delta neglected the four pillars of high availability. Delta's CEO admitted as much, telling the Wall Street Journal that "it's not clear the priorities in our investment have been in the right place...

AWS Sydney's outage shows the value of a walk in the cloud

Grazed from The Register. Author: Simon Sharwood.

To understand the lessons of this week's Amazon Web Services outage in Sydney, which took down the local AWS cloud for a few hours, take a walk down Huntley Street, Alexandria, an unlovely street in a light industrial suburb. Huntley Street is interesting because its footpaths are riddled with an unusual concentration of telecoms duct covers that hide the wires and fibres bringing data into the Equinix data centre around the corner.

Another feature of Huntley Street is a bridge, currently under repairs, over a stormwater drain that feeds into a waterway called the Alexandria Canal. The Canal is at sea level, so Huntley Street is maybe a metre or two above the water. It's almost certainly below the hundred-year flood line. When Equinix opened the data centre it knew about the canal and the hundred-year line, because it told us servers there are all five metres off the ground so they don't get wet in a really big wet...

Amazon Web Services storm outages serve as a warning of cloud risk to businesses

Grazed from AFR. Author: Paul Smith.

Australian businesses have been warned they need to spread the risk in their cloud computing operations across different regions after the Sydney storms on Sunday knocked out the operations of numerous Amazon Web Services customers. The ferocious storms that hit NSW left AWS clients including Domino's Pizza, Foxtel, The Iconic, Stan and Domain without websites or key systems for hours.

It served as a warning that sending systems to the cloud, rather than hosting them on-premises, did not remove the risk of costly failures. The failure represents a major embarrassment for the company, which generated $US2.57 billion revenue in the latest quarter, based largely on the fact that it is perceived as being hugely reliable...

Google cloud falls over after routing error, strives to remove manual link activation

Grazed from CloudTech. Author: James Bourne.

Google Compute Engine went down for approximately 70 minutes last week, the company has confirmed, making certain Internet destinations unreachable from the europe-west1 region during that time. The issue first came to light at 1326 PST on November 23 with a status update, before a further missive at 1432 confirming the problems should have been resolved.

Four days later, Google explained what exactly went wrong. At 1151 PST on November 23, Google engineers activated a new peering link – with an unnamed provider who Google says it works with extensively – but during the activation, the provider’s estimations of how much capacity the link could take differed wildly from actual performance...

The Benefits of Cloud Computing - Protect your data from hardware mishaps

Grazed from LifeZette. Author: Dave Taylor.

Created any documents on your computer recently? Updated a spreadsheet while on a flight to Seattle, or edited a short movie for your child’s presentation? How about your smartphone: Taken any photos lately that you’d like to keep forever? Every single one of those files is at risk even as you sit and read this.

Storage devices and memory cards are much more reliable, but the problem is that we humans are still, well, human, and we spill coffee on our laptops, drop our cellphones and break them, and even lose tablets on airplanes in the hustle and bustle of getting off the flight and into the arms of loved ones. It’s inevitable, just as it’s inevitable that you’ll power up a computer at some point and see an error message telling you that the hard drive is kaput...

Read more from the source @ http://www.lifezette.com/popzette/benefits-of-cloud-computing/

Cloud Computing: AWS glitch strikes Netflix and Tinder, offering a wake-up call for others

Grazed from NetworkWorld. Author: Katherine Noyes.

Netflix, Tinder and other major websites were affected for a time Sunday by glitches in Amazon Web Services' Northern Virginia facility, offering a cautionary lesson to other companies that rely on the cloud service for mission-critical capabilities. The problem manifested itself primarily in the form of higher-than-normal error rates. Sites affected reportedly also included IMDb and Amazon's Instant Video and Books websites.

At the heart of the snafu were issues with AWS's DynamoDB database, but it spread to include other services such as EC2, the mobile-focused Cognito service and the CloudWatch monitoring service, according to the AWS Service Health Dashboard. "The root cause began with a portion of our metadata service within DynamoDB," AWS explained in a dashboard update posted at 4:52 a.m. PDT on Sunday...

Cloud Computing: AWS Outage Doesn't Change Anything

Grazed from Forbes. Author: Justin Warren.

If the latest AWS outage changes anything in your approach to cloud adoption, then you’re doing it wrong. This is not the first AWS outage (I first wrote about one in 2011, back when I had hair), nor will it be the last. Nor will it be only AWS that suffers another outage at some point in the future.

We’ve already seen outages from Office365, Azure, Softlayer, and Gmail. Outages are a thing that happens, whether your computing is happening in your office, in co-location, or in ‘the cloud’, which is just a shorthand term for “someone else’s computer”. To think that putting applications ‘in the cloud’ magically makes everything better is naive at best...

Read more from the source @ http://www.forbes.com/sites/justinwarren/2015/09/20/aws-outage-doesnt-change-anything/

Disaster recovery experts dig down into Azure cloud outages over past 12 months

Grazed from CloudTech. Author: James Bourne.

The majority of Microsoft’s service errors in the first quarter of 2014 were advisory, while there were significantly more service interruptions in the following three quarters, according to analysis carried out by CloudEndure. The figures, taken from Azure’s Service Health Dashboard across last year, show three full service interruptions in Q1, a whopping 28 in Q2, 16 in Q3 and zero in the final quarter.

The highest number of errors came in Q1 (259), yet that quarter also produced the lowest number of partial service interruptions (88), compared to 134, 129 and 127 for the other three quarters. The analysis came about after Azure suffered two debilitating outages last year: one in August, and one in November, which was caused by storage blob front ends going into an infinite loop – a process which went undetected during testing...

Read more from the source @ http://www.cloudcomputing-news.net/news/2015/jan/19/disaster-recovery-experts-dig-down-azure-cloud-outages-over-past-12-months/

Cloud Computing: Amazon data center on fire in Virginia

Grazed from Click2Houston. Author: Editorial Staff.

A large fire lit up the roof of an Amazon data center that's still under construction in a Virginia suburb outside Washington, D.C. No one was hurt in the blaze, according to the local fire department, which was able to put out the fire in under an hour. Amazon said the facility was in the early stages of becoming an Amazon Web Services cloud computing center.

The company noted that the incident didn't impact company production or shipping. The nondescript building will one day be one of Amazon's massive data centers -- the kind that keep websites running and store computer information for companies...

How to prepare for Verizon's 2-day cloud shutdown

Grazed from ComputerWorld. Author: Sharon Gaudin.

With some of Verizon's enterprise customers about to lose their cloud service for up to two days, now is the time for them to prepare for the extended downtime. "One of the major selling points of cloud computing is that it takes the burden of IT management off the shoulders of the customer, but with this outage, Verizon's customers are right back in the thick of things when it comes to IT management," said Dan Olds, an analyst with The Gabriel Consulting Group.

"They're going to have to figure out how to minimize the impact of the Verizon two-day fail on their business." Verizon confirmed to Computerworld this week that it will shut down its Verizon Cloud service to do maintenance for up to two days, starting at 1 a.m. ET Saturday. Users are being told to shut down their virtual machines at least an hour ahead of time...