cloud outages

Amazon Corrects Massive AWS S3 Cloud Outage While Vendors React

Article Written by David Marshall

Last Tuesday, parts of the Internet came to a grinding halt when the servers that powered them suddenly vanished. The disappearing act involved servers that were part of Amazon S3, Amazon's popular cloud storage service.

When that incident happened, several big and popular services and Web sites were disrupted, including DraftKings, Gizmodo, IFTTT, Quora, Slack and Trello.

According to the Web site monitoring firm Apica, 54 of the largest online retailers experienced performance impairments on their Web sites, with some slowing down by more than 20 percent; three sites went down completely (Express, Lululemon, One Kings Lane); and affected Web sites took an average of 29.7 to 42.7 seconds to load.

What happened?

"At 9:37 a.m. PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process," Amazon said.  "Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.  The servers that were inadvertently removed supported two other S3 subsystems."

Those subsystems are important. One of them, the index subsystem, "manages the metadata and location information of all S3 objects in the region," according to Amazon. Without it, services that depend on S3 couldn't perform basic data retrieval and storage tasks. The second, the placement subsystem, "manages allocation of new storage and requires the index subsystem to be functioning properly to correctly operate." It is used to allocate storage for new objects.

IoT, outages show importance of a cloud backup and recovery strategy

Grazed from TechTarget. Author: David Linthicum.

Cloud backup and recovery have long been a priority for enterprises running production workloads in the cloud. But, today, as trends like the internet of things spur massive amounts of data for organizations to store and protect, IT teams must evolve their cloud backup and recovery strategy -- and make recovery a prime concern.

"Data protection is changing in the world of cloud computing, as IoT comes into play [and] big data systems come into play," says David Linthicum, SVP of Cloud Technology Partners, a cloud consulting firm based in Boston. "We have a lot more data to protect these days."...

Are Cloud Computing Scares a Dying Trend?

Grazed from SWNS. Author: Editorial Staff.

Cloud computing is on the rise, but behind the virtual cloud there isn’t always a silver lining. In recent months the mainstream media has picked up on some serious issues with the cloud services of some major companies and organisations – and that’s caused some to question the safety of this technology.

Indeed, whether it was the ASUS incident where the company put thousands of users at risk by failing to fix a flaw in its routers or the LA Hospital security breach where malware in an email allowed hackers to lock down the system and demand a ransom, cloud services have taken a beating recently...

Verizon Cloud goes out in planned maintenance, aims for seamless updates going forward

Grazed from CloudTech. Author: James Bourne.

Over the weekend, Verizon’s cloud service, Verizon Cloud, was offline as it looked to add ‘seamless upgrade functionality as well as other customer-facing updates.’ The maintenance period was put in to improve the service and to ensure further updates went ahead without any hitches to customers.

The telco giant warned the fixes could take up to 48 hours, but the work was completed after 40, with Verizon taking the bizarre step of issuing a press release to announce it had been done. “The seamless upgrade functionality allows Verizon to conduct major system upgrades without interrupting service or limiting infrastructure capacity,” the release states. “Traditionally, updates have been made via rolling maintenance and other methods...

Cloud Computing: Hacker group hints it caused North Korean Internet crash

Grazed from LATimes. Author: Editorial Staff.

Fresh Internet outages continued to plague North Korea on Tuesday, and speculation about the cause of the rogue country's systemwide crash earlier in the day broadened to include a hacking group that hinted it was responsible. North Korea's Internet connection went down about 2 a.m. Tuesday and wasn't restored for more than 9 1/2 hours, prompting speculation that the U.S. government might have waged a cyberattack against Pyongyang in retaliation for the Nov. 24 hacking of Sony Pictures Entertainment.

The FBI has accused North Korea of committing the attack on the Los Angeles-area studios where the controversial film "The Interview" was made, portraying a fictional assassination plot against North Korean leader Kim Jong Un. North Korea's online community consists of only about 1,000 Internet Protocol addresses, estimates the Dyn research firm that evaluates Internet performance worldwide...

Read more from the source @ http://www.latimes.com/world/asia/la-fg-north-korea-internet-outages-20141223-story.html

Amazon's CloudFront content delivery network recovers from two-hour outage

Grazed from GeekWire. Author: Blair Hanley Frank.

It’s the Murphy’s Law corollary for cloud computing: anything that can go wrong will go wrong on the eve of a major holiday. Amazon’s CloudFront content delivery network is suffering from DNS problems worldwide, which means some users are having a hard time connecting to web services that count on CloudFront to deliver their content.

According to reports on Twitter, people are having problems with sites like Medium and Instagram, though it’s hard to tell just how widespread the issues are. According to the AWS service status page, Amazon is looking into the problems and is working on a fix as of 5 p.m. PST tonight. The outage began around 4:15...

Read more from the source @ http://www.geekwire.com/2014/amazons-cloudfront-hits-snag-causing-problems-across-web/

Blob Front-End Bug Bursts Microsoft Azure Cloud

Grazed from IEEE. Author: Robert N. Charette.

It being the Thanksgiving holiday week in the United States, I was tempted to write once more about the LA Unified School District’s MiSiS turkey of a project, which the LAUSD Inspector General fully addressed in a report [pdf] released last week. If you like your IT turkey burnt to a crisp, over-stuffed with project management arrogance, served with heapings of senior management incompetence, and topped off with a ladleful of lumpy gravy of technical ineptitude, you’ll feast mightily on the IG report.

However, if you are a parent of one of the more than 1,000 LAUSD students who still have not received a class schedule nearly 40 percent of the way into the academic year (or a Los Angeles taxpayer, for that matter), you may get extreme indigestion from reading it. The winner of the latest IT Hiccup of the Week award, however, goes to Microsoft for the intermittent outages that hit its Azure cloud platform last Wednesday, disrupting an untold number of customer websites along with Microsoft Office 365, Xbox Live, and other services across the United States, Europe, Japan, and Asia. The outages occurred over an 11-hour (and in some cases longer) period...

Read more from the source @ http://spectrum.ieee.org/riskfactor/computing/it/blob-frontend-bug-bites-microsoft-azure-cloud-?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+IeeeSpectrum+(IEEE+Spectrum)

Microsoft’s Azure outages: How does this affect the firm’s cloudy reputation?

Grazed from CloudTech.  Author: James Bourne.

Microsoft suffered a blow yesterday when its Azure cloud and virtual machines suffered a series of outages before later being restored.  According to Reuters, the downtime was due to interruptions in multiple centres, with a representative from the company explaining that a small section of its customer base was affected.

A cursory glance at Azure’s status history page gives a glimpse of the various outages suffered, with downtime logged on both August 18 and 19.  “Starting at 18 Aug 2014, 17:49 UTC, we are experiencing an interruption to Azure Services, may include Cloud Services, Virtual Machines Websites, Automation, Service Bus, Backup, Site Recovery, HDInsight, Mobile Services and possible other Azure Services in multiple regions,” the update read. “Customers began to experience service restoration as updates were deployed across the affected environment.”...

Worried About Losing Your Cloud? Just Get Insurance

Grazed from BoxFreeIT. Author: Sholto Macpherson.

The North American arm of financial services giant Zurich is the latest insurer to introduce a plan for companies that use cloud computing services. The property coverage for mid-market companies protects against business interruption or extra expense in the case of a cloud computing service failing.

Niche insurers have popped up targeting risks in cloud computing, including local providers in Australia. An alliance of cloud computing providers launched an insurance package to provide protection for cloud services providers in April last year, but insurers had dragged their heels on releasing plans for business customers...

How to recover after a cloud computing misstep

Grazed from ITWorld. Author: Stacey Collet.

DreamWorks Animation knows the magic of the cloud. Since 2003, the famed studio has held its product development, design and manufacturing functions in a hybrid cloud environment, long before the storage option was even called "cloud." The cloud gives the Los Angeles-based company "massive flexibility in both human and digital capital," says DreamWorks CTO Lincoln Wallen, adding that it gives "any artist access to any movie from any site, anywhere, on any project... instantly."

It also allowed DreamWorks to move from producing one movie every 18 months to three movies a year. A blockbuster solution, no doubt. But things proved trickier when cloud options were weighed for corporate and back-office functions. In 2012, DreamWorks switched email systems from Microsoft Exchange to Gmail on the Google Apps platform to create a uniform framework for its 2,600 employees, half of whom used Linux for creating animation and the other half Microsoft tools for corporate functions...

Read more from the source @ http://www.itworld.com/cloud-computing/425298/how-recover-after-cloud-computing-misstep