Gremlin Brings Chaos Engineering To Every Cloud Organization – Reducing System Downtime and Saving Millions

December 13, 2017 By David
Grazed from Gremlin

Gremlin helps companies build more resilient systems through a new engineering philosophy called chaos engineering. It is launching with the availability of its Gremlin tool and announcing Series A funding from Index Ventures and Amplify Partners. Starting today, any company will be able to employ chaos engineering to safely inject failure into systems in order to proactively identify and fix unknown faults – similar to an engineering flu shot.

North American businesses lose over $700 billion a year due to outages. In 2017 alone, major companies including Amazon, WhatsApp, Macys.com, and Slack have all experienced outages that impacted the bottom line and inconvenienced customers. This unreliability is due to the complexity gap in how distributed systems are built. Previously, software ran in a controlled, bare-metal environment that introduced few variables, making it possible for engineering teams to identify potential risks and failures before they occurred. Within the last decade, systems have shifted to the cloud and become distributed with microservices and serverless methodologies, which introduced new dependencies on services outside of one’s control – creating more complexity than any team of engineers can fully understand. This makes failures and outages inevitable.

"Having been an engineer at Amazon and Netflix for the past decade and on the front lines of system outages, this was a tool I built out of necessity. I was tired of the burnout from being paged at all hours of the night – there had to be a better way," said Kolton Andrus, CEO of Gremlin. "Chaos engineering is a new principle that is just starting to take hold, and I believe it is one of the most effective ways to make the internet more reliable. We have to empower engineers to safely experiment to build knowledge and more resilient systems."

Gremlin is helping companies and teams of engineers anticipate and mitigate failure before it occurs through its new tool, which simulates how a system would react when encountering challenges such as network latency and data center outages. With nearly a dozen attacks and more launching soon, Gremlin recreates the most common failures across three categories: Resource, Network, and State. The tool is equipped with state-of-the-art security, including multi-factor authentication and the principle of least privilege, as well as an undo button to drive safe, controlled experiments. Gremlin’s tool allows engineers to see how the system will behave in the face of failure, validates that defenses will work to prevent outages, minimizes the blast radius to allow for safe experimentation in production, and saves time and resources for engineering teams.
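To make the idea of a controlled experiment concrete, here is a minimal Python sketch of a network-style latency attack with a limited blast radius and an undo switch. It is an illustration only and does not use Gremlin's product or API: the `LatencyAttack` class, its parameters, and the `lookup_price` stand-in service are all hypothetical names chosen for this example.

```python
import random
import time
from dataclasses import dataclass, field


@dataclass
class LatencyAttack:
    """Hypothetical latency-injection experiment (not Gremlin's API)."""

    delay_seconds: float   # extra latency added to an affected call
    blast_radius: float    # fraction of calls affected, from 0.0 to 1.0
    active: bool = True    # set to False by halt()
    rng: random.Random = field(default_factory=random.Random)
    injected: int = 0      # number of calls that were delayed

    def halt(self) -> None:
        """Act as the 'undo button': stop injecting faults immediately."""
        self.active = False

    def wrap(self, func):
        """Return a version of func that is delayed for a fraction of calls."""
        def wrapper(*args, **kwargs):
            if self.active and self.rng.random() < self.blast_radius:
                self.injected += 1
                time.sleep(self.delay_seconds)
            return func(*args, **kwargs)
        return wrapper


def lookup_price(item: str) -> int:
    # Stand-in for a real downstream service call (assumed for this example).
    return len(item) * 100


# Delay roughly 10% of calls by 1 ms; the seed makes the run repeatable.
attack = LatencyAttack(delay_seconds=0.001, blast_radius=0.1,
                       rng=random.Random(42))
slow_lookup = attack.wrap(lookup_price)

for _ in range(200):
    slow_lookup("widget")
during = attack.injected           # calls delayed while the attack ran

attack.halt()                      # undo: the experiment stops here
for _ in range(200):
    slow_lookup("widget")
after = attack.injected - during   # calls delayed after the halt

print(f"delayed during attack: {during}, after halt: {after}")
```

Keeping `blast_radius` small means only a fraction of calls is affected, and `halt()` ends the experiment without changing or restarting the wrapped service. Those are the two properties the paragraph above associates with safe experimentation in production.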

Today, the company counts Expedia, Twilio, Confluent, and Remind among the customers using the Gremlin service.

In addition to the company and product launch, Gremlin is announcing $7.5 million in Series A funding led by Index Ventures, with participation by Amplify Partners. Combined with the previously raised Seed round, this brings the total amount raised to $8.75 million.

"In these times of being always-on and high customer expectations, you can’t afford for your business to be down even a few minutes," said Mike Volpi, General Partner at Index Ventures. "Chaos engineering is a breakthrough way to anticipate failure and build resilient systems. We’re thrilled to be partnering with Kolton, a leading pioneer of this movement, and the rest of the team to bring chaos engineering to every cloud-based company."

Gremlin is a subscription-based service, priced per instance or per service. You can get started with Gremlin here.