PayGo Ensures High Availability of SQL Server in the AWS Cloud with SIOS DataKeeper
October 23, 2019 - SIOS Technology Corp., the industry pioneer in providing IT Resilience through intelligent application availability, today announced that PayGo is using SIOS DataKeeper on Amazon Web Services (AWS) utilizing Elastic Compute Cloud (EC2) virtual servers with solid-state drive (SSD)-only storage for rapid, automatic failover needed to ensure high availability (HA) for the company’s mission-critical SQL Server applications.
PayGo is an integrated utility payment solution provider that manages the largest energy company prepay programs in the United States. PayGo currently runs four production environments in AWS, with another coming online soon. Each runs SQL Server 2017 Standard Edition on Windows Server 2012 R2, and the company plans to migrate to Windows Server 2019 after testing is completed.
The Challenge
“Our backend SQL Servers hold terabytes of data that must be available 24×7,” explained Chad Gates, senior director of infrastructure and security, PayGo. “As a Windows shop, we prefer to use Windows Server Failover Clustering (WSFC) for data protection and continuous operation in case of any failures. But WSFC requires some form of shared storage, like a storage area network (SAN), and that isn’t natively available in AWS.”
With AWS’s lack of shared storage, PayGo was forced to use SQL Server’s transaction logging and log shipping to protect the data. Although requiring manual intervention, this approach was acceptable for disaster recovery (DR) purposes. But it could not provide the rapid, automatic failover capability needed to ensure high availability (HA) for the company’s mission-critical applications.
“We had another option, but we believed there were more cost-effective solutions,” according to Chad. “We could use the Always On Availability Groups feature in SQL Server Enterprise Edition, but that would cost us hundreds of thousands of dollars that could be spent on other mission-critical initiatives. We felt there must be a better solution, so we started looking for other options.”
The Evaluation
In its search for a capable and cost-effective HA solution, PayGo established four criteria: seamless integration with Windows Server Failover Clustering; high disk throughput performance to satisfy demanding recovery point and recovery time objectives; ease of implementation and dependable ongoing operation; and responsive technical support from the vendor.
After receiving a recommendation to look at SIOS, Chad concluded, “SIOS DataKeeper Cluster Edition overcame the problem caused by the lack of shared storage. Its use of a mirrored drive looks like shared storage to the WSFC. It was exactly what we wanted.” SIOS DataKeeper also met PayGo’s other three criteria better than any other solution considered.
The Solution
PayGo first installed SIOS DataKeeper SANless Clustering software in its own private cloud, and later migrated the configuration to AWS. “Because SIOS DataKeeper supports private, public and hybrid cloud environments, we migrated the entire configuration, including all application software and data, easily and without any issues,” Chad recalled. PayGo currently has two SQL Server nodes in each of its four SANless HA clusters. To provide protection against localized failures, the servers are deployed in separate Availability Zones. And to ensure high transactional throughput performance, each server has two network interfaces, with one dedicated to SIOS data replication. The SANless clusters employ synchronous data replication through the sub-millisecond latency connectivity AWS delivers between Availability Zones.
The Results
SIOS DataKeeper met and exceeded PayGo’s high expectations for a high availability solution, including ease of installation and operation, and responsive support. “We have been using SIOS DataKeeper for several years now, and it has proven to be the most rock-solid piece of software we have,” Chad said.
Given its proven operation, including during actual failures, the IT team has minimized the ongoing testing needed for its production SANless clusters. The clusters are now tested only after changes are made to any of the hardware or software, which are scheduled on a monthly basis, and the test itself consists of a simple failover and failback. PayGo also upgrades only one node at a time in each cluster to simplify roll-back, if needed. With SIOS DataKeeper performing so well, the only reason PayGo now has for upgrading to SQL Server Enterprise Edition would be outgrowing the Standard Edition’s database size limitation.
Looking Towards the Future
The IT team at PayGo is currently considering adding DR protection to the HA clusters by deploying a third node in a separate AWS region. Over the distance involved in this case (between datacenters in Virginia and Ohio), latency is 12-13 ms. While that requires asynchronous replication to ensure high throughput performance in the active node, the combined HA/DR solution would recover much more quickly than is possible with log shipping.
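The reasoning behind choosing asynchronous replication at this distance can be illustrated with a rough calculation. The sketch below is a simplified model and not a description of how SIOS DataKeeper works internally. It assumes that under synchronous replication every write waits for one remote acknowledgement, and that writes on a single stream are issued back to back. The function name and the 0.5 ms figure standing in for "sub-millisecond" are illustrative assumptions; the 12 ms and 13 ms figures come from the text above.

```python
def max_serialized_writes_per_sec(added_latency_ms: float) -> float:
    """Upper bound on back-to-back writes per second on one stream when each
    write waits added_latency_ms for a remote acknowledgement.

    Simplified model: ignores local disk time, batching, and parallel streams.
    """
    if added_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / added_latency_ms


# Assumed 0.5 ms for the link between Availability Zones; 12-13 ms is the
# inter-region latency quoted in the text.
scenarios = {
    "between Availability Zones (assumed 0.5 ms)": 0.5,
    "between regions, low end (12 ms)": 12.0,
    "between regions, high end (13 ms)": 13.0,
}

for name, latency_ms in scenarios.items():
    ceiling = max_serialized_writes_per_sec(latency_ms)
    print(f"{name}: at most {ceiling:.0f} serialized writes per second")
```

Under these assumptions the per-stream ceiling falls from thousands of writes per second to well under a hundred, which is consistent with the choice to replicate asynchronously to the remote region so that the active node does not wait on the long link.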
“Whether you need to protect applications on a physical server, a private cloud, a public cloud or a hybrid cloud, you need to meet the same SLAs for application availability regardless of location. Applications running in clouds also need to be protected against the inevitable cloud outage through the use of availability zones and regions with automated intelligent failover,” said Frank Jablonski, VP of global marketing, SIOS Technology. “PayGo is using SIOS to provide a fast, easy way to deploy applications in a high availability environment in the AWS cloud while continuing to use Windows Server Failover Clustering.”