Virtualization Technology News and Information
VMblog's Expert Interviews: SIOS Talks High Availability Options for Microsoft SQL Server in the Google Cloud


SIOS Technology is a provider of software products that help IT ensure the performance, efficiency and high availability protection of business-critical applications.  VMblog recently sat down with Microsoft Cloud and Datacenter Management MVP and SIOS Technical Evangelist, Dave Bermingham, to discuss high availability options for Microsoft SQL Server in the Google Cloud. 

VMblog:  While the cloud continues to grow in popularity for enterprise applications, IT departments remain reluctant to trust public clouds, including the Google Cloud Platform, for business-critical Microsoft SQL Server applications.  Why is this the case?

Dave Bermingham:  When it comes to cloud adoption, and trust in the public cloud, the reluctance comes down to data security and availability. When storing data in the cloud, how can you be sure it is secure? How do you prevent rogue employees of your cloud provider from accessing your data? Is the network secure at the edge? And the most important question of all, how are they keeping the systems up and running to ensure my applications are online?

The question of security can be addressed. Encryption in transit and encryption at rest are standard features of the major public cloud providers at this time. Add in the ability to manage your own private encryption keys, and you can safeguard your data against a rogue cloud employee gaining access. Network security is as good as or better than anything available on-prem, but just like on-prem, security best practices must be followed to ensure a secure network.

Availability, however, is a bit of a black box. Although Google and others will give you a 99.95% SLA, the fact is if they miss their SLA, you get a refund of some portion of your monthly bill. Meanwhile, your business was offline and potential losses far exceed the refund check. Of course, the cloud providers generally have far greater resources than the typical data center, but they also service many more customers and tend to be much more agile in data center updates in order to keep up with the growing capacity demand and requests for new features. This in general could leave you susceptible to more outages compared to a data center that is purpose-built to deliver just the services you need, nothing more, nothing less.
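
To put the 99.95% figure in perspective, the short Python sketch below converts an availability percentage into the downtime it permits. The 30-day month and the function name are assumptions made for illustration; the actual measurement period and exclusions are defined by each provider's SLA terms.

```python
def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Return the minutes of downtime permitted over `days` at the given availability."""
    total_minutes = days * 24 * 60  # assumption: a 30-day month by default
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.95, 99.99):
        print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes per 30-day month")
```

At 99.95%, that works out to roughly 21.6 minutes of permitted downtime in a 30-day month, and the remedy for exceeding it is a partial refund, not a restored business day.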

VMblog:  Isn't the Google cloud already highly available?

Bermingham:  When talking about Infrastructure as a Service (IaaS), a common misconception is that you can just turn on a virtual instance and it is already highly available and qualifies for the 99.95% SLA. If you read the fine print a little further, you will discover that the actual SLA only applies if you have two or more instances deployed across "zones". And, the guarantee is described as "loss of external connectivity or persistent disk access for all running instances, when Instances are placed across two or more zones in the same region." This SLA is pretty standard across all the major public cloud providers. Essentially what this means is if you have two or more virtual instances in different zones, Google will guarantee at least one of these instances will have "dial tone."

What it doesn't guarantee, though, is that the services running in those instances will be available. For transactional databases such as SQL Server, or any application that writes persistent data to disk, you also have to consider how that data will be persisted across all your instances such that a failure of one instance does not result in data loss. And finally, how do other applications that rely on SQL Server respond to a failure of a SQL Server instance? Are they automatically redirected to an alternate instance in a different zone? Just like on-prem, the answer to high availability for SQL Server in the cloud is to deploy SQL Server Always On Availability Groups or SQL Server Failover Cluster Instances. In this configuration you provision each cluster node in a different zone, or even in a different cloud region.

VMblog:  What are the differences between SQL Server's Always On Availability Groups and Always On Failover Clustering?

Bermingham:  SQL Server Always On Availability Groups was introduced with SQL Server 2012 Enterprise Edition. Basically, I think of it as the next generation of SQL Database Mirroring.  It includes the ability to group multiple databases into an Availability Group to ensure they all fail over together. It also leverages the quorum model and the client access point of failover clustering to help with failover logic and client redirection. It only replicates user-defined databases, so system databases that include user accounts, Agent Jobs, etc. must be managed separately on each instance. As a newer technology, it is not yet supported by all applications. Be sure to verify your applications are compatible with Always On Availability Groups before choosing this type of high availability for SQL Server.
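
The quorum model mentioned above can be illustrated with a simplified majority-vote check. This is a sketch of the general idea only, assuming one vote per node plus an optional witness vote; it is not how Windows Server Failover Clustering is implemented, and the function name is hypothetical.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """A partition keeps quorum only if it holds a strict majority of all votes."""
    if total_votes <= 0 or not 0 <= votes_online <= total_votes:
        raise ValueError("vote counts out of range")
    return votes_online > total_votes // 2


if __name__ == "__main__":
    # Assumed layout: two cluster nodes plus one witness, three votes in total.
    print(has_quorum(votes_online=2, total_votes=3))  # one node lost
    print(has_quorum(votes_online=1, total_votes=3))  # isolated node
```

With two nodes and a witness, losing any single vote still leaves a majority, while an isolated node holding one of three votes does not keep quorum and should not bring the databases online.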

SQL Server Failover Clustering has been the de facto standard for SQL Server high availability since the very beginning of SQL Server. It approaches high availability at a SQL instance level, rather than a database level. That means that everything, including the system databases, is protected. Because it has been around forever, all applications and SQL features are going to work seamlessly with clustered instances. Another benefit of SQL Server Failover Clustering is that it is included with SQL Server Standard Edition, which can be much more cost effective than using Always On Availability Groups, which require SQL Server Enterprise Edition.

VMblog:  Are there any other gotchas when planning for high availability in the Google Cloud?

Bermingham:  The virtual network in the Google Cloud does not support gratuitous ARP, so connecting directly to the cluster IP address does not work. Instead, a load balancer must be configured in the cloud which redirects client connections to the active cluster node. The same requirement holds true for clusters built in Azure as well. After the initial configuration, the user experience is what the end user is accustomed to with on-premises high availability configurations.
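
A load balancer of this kind typically decides where to send traffic by probing each node, with only the node that currently owns the clustered SQL Server instance expected to answer. The sketch below models that selection logic in Python. The node names and helper names are hypothetical, and in practice the probe is configured in the cloud load balancer itself, not written by hand.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_active_node(nodes: Iterable[str], probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first node that answers the probe, or None if none do."""
    for node in nodes:
        if probe(node):
            return node
    return None


# Example with a stand-in probe: pretend only "sql-node-2" (hypothetical name) answers.
active = find_active_node(["sql-node-1", "sql-node-2"], lambda n: n == "sql-node-2")
print("route clients to:", active)
```

Passing the probe in as a function keeps the selection logic testable without a network; `tcp_probe` could be supplied in its place once a probe port has been chosen for the cluster.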

In addition to the network, if you want to build a SQL Server Failover Cluster Instance you must address the lack of a SAN in the cloud. This is done through SANless clustering software solutions like SIOS DataKeeper Cluster Edition.

VMblog:  How can SANless cluster software be used on the Google Cloud Platform to afford mission-critical high availability?

Bermingham:  One of the requirements of a traditional SQL Server Failover Cluster is a SAN, which holds all the clustered databases. Failover clustering controls access to the SAN through SCSI reservations, ensuring only the active node of the cluster has access to the data. Like other cloud providers, Google does not provide a SAN that can be used in a failover cluster. Instead, a software-based replication and cluster integration solution must be used to keep the data in sync across cluster nodes and present the local storage to Failover Clustering to take the place of the typical Shared Disk resource.
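
The idea of software-based replication standing in for shared storage can be shown with a toy model: every write to the active node's volume is mirrored to the standby's volume before it is acknowledged, so either copy can serve the data after a failover. This is an illustrative sketch only, assuming synchronous mirroring and in-memory byte arrays in place of disks. The class and method names are hypothetical and do not represent any vendor's implementation.

```python
class MirroredVolume:
    """Toy model of a local volume whose writes are mirrored to a replica."""

    def __init__(self, size: int, block_size: int = 512) -> None:
        self.block_size = block_size
        self.source = bytearray(size)  # storage local to the active node
        self.target = bytearray(size)  # storage local to the standby node

    def write_block(self, index: int, data: bytes) -> None:
        """Write one block locally, then mirror it before returning (synchronous)."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        start = index * self.block_size
        if start < 0 or start + self.block_size > len(self.source):
            raise IndexError("block index out of range")
        end = start + self.block_size
        self.source[start:end] = data
        self.target[start:end] = data  # a real system would send this over the network

    def in_sync(self) -> bool:
        """Report whether both copies hold identical data."""
        return self.source == self.target
```

Because the write only returns after both copies are updated, the standby node always holds the same blocks as the active node, which is what lets local disks take the place of the Shared Disk resource in this simplified picture.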

VMblog:  What other advantages do SANless cluster configurations provide in the Google Cloud?

Bermingham:  SANless clustering allows users to take full advantage of what the cloud has to offer. As I mentioned earlier, the cloud in general has far greater resources than the typical data center. Google currently has 15 different geographic locations throughout the world with another 4 locations slated to come online soon. Each location is connected with private fiber, allowing for fast, reliable connectivity across the globe. SANless clustering unlocks the potential of those data centers by allowing you to build clusters that not only span zones within a geographic region, but also include nodes in different geographic regions, protecting you from outages that may impact a single zone or an entire region.

However, many organizations are considering that the "cloud" itself is a single point of failure. What if Google, Azure, AWS or any other cloud provider is having a REALLY bad day and its cloud services are offline across multiple regions? If your only recourse is to call tech support and complain until it comes back online, you probably should start dusting off your resume. SANless clustering solutions like SIOS DataKeeper allow you to not only replicate data between regions in the cloud, but also support replication between multiple cloud providers or hybrid cloud configurations where you keep a copy of your data on-prem or in a different cloud provider for recovery in these worst-case scenarios.

VMblog:  Give readers some details about SIOS DataKeeper Cluster Edition and why it's unique.

Bermingham:  SIOS DataKeeper Cluster Edition is a host-based, block-level replication solution that integrates with failover clustering to allow you to build SANless clusters. It works with any version of Windows and any version of SQL Server. SIOS has been providing high availability for business-critical applications on Windows and Linux since 1999. SIOS is unique in that it provides high availability and disaster recovery solutions for SQL Server running on physical, virtual and cloud instances, and it also has availability solutions for SQL Server running on Linux. So regardless of your OS, version of SQL Server and preferred platform, SIOS has you covered for your high availability needs for not only SQL Server, but any other business-critical application.

VMblog:  Where's the best place for readers to go to find out more information?

Bermingham:  For more information about SIOS DataKeeper Cluster Edition, visit here.


Published Thursday, April 12, 2018 7:40 AM by David Marshall