Virtualization Technology News and Information
Robin Systems 2017 Predictions: Containers Will Rule Data-Driven Applications

VMblog Predictions 2017

Virtualization and Cloud executives share their predictions for 2017.  Read them in this 9th annual series exclusive.

Contributed by Adeesh Fulay, Director of Products, Robin Systems

Containers Will Rule Data-Driven Applications

2017 will usher in a new world of enterprises using containers to manage data-driven applications including Cassandra, Hadoop and NoSQL. Container-based virtualization and microservice architecture will take the world by storm.

Applications built on a microservice architecture consist of a set of narrowly focused, independently deployable services, any of which is generally expected to fail at some point. The upside is increased agility and resilience. Agility is noteworthy because individual services can be updated and redeployed in isolation.

Because microservices are distributed by nature, they can be deployed across different platforms and infrastructures, and developers are forced to think about resilience from the ground up instead of as an afterthought. These are the defining principles for large web-scale and distributed applications, and web companies such as Netflix, Twitter, Amazon and Google have benefited significantly from this model.

Add Containers to the Mix

Containers are fast to deploy, allow bundling of all dependencies required for the application, and are portable. This means you can write your application once and deploy it anywhere. Microservice architecture and containers combine to make applications that are faster to build and easier to maintain, while having overall higher quality.

The majority of container ecosystem vendors have focused mostly on stateless applications. Why? Stateless applications are easy to deploy and manage. For example, they can respond to events by adding or removing instances of a service without needing to significantly change or reconfigure the application. For stateful applications, the picture is different. Most container ecosystem vendors have focused on orchestration, which only solves the problems of deployment and scale, while existing storage vendors have tried to retrofit their current solutions for containers via volume plug-ins. Unfortunately, this is not sufficient.
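To make the stateless case concrete, here is a minimal sketch of the kind of scaling decision an orchestrator can make when instances are interchangeable. The function name, the per-instance capacity and the instance limits are all hypothetical values chosen for illustration, not taken from any particular product.

```python
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Return how many identical stateless instances the current load needs.

    Hypothetical helper: because no instance holds data, the answer depends
    only on load and per-instance capacity, not on where any data lives.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    print(desired_instances(450, 100))  # 5
    print(desired_instances(30, 100))   # 1
```

A stateful service cannot be scaled by a rule this simple, because adding or removing an instance also means moving or rebalancing the data it owns.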

Let's take the example of Cassandra, a modern NoSQL database, and look at the scope of management challenges that need to be addressed.

Cassandra Management Challenges

Poor schema design and query performance are the most prevalent problems, but they are application and use case specific, and require an experienced database administrator to resolve. I would bet most Cassandra admins, or any DBA for that matter, enjoy this task and pride themselves on being good at it. Let's take a look at the top four management tasks that database admins would want to have automated and thereby avoid.

1.    Low Utilization and Lack of Consolidation

Cassandra clusters are typically created per use case. In fact, the common practice is to give each team its own cluster. This would be an acceptable practice if clusters weren't deployed on dedicated physical servers. To avoid performance issues, most enterprises stay away from virtual machines. This means that the underlying hardware for each cluster has to be sized for its peak workload, and because load profiles vary, large amounts of spare capacity and idle hardware are left over.
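The effect of sizing every cluster for its own peak can be illustrated with a little arithmetic. The sketch below uses made-up load figures for three hypothetical teams whose peaks fall at different times. It deliberately ignores performance isolation between workloads, which is the reason dedicated hardware is chosen in the first place.

```python
# Hypothetical load samples (arbitrary units) for three teams,
# taken at the same four points in time.
team_loads = {
    "analytics": [10, 10, 80, 20],
    "checkout":  [60, 20, 10, 10],
    "search":    [20, 70, 20, 30],
}


def combined_load(loads):
    """Sum the load of all teams at each point in time."""
    return [sum(step) for step in zip(*loads.values())]


def dedicated_capacity(loads):
    """Capacity needed when each cluster is sized for its own peak."""
    return sum(max(series) for series in loads.values())


def consolidated_capacity(loads):
    """Capacity needed when one shared pool is sized for the combined peak."""
    return max(combined_load(loads))


def average_utilization(loads, capacity):
    """Average combined load as a fraction of the provisioned capacity."""
    total = combined_load(loads)
    return sum(total) / len(total) / capacity


if __name__ == "__main__":
    dedicated = dedicated_capacity(team_loads)        # 80 + 60 + 70 = 210
    consolidated = consolidated_capacity(team_loads)  # max(90, 100, 110, 60) = 110
    print(f"dedicated:    {average_utilization(team_loads, dedicated):.0%}")     # 43%
    print(f"consolidated: {average_utilization(team_loads, consolidated):.0%}")  # 82%
```

With these illustrative numbers, the same work fits in roughly half the capacity once the peaks are allowed to share hardware.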

2.    Complex Cluster Lifecycle Management

Given the need for physical infrastructure (compute, network and storage), provisioning Cassandra clusters on premises can be time consuming. The challenge here is estimating the read and write performance that the designed configuration will deliver, which requires extensive testing and experimentation.

The other key planning exercise is for node failures. Failures are the norm, and have to be planned for from the get-go. While Cassandra is designed to withstand temporary node failures, each failure still has to be resolved by adding a replacement node, and it places additional load on the remaining nodes for data rebalancing - once after the failure and again after the new node is added.
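Two back-of-the-envelope calculations help with this planning. The sketch below uses the standard Cassandra quorum rule (a majority of replicas, i.e. replication factor divided by two, rounded down, plus one) and a deliberately simplified model in which data and traffic are spread evenly across nodes. The function names and the six-node example are assumptions for illustration only.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


def load_increase_after_failures(total_nodes: int, failed_nodes: int) -> float:
    """Fractional extra load per surviving node, assuming an even spread.

    Simplified model: it ignores the additional streaming work that
    rebalancing itself generates.
    """
    surviving = total_nodes - failed_nodes
    if surviving <= 0:
        raise ValueError("at least one node must survive")
    return total_nodes / surviving - 1


if __name__ == "__main__":
    print(quorum(3))                      # 2
    print(tolerated_replica_failures(3))  # 1
    print(f"{load_increase_after_failures(6, 1):.0%}")  # 20%
```

In this model, losing one node of six pushes about 20% more work onto each survivor before any rebalancing traffic is counted, which is headroom the hardware has to have in advance.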

3.    Manual Data Management

Unlike traditional databases such as Oracle, DB2 and SQL Server, newer databases are still quite immature in their tooling. For example, Cassandra does not come with utilities that automatically back up the database. It only offers backup in the form of snapshots and incremental copies, which are quite limited in features and have to be taken on each node separately. Similarly, data recovery is fairly involved, and requires manual steps on every node of the cluster.
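As an illustration of the per-node nature of this work, the sketch below builds the `nodetool snapshot` command that would have to be run against each node of a cluster. The host names, keyspace and tag are hypothetical, the script only prints the commands rather than executing them, and it assumes `nodetool` can reach each node remotely via the `-h` option.

```python
import shlex


def snapshot_commands(hosts, keyspace, tag):
    """Build one `nodetool snapshot` invocation per node.

    Snapshots are local to a node, so a cluster-wide backup means
    repeating the command for every host.
    """
    return [
        ["nodetool", "-h", host, "snapshot", "-t", tag, keyspace]
        for host in hosts
    ]


if __name__ == "__main__":
    # Hypothetical cluster and keyspace names.
    hosts = ["cass-node-1", "cass-node-2", "cass-node-3"]
    for command in snapshot_commands(hosts, "orders", "nightly"):
        print(shlex.join(command))
```

Even with a script like this, copying the snapshot files off each node, pruning old snapshots and coordinating a restore remain separate manual steps.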

The other challenge with large databases is cloning them for dev and test use. Traditional techniques are time consuming and require significant amounts of storage - thus restricting agility, which is increasingly becoming one of the key requirements for DevOps.

4.    Costly Scaling

With Cassandra's ability to scale linearly, most administrators are quite accustomed to adding nodes (scaling out) to expand the size of clusters. With each node you gain additional processing power and data capacity. But while node addition is good for steady increases in load, how does one deal with transient spikes? Administrators need the ability to simply add and remove resources dynamically and in real time to deal with temporary load variations.

Many enterprises have experimented with Docker containers and open source orchestrators such as Mesos and Kubernetes. They soon discover that these tools, along with their basic storage support in volume plugins, only solve the problem of deployment and scale. They do not address the challenges of container failover, data and performance management, or handling transient workloads.

Robin is a container-based, application-centric server and storage virtualization software platform that turns commodity hardware into a high-performance, elastic, and agile application/database consolidation platform. Robin is designed to cater not just to stateless applications, but also to performance and data-centric applications such as databases and Big Data clusters. Robin dramatically simplifies application and data lifecycle management with features such as one-click database deploy, snapshot, clone, time travel, dynamic IOPS control, and performance guarantees.


About the Author

Adeesh Fulay is Director of Products at Robin Systems. Prior to this, he was a Product Line Manager and a Sr. Engineering Manager at VMware (IaaS & Software Solutions). He has also held director-level positions at Oracle America, where he introduced the data cloning feature 'Snap Clone', which enables creating space-efficient copies of terabyte-size databases in a matter of minutes. Prior to Oracle, he was a Senior Professional Services Consultant at mValent Inc (acquired by Oracle America).

Published Wednesday, November 02, 2016 7:05 AM by David Marshall