Industry executives and experts share their predictions for 2021. Read them in this 13th annual VMblog.com series exclusive.
Off-prem leads changes for open source databases
By Peter Zaitsev, CEO and co-founder, Percona
Databases are a big success story for open
source software. DB-Engines lists 185 open source database
management systems. While popularity is split roughly 50-50 between open source
and proprietary databases, DB-Engines also publishes a graph that charts the steady
increase in open source database popularity since 2013 and shows where that line
intersects with proprietary databases, which are on a steady downward arc. With the
adoption of open source databases continuing to rise, what should we expect in
2021?
Prediction #1 - DBaaS will continue to grow and open source will disrupt
My first prediction is that we will see
continued Database as a Service (DBaaS) growth in 2021.
DBaaS enables users to set up, deploy, and
maintain databases through a UI that abstracts away many of the intricacies of
a specific database's implementation. The as-a-service approach has made
dealing with databases more convenient for developers and it has seen rapid
adoption. However, an open source DBaaS doesn't really exist: most offerings on
the market, whether private or public, are proprietary or commercial
versions of open source data management systems, such as MongoDB's
Atlas.
The DBMS world is ripe for open source DBaaS
disruption.
It's important to recognize that there is a pattern
with open source emergence: it's often not the first mover. Linux was not the
first web server operating system and Android was not the first phone OS, but
open source now dominates in these spaces. Users do not like lock-in but, at
the same time, they are drawn to convenience. If open source can provide 80% of
that convenience and create a good balance, it is going to acquire a
significant portion of the market.
Next year, we'll see new open source solutions
beginning to emerge from a number of vendors, but the move into the DBaaS space
will be slow and is likely to take longer than 12 months.
Prediction #2 - More databases able to run on Kubernetes
Data will be the big story in Kubernetes in
2021. Next year, we will see more cloud-native and multi-cloud databases, and
Kubernetes will gain better support for scaling them. Classic
databases, such as MySQL, MariaDB, and Postgres, have been challenging to scale
out rather than scale up, and we will see more options for doing so. This is
something you can already see emerging in MySQL and Postgres, but these
developments are not yet as mature or as compatible as they need to be for
large-scale production deployments.
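To make the scale-out versus scale-up distinction concrete, here is a minimal sketch of hash-based sharding, one common way to spread rows across several database nodes instead of buying a bigger single server. The node names and both functions are hypothetical illustrations; this is not how any particular MySQL or Postgres tool routes queries.

```python
import hashlib
from typing import List

# Hypothetical node names, used only for illustration.
SHARDS: List[str] = ["db-node-0", "db-node-1", "db-node-2"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then wrap into the shard range.
    return int.from_bytes(digest[:8], "big") % shard_count


def node_for(key: str) -> str:
    """Return the node that would hold the row for this key."""
    return SHARDS[shard_for(key, len(SHARDS))]


if __name__ == "__main__":
    for customer_id in ("customer:1", "customer:2", "customer:3"):
        print(customer_id, "->", node_for(customer_id))
```

The sketch also hints at why scaling out is hard for classic databases: changing the number of shards changes where most keys land, so real systems need rebalancing, cross-shard queries, and transactions on top of simple routing.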
Prediction #3 - Innovations in languages and data models
While SQL remains the de facto query language
for relational databases, which is the dominant type of DBMS, we are seeing an
increasing focus on new approaches to languages and data models for databases.
We will see new languages built on the back of
SQL. A good example of this kind of evolution is N1QL (pronounced 'nickel')
from Couchbase, which has sought to make querying distributed
document-oriented databases easier. It's a language based on SQL, but
designed for manipulating JSON documents.
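As a rough illustration of what SQL-style querying over JSON documents means, the sketch below applies a SELECT-and-WHERE pattern to nested documents in plain Python. It is not N1QL and does not use any Couchbase API; the `get_path` and `select` helpers and the sample data are hypothetical, chosen only to show the idea of projecting and filtering on nested fields.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


def get_path(doc: Document, path: str) -> Any:
    """Follow a dotted path such as 'address.city'; return None if it is missing."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def select(
    docs: List[Document],
    fields: List[str],
    where: Callable[[Document], bool],
) -> List[Document]:
    """Keep documents matching `where`, projecting only the requested fields."""
    return [
        {field: get_path(doc, field) for field in fields}
        for doc in docs
        if where(doc)
    ]


# Hypothetical sample documents with differing shapes, as JSON stores allow.
customers: List[Document] = [
    {"name": "Ada", "address": {"city": "London"}, "orders": [12, 40]},
    {"name": "Linus", "address": {"city": "Helsinki"}, "orders": []},
    {"name": "Grace", "address": {"city": "London"}},
]

# Roughly: SELECT name, address.city ... WHERE address.city = "London"
london = select(
    customers,
    ["name", "address.city"],
    lambda doc: get_path(doc, "address.city") == "London",
)
print(london)
```

The point of the comparison is that documents need not share a schema: a missing field simply yields no value rather than an error, which is the kind of behavior a SQL-like language for JSON has to define.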
We will also see entirely new query languages.
PromQL is a good example of this approach. Prometheus is a popular open
source monitoring system built around a time-series database, and PromQL was
designed from scratch to create a clearer language that caters for the specific
needs of time-series database management. Even though it has taken users time to
adjust to it, PromQL is gaining a lot of traction. This shows that organizations,
and DBAs themselves, are willing to invest in new skills as they seek to analyze
and gain insight from their large volumes of timestamped data.
Where there is demand for languages that cater
to specific domains, we will also see these appearing. This is something we
have already seen from the likes of InfluxDB, which developed a domain-specific
query language for its open source time-series database.
Another growing approach is meeting developers
where they are. Database vendors have recognized that, for instance, there are
over 11 million developers using JavaScript (Source: SlashData). As vendors build
a user base, they need to make it more convenient for developers to connect their
apps to data stores. This is why we will continue to see API frameworks for specific
databases, which enable developers to choose their preferred API, such as JSON
over REST.
Essentially, we expect to see experimentation
continue in both data models and languages in 2021, as vendors try to appeal to
users that require easier data management solutions for their apps, as well as
users that are seeking more powerful ways to query their data.
Prediction #4 - Innovations in analytics databases
Until recently, we hadn't seen much innovation
in analytically focused databases, and specifically in column stores, but
that's rapidly changing. We now have ClickHouse emerging along with QuestDB and
DuckDB, all of which originated in the open source space. We expect to see
significant growth for those technologies next year as companies demand faster
ways, often in real time, to analyze their massive data lakes. They aren't
direct competitors to the much-hyped Snowflake, as Snowflake offers a far
broader solution, but as database engines they will have a strong impact.
Already, we are hearing from organizations that are looking to use ClickHouse,
for example, to achieve much better performance than something like Amazon
Redshift, which is a proprietary competitor.
An exciting year with two camps emerging in open source
As our predictions suggest, the open source
database industry is going through an exciting period as we tackle the
challenge of data management in the cloud and on platforms such as Kubernetes.
As a consequence of these changes, the open source movement is also under
considerable strain and, in many ways, appears to be cracking and breaking in
the middle. One camp sees AWS's use of open source
software to develop cloud-based databases as bad for open source and their own
businesses. Their response has often been to adopt licensing that is not
fully open source compliant in order to protect themselves from Amazon's
disruption.
There is another camp that acknowledges the
cloud disruption, but believes that open source is about embracing competition,
not restricting it. Kubernetes itself is a successful example of this. It's an
open source community governed by multiple companies which all work on making
Kubernetes better, but also bitterly compete by productizing Kubernetes
distributions - open source or not.
If companies choose to make their software
open source, they need to embrace the benefits and consequences of that choice.
Among other things, this means embracing competition. Ultimately, it will be
those open source companies that provide true value to their customers that
will succeed in the long term.
##
About the Author
Peter Zaitsev co-founded Percona and
assumed the role of CEO in 2006, growing the company from two people to over 200
professionals serving over 3,000 customers.