BlueData, provider of the leading container-based software platform for AI and
Big Data workloads, today announced its new open source Kubernetes initiative:
BlueK8s. The company also introduced the first open source project in the
BlueK8s initiative, Kubernetes Director (aka KubeDirector), for deploying and
managing distributed stateful applications with Kubernetes.
Kubernetes (aka K8s) is now the de facto standard for container
orchestration. Kubernetes adoption is accelerating for stateless applications
and microservices, and the community is beginning to evolve and mature the
capabilities required for stateful applications. But large-scale distributed
stateful applications - including analytics, data science, machine learning
(ML), and deep learning (DL) applications for AI and Big Data use cases - are
still complex and challenging to deploy with Kubernetes.
Containerizing Stateful and Stateless Applications
Typically, stateless applications are microservices or containerized
applications that have no need for long-running persistence and aren't required
to store data. Cloud native web services (such as a web server or front end web
user interface) can often be run as containerized stateless applications since
HTTP is stateless by nature: there is no dependency on the local container
storage for the workload.
Stateful applications, on the other hand, are services that save data
to storage and use that data; persistence and state are essential to running
the service. These include databases as well as complex distributed
applications for Big Data and AI use cases: e.g. multi-service environments for
large-scale data processing, data science, and machine learning that use open
source frameworks such as Hadoop, Spark, Kafka, and TensorFlow, as well as a
variety of commercial tools
for analytics, business intelligence, ETL, and visualization.
In enterprise deployments, these different tools and
applications need to interoperate in a single cohesive environment for an
end-to-end distributed data pipeline. Yet they typically have many
interdependent services, and they require persistent storage that can survive
service restarts. They have dependencies on storage and networking, and state
is distributed across multiple configuration files.
The Kubernetes ecosystem has added building blocks such as StatefulSets
- as well as open source projects including the Operator framework, Helm, Kubeflow,
Airflow, and others - that have begun to address some of the requirements for
packaging, deploying, and managing stateful applications. But there are still
significant gaps in the deployment patterns and tooling for complex distributed
stateful applications in large-scale enterprise environments.
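To illustrate the StatefulSets building block mentioned above, the following is a minimal sketch in Python that assembles a StatefulSet manifest as a plain dictionary. The field names follow the Kubernetes `apps/v1` API. The helper name `stateful_set_manifest`, the application name, the image reference, the mount path, and the storage size are illustrative assumptions and are not taken from BlueData or KubeDirector.

```python
import json


def stateful_set_manifest(name, image, replicas, storage):
    """Build a StatefulSet manifest (hypothetical helper, for illustration only)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Name of the governing headless Service; the Service itself must
            # be created separately.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Mount the per-pod persistent volume declared below.
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/data"}
                            ],
                        }
                    ]
                },
            },
            # One PersistentVolumeClaim is created per replica, so the data
            # outlives restarts of the individual pod.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


# Assumed example values; the image reference is a placeholder.
manifest = stateful_set_manifest("kafka", "example.invalid/kafka:latest", 3, "10Gi")
print(json.dumps(manifest, indent=2))
```

A manifest like this covers a single service with persistent storage. It does not by itself describe the interdependent services and configuration state of a multi-application data pipeline, which is the gap described above.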
Introducing BlueK8s and KubeDirector
BlueData is committed to addressing these gaps, to help simplify and streamline
the containerized deployment of stateful applications in the enterprise.
The new BlueK8s initiative is focused on bringing enterprise-level capabilities
and support for distributed stateful applications to the Kubernetes open source
community. BlueData recently joined the Cloud Native Computing Foundation
(CNCF) - the organization behind Kubernetes and other cloud native open source
projects - in order to foster collaboration in this area with top developers,
end users, and vendors in the Kubernetes ecosystem.
KubeDirector is the first open source BlueK8s project, designed to help
with the deployment and management of complex distributed stateful applications
on Kubernetes. KubeDirector is a custom controller built using the Kubernetes custom
resource definition (CRD) framework that:
- Leverages the native Kubernetes API extensions, design philosophy, and
  authentication
- Provides native support for preserving application configuration and state
- Prioritizes simplicity, and doesn't require decomposing applications to fit
  microservices patterns
- Utilizes an application-agnostic deployment pattern, minimizing the time to
  onboard stateful applications to Kubernetes
- Supports the deployment of large-scale distributed data pipelines consisting
  of multiple applications for analytics, business intelligence, ETL,
  visualization, data science, and ML / DL
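As a rough illustration of the custom controller pattern in general, the sketch below compares a desired state (as it might be declared in a custom resource) with an observed state and derives the actions needed to converge them. It is a simplified sketch of the general pattern, not KubeDirector's implementation: the `reconcile` function, the role names, and the action tuples are all hypothetical.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state to the desired state.

    Both arguments map a role name to a member count. This is a hypothetical,
    simplified model of what a custom resource might declare.
    """
    actions = []
    for role in sorted(set(desired) | set(observed)):
        want = desired.get(role, 0)
        have = observed.get(role, 0)
        if want > have:
            actions.append(("create", role, want - have))
        elif want < have:
            actions.append(("delete", role, have - want))
    return actions


# Assumed example: the resource asks for one master and three workers, while
# the cluster currently runs one master, one worker, and a stray gateway.
desired = {"master": 1, "worker": 3}
observed = {"master": 1, "worker": 1, "gateway": 1}
actions = reconcile(desired, observed)
print(actions)
```

A real controller would watch the Kubernetes API for changes to its custom resources and rerun a comparison like this on every change. The sketch keeps only the comparison step.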
"We're excited to introduce BlueK8s and KubeDirector to the Kubernetes open
source community," said Kumar Sreekanti, co-founder and CEO of BlueData.
"Kubernetes is an ideal solution for stateless applications, but there are
still challenges in using it for large-scale, distributed stateful
applications. We're pleased to be working with CNCF and the Kubernetes open
source community to address these challenges, and incorporate our experience
with dozens of enterprise customers deploying large-scale, distributed stateful
applications in containerized environments."
KubeDirector is currently in pre-alpha and under active development.
Look for it soon at https://github.com/bluek8s/kubedirector.