Virtualization Technology News and Information
The New Kubernetes Data Plane

By Zain Asgar, GVP & Product GM, New Relic and CNCF Governing Board (GB) member

Kubernetes has seen remarkable adoption over the last few years. Even for those of us who were early fans of the project, the pace and penetration of Kubernetes adoption exceeded expectations. Kubernetes provides incredibly powerful tooling for running workloads. Its configurability, extensibility, and expressiveness give us more power than ever to structure, optimize, and scale our applications.

As far as the industry has come in such a short time, we still haven't unlocked much of the power that Kubernetes offers. Now that the world has standardized on Kubernetes as the cloud-native control plane, the next phase of innovation will be on the data plane. Let's dive into what these terms mean and how the Kubernetes data plane unlocks new potential in observability, security, analytics, API testing, and more.

Control plane vs Data plane

Kubernetes acts as the control plane for today's cloud native applications. This means that Kubernetes makes the control decisions: which applications to run, where to run them, how to allocate resources, and when to rebalance. The control plane is responsible for operational decisions about how an application runs, such as which pod to route traffic to. It doesn't concern itself with data-level information, such as the exact message in a particular request.

The data plane, on the other hand, provides rich information about the data flow through the application. It also provides measurements about what is happening in the system, such as golden signals, application metrics, and resource utilization.

The control plane and the data plane have separate responsibilities, but they work best together. The control plane should use information from the data plane to automatically make better decisions about what to do.

Today, there are multiple great projects fulfilling the role of Kubernetes data plane, although they have different points of emphasis. Consolidation around a de facto stack is still emerging. Let's dive into the state of the data plane today and how it will impact everyone running Kubernetes applications (which is to say, almost everyone).

The data plane makes Kubernetes even more extensible

Kubernetes is remarkably flexible and was built to be an extensible system. Specifically, developers can explicitly specify the rules for the control plane in an API-driven way. This extensibility improves use cases like autoscaling, rollbacks, health checks, and alerting, because you can customize them to the exact needs of your application. However, in order to tell Kubernetes exactly what to do, you have to either instruct it each time, or give it a rule to follow.

This can be a hardcoded rule, but it's better if it is informed by what your application is actually doing. For example, imagine you want to autoscale your application based on load. You can hardcode a rule to autoscale at a particular time of day, based on the idea that more traffic comes at that time. But what if you got a surge in traffic at an unexpected hour? It would be even better to tell Kubernetes that the application should scale up in proportion to the number of requests it is serving.
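As a rough illustration of scaling "in proportion to the number of requests", the sketch below applies the proportional formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas * (current metric / target metric)). The function name, the requests-per-second metric, and the replica bounds are assumptions made for this example, and the tolerance band and stabilization behavior of the real autoscaler are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_rps_per_pod: float,
    target_rps_per_pod: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Return a replica count proportional to observed load.

    Hypothetical helper: the parameter names and default bounds are
    assumptions for illustration, not part of any Kubernetes API.
    """
    if target_rps_per_pod <= 0:
        raise ValueError("target_rps_per_pod must be positive")
    # Proportional rule: scale the current replica count by the ratio of
    # observed load to target load, rounding up.
    ratio = current_rps_per_pod / target_rps_per_pod
    raw = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))
```

With 4 replicas each serving 150 requests per second against a target of 100, the rule asks for 6 replicas, whatever the time of day. The per-pod request rate is exactly the kind of measurement the data plane would supply.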

This is exactly the kind of use case that the Kubernetes data plane supports. When the Kubernetes control plane can use information from the data plane about what an application is doing and how it is running, it can make more informed decisions about what to do. This leads to better applications and less manual intervention from developers.

The Pixie open source project, which is a rising data plane for Kubernetes, surfaces rich, raw telemetry information like application requests and network traffic. Pixie can be used as an input to Kubernetes and other cloud native tools like Argo to support powerful automation use cases. For example, Pixie can be used to power intelligent rollback logic by exposing and analyzing the exact type of errors an application is seeing.
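To make the rollback idea concrete, here is a minimal sketch of a decision rule that looks at the type of errors rather than just their count. It does not use Pixie's API. The input is simply a list of HTTP status codes observed for a new release, and the function name and the 5% threshold are assumptions chosen for illustration.

```python
from typing import Iterable


def should_roll_back(
    status_codes: Iterable[int],
    max_server_error_rate: float = 0.05,
) -> bool:
    """Decide whether a release looks unhealthy based on error types.

    Hypothetical rule: only 5xx responses count against the release,
    because 4xx responses usually reflect client mistakes rather than
    a bad deployment. The threshold is an assumed value.
    """
    codes = list(status_codes)
    if not codes:
        # No traffic observed yet, so there is no evidence for a rollback.
        return False
    server_errors = sum(1 for code in codes if 500 <= code <= 599)
    return server_errors / len(codes) > max_server_error_rate
```

A release returning 10% server errors would trigger a rollback under this rule, while a release seeing 10% "404 Not Found" responses would not. Distinguishing the two requires the request-level detail that a data plane exposes.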

The Kubernetes data plane provides a unified data API

Kubernetes is well known for decoupling concerns. Infrastructure is decoupled from the application itself. Monoliths are unbundled into microservices, which can be shipped and scaled independently as needed.

While this decoupling is great for building and scaling applications, it makes it tricky to analyze applications or drive automated decisions about them in the control plane. This is another place the Kubernetes data plane comes in. The data plane unifies information across various sources (infrastructure vs application metrics, information about various microservices) into a single API that can be queried and analyzed in a unified way.

Prometheus is a well-adopted project within the Kubernetes community that provides a unified API for metrics. Application and infrastructure metrics from all of your pods, services, nodes, and other resources can be queried in one place. Looking at each of these pieces in isolation often doesn't tell the whole story - the behavior of infrastructure has a huge impact on application performance, and problems in one microservice often cascade to another. Having a unified API provides much better visibility into the overall system and how it is behaving.
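As a small sketch of what "one place" means in practice, the snippet below builds instant-query URLs for the Prometheus HTTP API endpoint `/api/v1/query`. The server address and the choice of metrics are assumptions: `http_requests_total` stands in for an application metric your services might export, and `container_cpu_usage_seconds_total` for a container-level infrastructure metric. The code only constructs the URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed address of a Prometheus server; replace with your own.
PROMETHEUS_URL = "http://prometheus.example:9090"


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Application-level question: request rate per pod (assumed metric name).
app_url = instant_query_url(
    PROMETHEUS_URL, "sum(rate(http_requests_total[5m])) by (pod)"
)

# Infrastructure-level question: CPU usage per pod (assumed metric name).
infra_url = instant_query_url(
    PROMETHEUS_URL, "sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)"
)
```

Both questions go to the same endpoint and can be grouped by the same `pod` label, which is what lets application behavior be lined up against infrastructure behavior.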

The Kubernetes data plane provides a single source of truth across use cases and tools

Use cases like Kubernetes security and observability provide different types of value to different end users, but they generally rely on the same raw data. Currently, the most common solution for this is for each of those use cases to collect and store that exact same data separately.

This is suboptimal for a few reasons. First of all, it is inefficient to collect and store the data in multiple different places. This creates an unnecessary burden on the cluster resources (especially the network). Additionally, you end up with different sources of truth based on the same original observations, which can lead to conflicting information.

The Kubernetes data plane can serve multiple tools and use cases, rather than handling each one individually. Whether it is an observability tool, an API testing tool, or a security use case, the Kubernetes data plane can serve as a single source of truth.
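The sketch below illustrates the single-source-of-truth idea with plain Python: one set of request records is collected once, and both an observability view and a security view are derived from it. The record fields, service names, and allow-list are all hypothetical and do not correspond to any particular tool's schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """One observed request (hypothetical schema for illustration)."""
    source: str
    destination: str
    status: int


def error_counts(records):
    """Observability view: server errors per destination service."""
    return Counter(r.destination for r in records if r.status >= 500)


def unexpected_callers(records, allowed):
    """Security view: (source, destination) pairs outside the allow-list."""
    return {
        (r.source, r.destination)
        for r in records
        if (r.source, r.destination) not in allowed
    }


# Collected once, then shared by both views (made-up sample data).
RECORDS = [
    RequestRecord("frontend", "cart", 200),
    RequestRecord("frontend", "cart", 503),
    RequestRecord("batch-job", "payments", 200),
]
ALLOWED = {("frontend", "cart")}
```

Because both views read the same records, they cannot disagree about what traffic occurred, and the cluster pays the collection cost only once.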

The OpenTelemetry project has made great strides in this direction. By creating a unified format for telemetry data and a unified API for collecting that information, OpenTelemetry makes it possible to analyze the same exact dataset in any tool that supports the format.

The Kubernetes data plane is open source

There are multiple projects today that function as a Kubernetes data plane. They will need to interoperate, using standards such as OpenTelemetry, to fulfill the full potential of the Kubernetes data plane. In many cases they will also need to integrate directly into application code. As a result, the Kubernetes data plane will be built from open source projects. Developers are increasingly resistant to adding vendor-specific code that they might have to rip out and replace next year. Instead, vendors should integrate with the Kubernetes data plane. The most forward-thinking vendors will add value on top of the raw data, while developers maintain a fully open source set of dependencies in their applications.

There are many great projects serving as a Kubernetes data plane. The ones mentioned here (Pixie, Prometheus, and OpenTelemetry) are not a comprehensive list, but a subset of the influential projects in the space. While the Kubernetes data plane is still emerging, it will have a transformational impact on the way we operate Kubernetes clusters.


***To learn more about containerized infrastructure and cloud native technologies, consider joining us at KubeCon + CloudNativeCon Europe 2022, May 16-20.


Zain Asgar, GVP & Product GM, New Relic


Zain Asgar is currently the GVP/GM - Pixie and Open Source at New Relic, through the acquisition of Pixie Labs Inc. where he was the co-founder/CEO. Zain is also an Adjunct Professor of Computer Science at Stanford University and was an Entrepreneur in Residence at Benchmark before co-founding Pixie. He has a PhD from Stanford and has helped build at-scale data and AI/ML at Google AI, Trifacta and Nvidia.

Published Friday, May 06, 2022 7:31 AM by David Marshall