Shifting left is a popular best practice for catching and addressing issues early in the software development and deployment process, instead of waiting until issues multiply later on, closer to or even after deployment. This is especially pertinent in the world of DevOps, where it supports faster deployments, higher-quality code, and greater cost efficiency.
In the context of Kubernetes monitoring, shifting left brings smoother containerized app deployments and a better end-user experience. The key is getting developers involved early in deployment cycles and making a concerted effort as an organization to ensure things run smoothly, rather than placing the onus entirely on quality assurance teams, security analysts, or ITOps.
This image illustrates the relationship between the time invested in quality (on the y-axis) and the project life cycle (on the x-axis). It suggests that investing in quality early in the project life cycle results in fewer security risks and quality issues in the long run.
The practice of shifting left is especially important for container orchestration (and therefore Kubernetes deployments) because you are coding, scaling, and managing an application as one cohesive team. Breaking down organizational silos allows you to pay more attention to the quality of your applications. Operating as one team also helps ensure that applications are built, tested, and deployed in a consistent and reliable manner.
Some benefits of shifting left include:
- Improved customer experience in your application due to fewer quality issues.
- Faster time to market by reducing the risk of delays with a more polished end product.
- Tighter security by breaking down organizational silos so that developers mitigate threats and bad actors earlier in the app deployment process.
DevSecOps and Kubernetes Monitoring
The ephemeral nature of Kubernetes infrastructure creates challenges for an organization that can translate into security risks.
Kubernetes is a sprawling platform composed of many parts. Each of those components carries its own security issues and risks.
Here's a rundown of the main attack vectors in a Kubernetes environment:
- Containers: Containers can contain malicious code that was
included in their container image. They can also be subject to
misconfigurations that allow attackers to gain unauthorized access under
certain conditions.
- Host operating systems: Vulnerabilities or malicious code
within the operating systems installed on Kubernetes nodes can provide
attackers with a path into Kubernetes clusters.
- Container runtimes: Kubernetes supports various container
runtimes, each of which may contain vulnerabilities that allow attackers to
take control of individual containers, escalate attacks from one container to
another, and even gain control of the Kubernetes environment itself. Kubernetes
has no native way to detect or alert you if a vulnerability exists within your
runtime, or if an attacker is trying to exploit one.
- Network layer: Kubernetes relies on internal networks to
facilitate communication between nodes, pods, and containers. It also typically
exposes applications to public networks so that they can be accessed over the
Internet. Both network layers could allow attackers to gain access to the
cluster, or, as before, escalate attacks from one part to another.
- API: The Kubernetes API, which plays a central role in
allowing components to communicate and apply configurations, could contain
vulnerabilities or misconfigurations that enable attacks. Beyond following any
RBAC and security policy settings that you define, Kubernetes does nothing to
detect or respond to API abuse.
- Management tools: Kubectl, Dashboard, Helm, and other
Kubernetes management tools might be subject to vulnerabilities that allow
abuse on a Kubernetes cluster.
Built-in Kubernetes security features
Kubernetes offers native security functions to protect against the threats described above, or at least to mitigate the potential impact of a breach. The main security features offered by Kubernetes include:
- Role-based access control (RBAC): Kubernetes allows admins to
define which users can access which resources within a namespace or an entire
cluster. Modern security best practices dictate that all tools that you are
using for deployment orchestration offer RBAC support.
- Pod security policies and network policies: Admins can
configure pod security policies and network policies, which restrict how
containers and pods behave. For example, pod security policies can be used to
prevent containers from running as the root user, and network policies can
restrict communication between pods. Note that the PodSecurityPolicy API was
removed in Kubernetes 1.25; newer clusters apply comparable pod-level
restrictions through Pod Security Admission.
- Network encryption: Kubernetes uses Transport Layer Security
(TLS) to encrypt network traffic, such as communication with the API server,
providing a safeguard against eavesdropping. This cryptographic protocol is a
standard security best practice and is widely used to secure HTTPS, email, and
messaging platforms.
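A rule such as "containers must not run as root" can also be checked before a manifest ever reaches the cluster, which is the shift-left version of the same control. The following is a minimal sketch in Python, assuming the pod manifest has already been parsed into a dictionary (for example, from YAML). The function name `find_root_risks` and the sample manifest are hypothetical; the only Kubernetes fields it inspects are the standard `securityContext` settings `runAsNonRoot`, `runAsUser`, and `privileged`. It is an illustration, not a substitute for admission control.

```python
from typing import Any, Dict, List


def find_root_risks(pod: Dict[str, Any]) -> List[str]:
    """Return findings for containers that may run as root or privileged."""
    findings: List[str] = []
    spec = pod.get("spec", {})
    pod_ctx = spec.get("securityContext") or {}

    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}

        # Container-level settings take precedence over pod-level settings.
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        run_as_user = ctx.get("runAsUser", pod_ctx.get("runAsUser"))

        if ctx.get("privileged"):
            findings.append(f"{name}: privileged container")
        if run_as_user == 0:
            findings.append(f"{name}: runAsUser is 0 (root)")
        elif not non_root and run_as_user is None:
            findings.append(f"{name}: runAsNonRoot is not enforced")
    return findings


# Hypothetical manifest used only to exercise the check.
sample_pod = {
    "metadata": {"name": "demo"},
    "spec": {
        "securityContext": {"runAsNonRoot": True},
        "containers": [
            {"name": "web", "image": "example/web:1.0"},
            {"name": "sidecar", "image": "example/sidecar:1.0",
             "securityContext": {"runAsUser": 0}},
            {"name": "agent", "image": "example/agent:1.0",
             "securityContext": {"privileged": True}},
        ],
    },
}

for finding in find_root_risks(sample_pod):
    print(finding)
```

A check like this could run in a CI pipeline so that developers see the finding at review time rather than after deployment.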
While these built-in Kubernetes security functions provide layers of defense against certain attacks, they do not cover all threats. Kubernetes runs environments declaratively and offers no native protection against the following types of attacks:
- Malicious code or misconfigurations inside containers or
container images: To scan for these, you would have to use a third-party
container scanning tool.
- Shadow IT deployments or changes: Simply not going through
your company's proper change management system and bypassing compliance will
cause significant Kubernetes security challenges.
- Security vulnerabilities on host operating systems: Although
some Kubernetes distributions, like OpenShift, integrate SELinux or similar
kernel-hardening frameworks to provide more security at the host level, this is
not a feature of Kubernetes itself.
Kubernetes log analysis
While Kubernetes does have some built-in security features, logs can help identify system vulnerabilities that those features do not surface. One of the most important aspects of the DevSecOps model is to begin implementing security measures as early as possible in the development cycle. This is essentially a continuation of the "shift-left" approach that's common in modern development philosophy.
Logs are critical for maintaining Kubernetes infrastructure and securing modern applications. By shifting left, your DevSecOps team continuously evaluates your application and logs information, which builds a more secure foundation, helps identify issues, and makes audits easier during development or production. Log analysis software can help highlight vulnerabilities for your security analysts to investigate.
As time goes on and your organization ships multiple releases of its application(s), the DevSecOps team will become more efficient, and secure development practices will become habitual, weeding out security flaws before they turn up in production. In this way, you will improve application security with each subsequent release.
Log files can reveal lapses in application security that occur throughout the development process or even after deployment to production. This is where log analysis software can show significant value. It is not possible for humans to manually read every massive log file produced while the application is being tested or used in production, but automated analysis of Kubernetes logs can highlight vulnerabilities for DevOps teams to investigate further.
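As one illustration of what automated log analysis can look like, the sketch below counts denied API requests per user in Kubernetes audit logs. It assumes audit logging is enabled and written as one JSON event per line, and it reads the `user.username`, `verb`, `objectRef.resource`, and `responseStatus.code` fields of the audit event format. The function name `count_forbidden` and the sample lines are hypothetical, and real log analysis software does considerably more than this.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple

DeniedKey = Tuple[str, str, str]


def count_forbidden(lines: Iterable[str]) -> List[Tuple[DeniedKey, int]]:
    """Count HTTP 403 responses per (user, verb, resource), most frequent first."""
    denied: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        if (event.get("responseStatus") or {}).get("code") != 403:
            continue
        user = (event.get("user") or {}).get("username", "<unknown>")
        verb = event.get("verb", "<unknown>")
        resource = (event.get("objectRef") or {}).get("resource", "<unknown>")
        denied[(user, verb, resource)] += 1
    return denied.most_common()


# Made-up audit events for demonstration only.
sample_lines = [
    '{"verb": "get", "user": {"username": "system:serviceaccount:dev:ci"}, '
    '"objectRef": {"resource": "secrets"}, "responseStatus": {"code": 403}}',
    '{"verb": "list", "user": {"username": "alice"}, '
    '"objectRef": {"resource": "pods"}, "responseStatus": {"code": 200}}',
    '{"verb": "get", "user": {"username": "system:serviceaccount:dev:ci"}, '
    '"objectRef": {"resource": "secrets"}, "responseStatus": {"code": 403}}',
    "not json",
]

for key, count in count_forbidden(sample_lines):
    print(count, *key)
```

Repeated denials from the same service account against a sensitive resource are the kind of pattern a security analyst would want surfaced for investigation.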
The role of OpenTelemetry in DevSecOps
Logs alone offer a flat interface that is great for analyzing system history, but what would you do in the case of thousands of pods or hundreds of clusters? This data needs to be parsed and analyzed. Moreover, you need metrics to serve as the smoke detector that signals a problem, and traces to show where in your infrastructure the problem occurred. Your team can best correlate these signals by adopting the OpenTelemetry standard.
With OTel-native capabilities in observability software, DevSecOps teams can go beyond the standard correlation and detection with metrics and logs. By enabling distributed tracing, teams can use automatic correlation to help DevOps practitioners understand the exact data pathway any request takes.
Distributed tracing enables you to trace a specific request that caused an initial issue, and to monitor the progress of each step of the request fulfillment when the metrics indicate a problem. The trace is a collection of spans arranged in the order they occurred, enabling you to follow the request through every step it took. By understanding the sequence of events from the initial occurrence to the final outcome, including the indicated problem, you can determine precisely which part of your application needs attention and why.
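The idea of a trace as an ordered collection of spans can be sketched in a few lines of Python. The `Span` class below is a deliberately simplified, hypothetical stand-in (real OpenTelemetry spans also carry trace and span identifiers, parent links, attributes, and a status), and the sample trace is invented. The origin heuristic is an assumption made for illustration: because errors usually propagate upward to the calling span, the failing span that finished first is treated as the most likely source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    name: str
    start_ms: int
    end_ms: int
    error: bool = False


def order_spans(spans: List[Span]) -> List[Span]:
    """Arrange spans in the order they started."""
    return sorted(spans, key=lambda s: s.start_ms)


def likely_origin(spans: List[Span]) -> Optional[Span]:
    """Heuristic: the failing span that ended first is the likely origin."""
    failing = [s for s in spans if s.error]
    return min(failing, key=lambda s: s.end_ms) if failing else None


# Invented trace for a single checkout request.
trace = [
    Span("db.query", 40, 180, error=True),
    Span("http.request /checkout", 0, 200, error=True),
    Span("auth.verify", 5, 30),
    Span("cart.load", 31, 39),
]

for span in order_spans(trace):
    status = "ERROR" if span.error else "ok"
    print(f"{span.start_ms:>4} ms  {span.name}  {status}")

origin = likely_origin(trace)
print("likely origin:", origin.name if origin else "none")
```

Walking the ordered spans shows the path the request took, and the heuristic points at the database call rather than the outer HTTP request that merely reported the failure.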
Furthermore, automatic data correlation of metrics, events, logs, and traces enables DevOps practitioners and security analysts to use mathematical models to identify and predict potential issues with their Kubernetes infrastructure. This allows these teams to potentially automate resolution from known playbooks. Example use cases include providing enriched data, monitoring and troubleshooting pod issues, optimizing and tracking resource utilization, analyzing or providing insights into system alerts, and even enabling self-healing capabilities.
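One very simple example of such a mathematical model is a rolling z-score, which flags a metric sample that deviates sharply from its recent history. The sketch below is only an illustration of the idea: the function name `anomalous_indexes`, the window size, the threshold, and the pod restart counts are all assumptions, and production systems typically use more robust models.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def anomalous_indexes(values: Sequence[float], window: int = 5,
                      threshold: float = 3.0) -> List[int]:
    """Flag samples that deviate strongly from the preceding window."""
    flagged: List[int] = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical pod restarts per minute; the spike is at index 7.
restarts_per_minute = [0, 1, 0, 1, 0, 1, 0, 9, 1, 0]
print(anomalous_indexes(restarts_per_minute))  # prints [7]
```

A flagged sample is the "smoke detector" going off; the correlated logs and traces are then what tell you where to look.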
As you adopt a shift-left approach, there is much to consider in Kubernetes infrastructure monitoring, and OpenTelemetry helps put best practices within reach. OTel standardization allows for vendor-neutral trace formats and common SpanEvent formats that can be used in big data tools to produce in-depth analysis from the data collection pipeline. Moreover, if both spans and logs are transmitted natively over the OpenTelemetry Protocol (OTLP), a variety of data types can be analyzed together in greater depth, enabling everyone on a DevSecOps team to shift left and find precisely where issues are occurring throughout the deployment and release process.
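To make the correlation of spans and logs concrete, here is a minimal sketch that attaches log messages to the span they were emitted under. It relies on the fact that OpenTelemetry log records can carry trace and span identifiers; the dictionary keys (`trace_id`, `span_id`, `body`), the function names, and the sample data are simplified assumptions rather than the actual OTLP wire format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SpanKey = Tuple[str, str]


def group_logs_by_span(logs: List[dict]) -> Dict[SpanKey, List[str]]:
    """Group log bodies by the (trace_id, span_id) they were emitted under."""
    grouped: Dict[SpanKey, List[str]] = defaultdict(list)
    for record in logs:
        trace_id, span_id = record.get("trace_id"), record.get("span_id")
        if trace_id and span_id:  # logs without trace context are skipped
            grouped[(trace_id, span_id)].append(record.get("body", ""))
    return dict(grouped)


def attach_logs(spans: List[dict], logs: List[dict]) -> List[dict]:
    """Return copies of the spans, each with its correlated log messages."""
    grouped = group_logs_by_span(logs)
    return [
        {**span, "logs": grouped.get((span["trace_id"], span["span_id"]), [])}
        for span in spans
    ]


# Invented spans and logs for a single request.
spans = [
    {"trace_id": "t1", "span_id": "s1", "name": "http.request /checkout"},
    {"trace_id": "t1", "span_id": "s2", "name": "db.query"},
]
logs = [
    {"trace_id": "t1", "span_id": "s2", "body": "connection pool exhausted"},
    {"trace_id": "t1", "span_id": "s2", "body": "query timed out"},
    {"body": "node heartbeat ok"},
]

for span in attach_logs(spans, logs):
    print(span["name"], span["logs"])
```

With both signals keyed on the same identifiers, a team can move from a slow or failing span straight to the log lines that explain it.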
To learn more about the transformative nature of cloud native applications and open source software, join us at KubeCon + CloudNativeCon Europe 2023, hosted by the Cloud Native Computing Foundation, which takes place from April 18-21.
ABOUT THE AUTHOR
Melissa Sussmann, Lead Technical Advocate, Sumo Logic
Melissa Sussmann is a developer advocate with 11+ years of domain expertise and experience as an engineer, product manager, and product marketing manager for developer tools. She is currently the Lead Technical Advocate at Sumo Logic. In her spare time, Melissa enjoys gardening, reading, playing with her dog, and working on side projects. Some projects she enjoys include running nodes on the lightning network, writing smart contracts, running game servers, building and tinkering with dev kits, and woodworking.