By: Taylor Smith
Kubernetes
is a powerful tool with enough settings to deploy a performant, scalable, and
reliable cloud native application. There are also enough settings so that it's
hard to keep all security and compliance best practices straight. Writing
Kubernetes manifests to create a secure application is not straightforward and
keeping all of the correct security requirements in your head is nearly
impossible. Additionally, there are times when it feels easiest to loosen security constraints temporarily, such as granting a container overly permissive access, only to forget to dial it back before deploying to production.
Overly permissive access leads to risk, but when you add the scalability of IaC into the mix, that risk multiplies quickly. We recently saw this in
the wild with open-source modules, where 47% of publicly-accessible Helm charts
in Artifact Hub contained a
misconfiguration. This means every resource deployed using these charts also
contained a misconfiguration unless the default configuration was updated.
In
addition to analyzing the state of open source Helm charts, we wanted to dig
into the most common misconfigurations found in Kubernetes overall. We took the
results of thousands of security scans of Kubernetes manifests and runtime
environments and aggregated the data to find the most common misconfiguration
at each of the four severity levels (Low, Medium, High, Critical), plus one bonus High severity issue. For each misconfiguration, we'll walk through the issue, why it's a security concern, and how to fix it. Although these are well-documented misconfigurations, they're still common, even showing up in the recommended deployments for many popular services.
Namespaces
in Kubernetes create a logical separation for services that share the same
cluster, creating "virtual" clusters. They are useful for separating services
for security or resource allocation reasons when multiple applications share
the same cluster or if there are multiple stages of applications in the same
cluster (e.g., development, staging, production).
It shouldn't be any surprise that the most common Low misconfiguration (and in fact the most common across all severities) is using the default namespace. This isn't a bad thing for a single application, but in shared clusters, you lose the logical separation if everything is deployed to the default namespace. This makes it easier for a bad actor to access other services, or for one service to hog resources needed by another team. Namespaces, along with other settings like resource limits, create those boundaries.
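As an aside, the resource side of that boundary is usually enforced per namespace with a ResourceQuota. A minimal sketch, assuming a namespace called development (which we'll create below) and purely illustrative limits:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: development-quota    # illustrative name
  namespace: development     # the quota applies only to this namespace
spec:
  hard:
    requests.cpu: "4"        # caps on total CPU/memory requests in the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi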
This could be the most common misconfiguration for a variety of reasons. First, it could be the power of defaults, where people simply apply a YAML file without adding a namespace. Second, it could be that many of the manifests scanned didn't include a namespace, but the namespace was defined when applying the YAML (kubectl apply -f pod.yaml --namespace=namespace1). The best practice, however, is to include it in the YAML file to avoid accidentally deploying to the default namespace. Third, if a cluster isn't shared or there aren't concerns about services talking to each other or hogging resources, there isn't much need for custom namespaces.
To
fix this, create a namespace if you don't already have one. You can do this
using the CLI, but let's follow the declarative path and create a development-namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: development
  labels:
    name: development
Then apply that YAML with kubectl apply -f development-namespace.yaml.
Next,
add a namespace to the metadata section of the pod, secrets,
ConfigMap, etc. manifest with the following lines:
metadata:
+  namespace: development
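For example, a minimal Pod manifest with the namespace set explicitly might look like the following (the pod name and image are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: example-app          # placeholder name
  namespace: development     # deploys into the custom namespace instead of default
spec:
  containers:
  - name: app
    image: nginx:1.21        # placeholder image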
allowPrivilegeEscalation is part of the Pod Security Policies deprecated in Kubernetes 1.21 and set to be removed in 1.25. There will be replacements, but until then, it's worth addressing the most common Medium severity issue: containers running with allowPrivilegeEscalation enabled.
The allowPrivilegeEscalation setting controls whether a process in the container can gain more privileges than its parent, which is what makes running in privileged mode possible. If enabled, this allows the container to access the host just like a host process, instead of an isolated container process. That means that if a bad actor gets access to that container, they can exploit the host, for example by monitoring traffic or leveraging the CRI to spin up a cryptojacking container.
If a container absolutely needs access to host
capabilities, such as the host network or filesystem, set those capabilities
individually. However, this really should not be necessary in most cases.
To resolve the issue, add the securityContext with allowPrivilegeEscalation set to false in
the container spec for a pod.
spec:
  containers:
  - name: <container name>
    securityContext:
+     allowPrivilegeEscalation: false
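If a workload really does need host access as mentioned above, a sketch of granting it explicitly rather than through privileged mode might look like this (the host network flag, mount path, and read-only log mount are purely illustrative):
spec:
  hostNetwork: true            # only if the workload truly needs the host's network namespace
  containers:
  - name: <container name>
    image: <image>
    securityContext:
      allowPrivilegeEscalation: false
    volumeMounts:
    - name: host-logs
      mountPath: /host/var/log
      readOnly: true           # mount only what is needed, read-only where possible
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log           # illustrative host path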
The most common High misconfiguration is directly related to the previous issue, but it is an even more direct setting: running containers as privileged. In fact, setting a container to privileged or granting it CAP_SYS_ADMIN permissions will automatically set allowPrivilegeEscalation to true. The effect is the same: the container is spun up with root access and no cgroup limits, so any successful exploit of that container will have complete access to the host. What's interesting about this setting is that it is set to false by default. That means enough users are turning on privileged containers to make it our most common High severity violation. This is likely due to certain containers needing a few Linux capabilities, such as a monitoring agent needing CAP_NET_RAW access, and it being easier to just grant that agent root access.
The more secure way to run a container is without privileged access and with resources limited by cgroups. If your container absolutely must have access to certain kernel capabilities, such as CAP_NET_ADMIN for network monitoring, grant them one by one rather than in a bundle, as shown in the sketch after the fix below.
spec:
  containers:
  - name: <container name>
    image: <image>
    securityContext:
-     privileged: true
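Here is a sketch of granting capabilities one at a time instead of full privilege: drop everything, then add back only what the workload needs (NET_ADMIN is just an example):
spec:
  containers:
  - name: <container name>
    image: <image>
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL                  # start from nothing
        add:
        - NET_ADMIN            # add back only the specific capability required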
Checkov only has a few Critical policies for Kubernetes, so you know they are especially important. A majority of these policies are for self-deployed Kubernetes instances, where you have even more ways to customize the control plane and therefore more ways to create a security misconfiguration. Insecurely deploying the Kubernetes API server or the kubelet API opens clusters to attacks that can grant bad actors complete control over your cluster to do things like spin up cryptojacking containers or steal sensitive data.
The most common Critical misconfiguration is setting --kubelet-https to false. The --kubelet-https flag ensures that traffic between the Kubernetes API server and the kubelets is encrypted. By default, it is turned on, and it isn't even exposed in the managed offerings from the big cloud providers. We've run into clusters where this setting is turned off, mostly in development environments. If traffic is not encrypted, it's subject to man-in-the-middle attacks.
To keep this turned on, either omit the flag or, better yet, explicitly set it to true. You can add it as a flag at startup or set it declaratively in the command section of the kube-apiserver pod spec:
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
+   - kube-apiserver
+   - --kubelet-https=true
We've already gone through one vanilla
Kubernetes High severity
misconfiguration, so let's do one for a managed Kubernetes offering.
When you configure an EKS cluster on Amazon,
the API endpoint is, by default, public to the world. It still requires IAM
permissions, but many organizations prefer a layered approach, limiting access
to the API endpoint to a bastion host or VPN.
To lower the attack surface, you can either limit the CIDR blocks that can access the public API endpoint or set the API endpoint to private. For the latter option, if you use the official CLI eksctl, update your cluster with eksctl utils update-cluster-endpoints --name=<clustername> --private-access=true --public-access=false. If you use Terraform, add these lines to your code:
module "eks_cluster" {
...
+
cluster_endpoint_private_access = true
+
cluster_endpoint_public_access =
false
}
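If you would rather keep the eksctl side declarative too, the equivalent settings can live in a ClusterConfig file. A sketch, assuming the eksctl schema's vpc.clusterEndpoints and publicAccessCIDRs fields and a placeholder CIDR block:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: <cluster name>
  region: <region>
vpc:
  clusterEndpoints:
    privateAccess: true
    publicAccess: false        # or leave public access on and restrict it below
  publicAccessCIDRs:
  - "203.0.113.0/24"           # placeholder; only relevant while the public endpoint is enabled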
Hardening Kubernetes clusters against exploits
These are just five of the common
misconfigurations we've found. Kubernetes is powerful and has the capabilities
to run secure workloads if you configure it right. Use Checkov to identify these misconfigurations and more in your Kubernetes manifests, and join us for a CNCF-hosted webinar where we walk through securing a manifest live.
To hear more about cloud native topics, join the Cloud Native Computing Foundation and cloud native community at KubeCon+CloudNativeCon North America 2021, October 11-15, 2021.
ABOUT THE AUTHOR
Taylor Smith, Sr. Product Marketing
Manager, Bridgecrew
Taylor is a senior product marketing manager for Bridgecrew
by Prisma Cloud at Palo Alto Networks. He helps customers integrate security
into DevOps practices to secure the entire cloud native stack. Previously, he
held product marketing and strategy positions at Gremlin, Cisco and NetApp.