Alexandre Vazquez
Debugging Distroless Containers: kubectl debug, Ephemeral Containers, and When to Use Each


Why Distroless Breaks the Normal Debugging Workflow

Traditional container debugging assumes shell access with standard tools like ps, netstat, and curl. Distroless images intentionally exclude these utilities to reduce attack surface and CVE count. This creates an operational challenge: when something goes wrong, the troubleshooting tools you would normally reach for simply do not exist inside the container.

Kubernetes addresses this through ephemeral containers, stabilized in version 1.25, which enable temporary debug containers to be injected into running pods.

Option 1: kubectl debug with Ephemeral Containers

The canonical solution uses ephemeral containers to inject a debug container sharing the target pod's network and process namespaces without modifying the original container or restarting the pod.
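Under the hood, kubectl debug patches the pod through the pods/ephemeralcontainers subresource; the result is an extra entry in the pod spec. A simplified sketch of what that patch produces (the pod and container names are illustrative, and kubectl generates the debugger name):

```yaml
# Simplified view of the patched pod spec after kubectl debug runs.
spec:
  ephemeralContainers:
  - name: debugger-x7k2p              # auto-generated by kubectl debug
    image: busybox:latest
    targetContainerName: my-container # share this container's process namespace
    stdin: true
    tty: true
```

Because this is a subresource update rather than a pod spec replacement, the running application container is never restarted.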

Basic invocation:

kubectl debug -it my-pod \
  --image=busybox:latest \
  --target=my-container

The --target flag shares the process namespace of the specified container, enabling inspection via ps aux and /proc/ access.
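Once attached, the usual /proc spelunking works against the target process. A sketch of typical first steps; in the shared PID namespace the application is usually PID 1, but TARGET_PID defaults here to the current shell's own PID so the snippet is safe to dry-run anywhere:

```shell
# Inside the debug container, set TARGET_PID=1 (or whatever ps aux shows)
# to inspect the application process instead of this shell.
TARGET_PID=${TARGET_PID:-$$}

command -v ps >/dev/null && ps aux          # all processes in the shared namespace
tr '\0' '\n' < "/proc/$TARGET_PID/environ"  # environment variables, one per line
ls -l "/proc/$TARGET_PID/fd"                # open file descriptors
head "/proc/$TARGET_PID/maps"               # memory mappings
```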

For network diagnostics, use a richer image:

kubectl debug -it my-pod \
  --image=nicolaka/netshoot \
  --target=my-container

Capabilities and Limitations

Ephemeral containers provide:

  • Full network namespace visibility
  • Process inspection via /proc/ (open files, environment variables, memory maps)
  • Pod-level DNS resolution access
  • Outbound network calls from the pod's network context

Ephemeral containers do not provide:

  • Direct application container filesystem access
  • Removal once created (ephemeral containers persist in the pod spec for the pod's lifetime)
  • Volume mount modifications via CLI
  • Resource limits support in the kubectl debug CLI

Accessing the Application Filesystem

The workaround for filesystem access uses the /proc filesystem:

# Browse via /proc
ls /proc/1/root/app/
cat /proc/1/root/etc/config.yaml

# Or chroot into the application's filesystem
chroot /proc/1/root /bin/sh

The /proc/<pid>/root symlink provides read access to the container's filesystem.
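The trick is easy to verify outside a cluster too. This sketch uses the current shell's own PID as a stand-in; inside the debug container you would substitute the application's PID (typically 1):

```shell
# /proc/<pid>/root resolves to the root filesystem as that process sees it.
# Using our own PID here for a safe local demonstration.
pid=$$
ls "/proc/$pid/root/etc" | head    # browse the filesystem that process sees
readlink "/proc/$pid/root"         # prints "/" for a non-chrooted process
```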

RBAC Requirements

Ephemeral containers require the pods/ephemeralcontainers subresource permission, separate from pods/exec:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ephemeral-debugger
rules:
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]

In production, scope this tightly with time-limited bindings and approval workflows.

Option 2: kubectl debug --copy-to (Pod Copy Strategy)

The --copy-to flag creates a full pod copy with modifications:

kubectl debug my-pod \
  -it \
  --copy-to=my-pod-debug \
  --image=my-app:debug \
  --share-processes

This creates a new pod with the container image replaced. Add a debug container alongside the original:

kubectl debug my-pod \
  -it \
  --copy-to=my-pod-debug \
  --image=busybox \
  --share-processes \
  --container=debugger

Limitations

The copy strategy does not debug the original pod because:

  • It lacks the original pod's in-memory state
  • It creates a new Pod UID, potentially triggering different admission policies
  • For crashing pods, the copy will also crash unless the entrypoint is modified

For crash debugging, combine with a modified entrypoint:

kubectl debug my-crashing-pod \
  -it \
  --copy-to=my-pod-debug \
  --image=busybox \
  --share-processes \
  -- sleep 3600

Option 3: Debug Image Variants

Maintain a debug variant of your application image including shell tooling. Google distroless images provide :debug tags with BusyBox:

# Production image
FROM gcr.io/distroless/java17-debian12

# Debug variant
FROM gcr.io/distroless/java17-debian12:debug

Chainguard images follow a similar pattern with :latest-dev variants that include apk and shell:

# Production
FROM cgr.dev/chainguard/go:latest

# Development/debug
FROM cgr.dev/chainguard/go:latest-dev

For custom images, use multi-stage builds:

FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

FROM gcr.io/distroless/static-debian12 AS production
COPY --from=builder /app/myapp /myapp
ENTRYPOINT ["/myapp"]

FROM gcr.io/distroless/static-debian12:debug AS debug
COPY --from=builder /app/myapp /myapp
ENTRYPOINT ["/myapp"]

Build both targets and push my-app:${VERSION} (production) and my-app:${VERSION}-debug (debug) to your registry.
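A hypothetical helper for building both targets; the registry path and tag below are placeholders. It prints the commands rather than running them, so you can review the output before piping it to sh:

```shell
# Hypothetical helper: emits the build and push commands for both variants
# of the multi-stage Dockerfile above. Pipe the output to sh to execute.
print_build_commands() {
  image=$1
  version=$2
  echo "docker build --target production -t $image:$version ."
  echo "docker build --target debug -t $image:$version-debug ."
  echo "docker push $image:$version"
  echo "docker push $image:$version-debug"
}

print_build_commands registry.example.com/my-app 1.4.2
```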

Security Considerations

Debug image variants undermine distroless security benefits if deployed to production. Track usage carefully, require explicit approval, and ensure removal after debugging.

Option 4: cdebug

cdebug is an open-source CLI tool that simplifies ephemeral container debugging:

# Install
brew install cdebug

# Debug a running pod
cdebug exec -it my-pod

# Specify namespace and container
cdebug exec -it -n production my-pod -c my-container

# Use specific debug image
cdebug exec -it my-pod --image=nicolaka/netshoot

cdebug adds:

  • Automatic filesystem chroot to the target container's filesystem
  • Docker container integration (cdebug exec)
  • No RBAC complications for Docker-based local development

The tradeoff is that it requires third-party tooling installation.

Option 5: Node-Level Debugging

For issues that ephemeral containers cannot address—pod crashing too fast, kernel-level problems, or tools requiring elevated privileges—node-level debugging provides direct container access from the host node:

kubectl debug node/my-node-name \
  -it \
  --image=nicolaka/netshoot

From the privileged pod, use nsenter to enter container namespaces:

# Find the container's PID
crictl ps | grep my-container
crictl inspect <container-id> | grep -i pid

# Enter the container's namespaces
nsenter -t <pid> -m -u -i -n -p -- /bin/sh

# Enter only the network namespace
nsenter -t <pid> -n -- ip a

This approach enables running strace and other kernel-level tools:

# Trace network-related syscalls of the application process
nsenter -t <pid> -- strace -p <pid> -f -e trace=network

RBAC and Security

Node-level debugging requires nodes/proxy and ability to create privileged pods. The debug pod runs with hostPID: true and hostNetwork: true, providing visibility into all node processes. Treat this as a break-glass procedure with dual approval, complete audit logging, and immediate cleanup.
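For reference, kubectl debug node/... generates a pod roughly like the following sketch (image and node name taken from the example above; exact fields vary by kubectl version, and the privileged flag shown here is an assumption needed for nsenter into arbitrary namespaces):

```yaml
# Rough sketch of a node-level debug pod; kubectl debug node/... produces
# something similar, mounting the host filesystem at /host.
apiVersion: v1
kind: Pod
metadata:
  name: node-debugger
spec:
  nodeName: my-node-name
  hostPID: true
  hostNetwork: true
  restartPolicy: Never
  containers:
  - name: debugger
    image: nicolaka/netshoot
    command: ["sleep", "3600"]
    securityContext:
      privileged: true          # needed for nsenter into other namespaces
    volumeMounts:
    - name: host-root
      mountPath: /host
  volumes:
  - name: host-root
    hostPath:
      path: /
```

Delete this pod as soon as the investigation ends; it grants effectively root-on-node access.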

Choosing the Right Approach: Access Profile Matrix

| Scenario | Technique | Requirement |
| --- | --- | --- |
| Active production incident, pod running | kubectl debug + ephemeral container | pods/ephemeralcontainers RBAC, k8s 1.25+ |
| Pod crashing too fast to attach | kubectl debug --copy-to + modified entrypoint | Ability to create pods in namespace |
| Developer debugging in dev/staging | cdebug exec or kubectl debug | pods/ephemeralcontainers or pod create |
| Need full filesystem access | kubectl debug --copy-to + debug image variant | Debug image in registry, pod create |
| Need strace or kernel tracing | Node-level debug with nsenter | nodes/proxy, cluster admin equivalent |
| Network packet capture | kubectl debug + nicolaka/netshoot | pods/ephemeralcontainers |
| Local Docker debugging | cdebug exec | Docker socket access |
| CI-reproducible debug environment | Debug image variant in separate build target | Separate image tag in registry |

Developer — Local or Development Cluster

Goal: Reproduce bugs, inspect configuration, verify service connectivity.
Approach: Debug image variants or cdebug.

Speed and iteration take priority. Build the debug variant and deploy it directly, or use cdebug exec for automatic filesystem root access.

Developer — Staging Cluster

Goal: Debug integration issues and environment-specific behavior.
Approach: kubectl debug with ephemeral containers (--target), scoped to own namespace.

Grant developers pods/ephemeralcontainers in their team's namespaces for self-service debugging without ops involvement.

Platform Engineer / SRE — Production

Goal: Diagnose live production incidents while minimizing risk.
Approach: kubectl debug with ephemeral containers.

Ephemeral containers satisfy production requirements:

  • They are recorded in API audit logs (who, when, which pod)
  • They do not modify the running application container
  • They are limited to the pod's network and process namespaces

Avoid --copy-to in production incidents because it creates a pod that may not exhibit the issue and adds load during an incident.

Platform Engineer — Production, Node-Level Issue

Goal: Diagnose kernel-level issues, container runtime problems, or multi-pod networking issues.
Approach: Node-level debug pod with nsenter. Treat as break-glass.

Create a dedicated RBAC role that grants nodes/proxy access only on-demand with separate authentication and time-limited bindings. Log all access.

Common Errors and Solutions

"ephemeral containers are disabled for this cluster"

Ephemeral containers are stable and always on from Kubernetes 1.25. On older clusters (the feature shipped as an alpha gate in 1.16), the EphemeralContainers feature gate must be explicitly enabled.

"cannot update ephemeralcontainers" (RBAC)

You have pods/exec but lack pods/ephemeralcontainers. These are separate subresources.

"container not found" with --target

The container name in --target must match exactly. Verify with:

kubectl get pod my-pod -o jsonpath='{.spec.containers[*].name}'

Can see processes but cannot read /proc/1/root

The ephemeral container may lack the SYS_PTRACE capability required to access another process's /proc entries. Add it explicitly in the debug container's securityContext (note that restrictive Pod Security Standards profiles may reject added capabilities, so debug namespaces may need a more permissive profile):

securityContext:
  capabilities:
    add:
    - SYS_PTRACE

tcpdump shows no traffic

Use tcpdump -i any to capture on all interfaces including loopback, where inter-container traffic travels.

Production RBAC Design

Separate three privilege tiers:

Tier 1: Developer self-service (team namespaces)

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: distroless-debugger
  namespace: team-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]

Tier 2: SRE production incident access (all namespaces)

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: sre-distroless-debugger
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]

Tier 3: Break-glass node access (time-limited binding recommended)

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-debugger
rules:
- apiGroups: [""]
  resources: ["nodes/proxy"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "get", "list", "delete"]

Bind Tier 1 permanently to developers. Bind Tier 2 permanently to SREs with audit alerts on use. Bind Tier 3 only on-demand via a Kubernetes operator creating time-limited RoleBindings—never as a permanent ClusterRoleBinding.
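A sketch of the kind of binding such an operator might create. The name and expiry annotation are hypothetical; RBAC bindings have no native TTL, so expiry must be enforced by the operator (or a cleanup job) that deletes the binding:

```yaml
# Hypothetical time-limited break-glass binding, created and later deleted
# by an operator; Kubernetes itself does not enforce the expiry annotation.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-debugger-jdoe-incident-42
  annotations:
    debug.example.com/expires-at: "2024-06-01T12:00:00Z"  # operator-enforced
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: node-debugger
subjects:
- kind: User
  name: jdoe@example.com
```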

Summary

Distroless containers reduce attack surface and CVEs, forcing a clean separation between application and tooling. Kubernetes provides ephemeral containers and kubectl debug as the clean answer: inject a debug container with necessary tools into the running pod, sharing its network and process namespaces, without restarting or modifying the application.

For scenarios ephemeral containers cannot address—filesystem access, crash debugging, kernel-level investigation—the copy strategy and node-level debug fill remaining gaps. The key to scaling this approach is the access model: developers get self-service ephemeral container access in their namespaces, SREs get cluster-wide ephemeral container access for production incidents, and node-level access is a break-glass procedure with audit trail and time limits.


Originally published at alexandre-vazquez.com/debugging-distroless-containers
