How to Resolve Kubernetes Job Hanging with Sidecar Containers

Introduction

I ran into a problem where a Job that uses Cloud SQL Proxy as a sidecar container never completed, so here is how I resolved it. The Job does not complete because the sidecar process keeps running after the main process has finished.
The fix is to signal the sidecar to terminate once the main process finishes.

Solution

There are at least two approaches.

  1. Share a volume between the containers. When the main process finishes, it creates a file in that volume. The sidecar container runs a loop that checks for that file and terminates its own process once the file appears.
  2. Share the PID namespace between the containers in the Pod, and kill the sidecar container's process from the main container.
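
Method 1 can be sketched roughly as the two shell functions below. This is only a minimal sketch under some assumptions: the function names `run_sidecar` and `run_main`, the sentinel file name `main-finished`, and the one-second polling interval are all hypothetical, and the sidecar image is assumed to contain a shell. In a Pod, the shared directory would typically be an `emptyDir` volume mounted in both containers.

```shell
#!/bin/sh
# Sketch of method 1: a sentinel file in a shared volume tells the sidecar to stop.
# Function names and the file name "main-finished" are hypothetical.

# Sidecar side: start the long-running process, then poll for the sentinel file.
# Usage: run_sidecar <shared_dir> <command> [args...]
run_sidecar() {
    shared_dir=$1
    shift
    "$@" &                       # start the sidecar process in the background
    sidecar_pid=$!
    # Poll until the main container reports that it has finished.
    while [ ! -f "$shared_dir/main-finished" ]; do
        sleep 1
    done
    # A process started with "&" from a non-interactive sh ignores SIGINT,
    # so this sketch sends the default SIGTERM instead.
    kill "$sidecar_pid"
    wait "$sidecar_pid" 2>/dev/null || true
    return 0
}

# Main side: run the job, then create the sentinel file whether or not it succeeded.
# Usage: run_main <shared_dir> <command> [args...]
run_main() {
    shared_dir=$1
    shift
    status=0
    "$@" || status=$?
    touch "$shared_dir/main-finished"
    return "$status"             # keep the job's own exit status
}
```

With something like this, the sidecar container would run `run_sidecar /shared <proxy command>` and the main container would run `run_main /shared <your migration commands>`, where `/shared` is an assumed mount path for the shared volume.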

I went with the second option this time because it is more concise.

my-db-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-db-job
spec:
  template:
    spec:
      restartPolicy: OnFailure
      shareProcessNamespace: true
      containers:
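      - name: my-db-job-migrations
        image: <your migration image>
        command: ["/bin/sh", "-c"]
        args:
          - |
            # Run the migration first.
            <your migration commands>;
            # Then find the proxy's PID (visible through the shared PID namespace) and send it SIGINT.
            sql_proxy_pid=$(pgrep cloud_sql_proxy) && kill -INT $sql_proxy_pid;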
        securityContext:
          capabilities:
            add:
              - SYS_PTRACE
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command:
          - "/cloud_sql_proxy"
          - "-instances=$(DB_CONNECTION_NAME)=tcp:3306"
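
One detail of the script above: a shell script exits with the status of its last command, so the main container reports the result of `kill` rather than the result of the migration. If you want the container to exit with the migration's status instead, a variation along these lines might work. This is only a sketch; the function name `run_then_stop` is hypothetical, and whether a failing exit status is what you want depends on your `restartPolicy` and retry settings.

```shell
#!/bin/sh
# Sketch: run the main command, signal the sidecar by process name,
# and return the main command's exit status. The function name is hypothetical.
# Usage: run_then_stop <process_name> <command> [args...]
run_then_stop() {
    proc_name=$1                 # name passed to pgrep, e.g. cloud_sql_proxy
    shift
    status=0
    "$@" || status=$?            # remember how the main command ended
    # pgrep may print several PIDs, one per line; $pids is left unquoted
    # on purpose so that each PID becomes a separate argument to kill.
    pids=$(pgrep "$proc_name") && kill -INT $pids || true
    return "$status"
}
```

In the Job above this would correspond to something like `run_then_stop cloud_sql_proxy <your migration commands>` in place of the last two lines of the script.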

The key point is that SYS_PTRACE is added under securityContext.capabilities.
Adding SYS_PTRACE loosens the isolation between containers and lowers the security level, so I am not sure it is recommended, but it does appear in the official documentation.

Share Process Namespace between Containers in a Pod

If that concerns you, method 1 is probably the better choice. A reference article is linked below.

Completing a Kubernetes Job that includes a sidecar container

Summary

This article introduced a workaround for Jobs with a sidecar container that never complete. Using Cloud SQL Proxy means dealing with sidecar-specific issues like this one, and since there are surprisingly few articles on the topic in Japanese, I will keep writing up anything that seems worth sharing.
