Chapter 07

Hook Lifecycle

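Use Helm Hooks to run custom operations at key points such as install, upgrade, and delete: database migrations, configuration initialization, and smoke tests are all covered.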

Hook Types and When They Run

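A Hook is declared by adding the helm.sh/hook annotation to a K8s resource. Helm creates such resources at the corresponding point in the release lifecycle; for Job and Pod hooks it waits for them to run to completion before continuing.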
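
As a minimal sketch of the declaration described above, the following template marks a Job as a post-install Hook. The file name, the Job name suffix, the image, and the echoed message are illustrative assumptions, not taken from any real chart.

```yaml
# templates/hooks/post-install-notice.yaml (hypothetical file name)
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-post-install-notice"
  annotations:
    # This annotation is what turns an ordinary resource into a Hook.
    "helm.sh/hook": post-install
spec:
  backoffLimit: 0                 # do not retry; a failure fails the Hook
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: notice
        image: busybox:1.35
        command: ["sh", "-c", "echo 'release {{ .Release.Name }} installed'"]
```

Without the annotation, the same Job would simply be installed together with the chart's other resources instead of running at the post-install point.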

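Hook value      When it runs                                                      Typical uses
pre-install     After templates are rendered, before any resources are created    Initialize a database, create certificates
post-install    After all resources have been created                             Send deployment notifications, run tests
pre-upgrade     Before any resources are updated                                  Database migrations, data backups
post-upgrade    After the upgrade completes                                       Clean up old data, send notifications
pre-delete      Before any resources are deleted                                  Back up data, export configuration
post-delete     After all resources have been deleted                             Clean up external resources (e.g. DNS)
pre-rollback    Before the rollback                                               Roll back database migrations
post-rollback   After the rollback completes                                      Send rollback notifications
test            When helm test is invoked                                         Smoke tests, health checks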
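
helm upgrade execution flow (with Hooks)

  ① Render templates
       ↓
  ② Run pre-upgrade Hooks    ← waits for Job/Pod completion
       ↓
  ③ Update K8s resources
       ↓
  ④ Wait for resources to become Ready (only with --wait)
       ↓
  ⑤ Run post-upgrade Hooks   ← waits for Job/Pod completion
       ↓
  ⑥ Upgrade succeeds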

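Hook Weight: Controlling Execution Order

When several Hooks are attached to the same point in the lifecycle, helm.sh/hook-weight controls their order. Hooks run in ascending weight order; negative values are allowed, and the weight must be written as a string.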

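# Hook weight example: back up the database first, then run the migration

# backup-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-backup"
  annotations:
    helm.sh/hook: pre-upgrade
    helm.sh/hook-weight: "-10"     # lower weights run first
    helm.sh/hook-delete-policy: before-hook-creation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: backup
        image: postgres:15
        # $(DB_HOST) is expanded by Kubernetes from the env entry below.
        # Authentication and a volume mounted at /backup are omitted here.
        command: ["pg_dump", "-h", "$(DB_HOST)", "-U", "postgres", "-f", "/backup/dump.sql"]
        env:
        - name: DB_HOST
          value: {{ .Values.database.host | quote }}

# migrate-job.yaml (annotations only; weight 0 runs after the backup)
annotations:
  helm.sh/hook: pre-upgrade
  helm.sh/hook-weight: "0"       # runs after the Hook with weight -10
  helm.sh/hook-delete-policy: before-hook-creation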

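Hook Deletion Policies

before-hook-creation
(Recommended, and the default when no policy is set.) Deletes an existing Hook resource with the same name before the new one is created. This guarantees that every Hook run starts from a fresh Job and avoids "Job already exists" errors.
hook-succeeded
Deletes the resource after the Hook completes successfully. Suitable when the logs do not need to be kept.
hook-failed
Deletes the resource when the Hook fails. Rarely used, because Pod logs are usually needed to troubleshoot a failure.
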
# Combining several deletion policies (comma-separated)
annotations:
  helm.sh/hook: pre-upgrade,pre-install      # one resource can be attached to several Hook points
  helm.sh/hook-weight: "5"
  helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded

In Practice: Database Migration Job (pre-upgrade Hook)

# templates/hooks/db-migrate.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-db-migrate"
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/component: db-migrate
    spec:
      restartPolicy: Never
      initContainers:
      - name: wait-for-db             # wait until the database accepts connections
        image: busybox:1.35
        command:
        - sh
        - -c
        - |
            until nc -z "$DB_HOST" "$DB_PORT"; do
              echo "Waiting for database..."; sleep 2
            done
            echo "Database is ready"
        env:
        - name: DB_HOST
          value: {{ .Values.database.host | quote }}
        - name: DB_PORT
          value: {{ .Values.database.port | quote }}
      containers:
      - name: migrate
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
        command: ["python", "manage.py", "migrate"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              # On pre-install this Secret must already exist: Hooks run before
              # the chart's regular resources are created.
              name: {{ include "myapp.fullname" . }}-secrets
              key: database-url
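
One caveat with the Job above: pre-install Hooks run before the chart's regular resources are created, so a Secret that is an ordinary template in the same chart does not exist yet on first install. One possible way around this, sketched here under assumptions, is to ship the Secret as a Hook with a lower weight than the Job. The file name, the -migrate-secrets suffix, and the .Values.database.url key are hypothetical; the Job's secretKeyRef would have to point at this name.

```yaml
# templates/hooks/db-migrate-secret.yaml (hypothetical file name)
apiVersion: v1
kind: Secret
metadata:
  # Distinct name so it does not collide with the chart's regular Secret.
  name: "{{ include "myapp.fullname" . }}-migrate-secrets"
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-5"          # lower than the Job's "0", so it is created first
    "helm.sh/hook-delete-policy": before-hook-creation
type: Opaque
stringData:
  # .Values.database.url is an assumed values key, not defined in this chapter.
  database-url: {{ .Values.database.url | quote }}
```

Hook resources are not managed as part of the release, so with this deletion policy the Secret stays in the cluster after helm uninstall and has to be cleaned up separately. Creating the Secret outside the chart is an equally valid choice.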

In Practice: Smoke Test (test Hook)

# templates/tests/test-connection.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "myapp.fullname" . }}-test-connection"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  restartPolicy: Never
  containers:
  - name: wget
    image: busybox:1.35
    command:
    - wget
    args:
    - "--spider"
    - "-T"
    - "10"
    - "http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health"

# Run the tests (helm test command)
helm test myapp

# Show the logs of the test Pods
helm test myapp --logs

# Run the tests with a timeout
helm test myapp --timeout 5m

# In CI/CD: test right after installing
helm upgrade --install myapp ./chart -f values-ci.yaml \
  --wait --timeout 5m
helm test myapp --timeout 3m
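
A Failed Hook Fails the Release

If a pre-install/pre-upgrade Hook fails, Helm marks the install/upgrade as failed and does not go on to deploy the main resources. With helm upgrade --atomic, a failure automatically rolls back to the previous revision. Because a failed release is usually retried, the same Hook may run more than once against the same data, so the operations in a Hook must be idempotent (safe to run repeatedly).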
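
Since a failed Hook fails the whole release, it can help to bound how long a Hook Job may run and how often it retries. The fragment below is a hedged sketch using the standard Job fields backoffLimit and activeDeadlineSeconds; the numbers are illustrative assumptions rather than recommendations, and the container is the migrate container from the example earlier in this chapter.

```yaml
# Fragment of a Hook Job spec; metadata and annotations are omitted.
spec:
  backoffLimit: 1               # retry a failed Pod at most once
  activeDeadlineSeconds: 240    # Kubernetes marks the Job failed after 4 minutes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
        command: ["python", "manage.py", "migrate"]
```

Keeping activeDeadlineSeconds below the --timeout passed to helm (5m in the CI example) lets the Job report its own failure before Helm gives up waiting, which tends to make the cause easier to find in the Job status.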

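Chapter Summary

Helm Hooks implement lifecycle callbacks by setting helm.sh/hook in a resource's annotations. The pre-upgrade Hook is the best place to run database migrations, because schema changes finish before the new application version is deployed. Hook weights control the order of multiple Hooks at the same point, and the before-hook-creation deletion policy is the most commonly used safe choice. The test Hook, together with the helm test command, provides post-deployment smoke tests and is an important part of a CI/CD pipeline.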