
INVESTIGATE: Version Pinning for Helm Charts and Container Images

IMPLEMENTATION RULES: Before implementing this plan, read and follow:

Created: 2026-02-27 Status: Backlog

Problem Statement

Everything works today, but 16 of 21 Helm charts and several container images have no version pinning. Any upstream release, intentional or accidental, can break the system without warning. A single ./uis deploy could pull a new chart version with breaking changes.


Current State

Helm Charts — Version Pinning Status

| Service | Chart | Version | Status |
| --- | --- | --- | --- |
| argocd | argo/argo-cd | 7.8.26 | PINNED |
| gravitee | graviteeio/apim | 4.8.4 | PINNED |
| authentik | authentik/authentik | 2025.8.1 | PINNED |
| prometheus | prometheus-community/prometheus | | UNPINNED |
| tempo | grafana/tempo | | UNPINNED |
| loki | grafana/loki | | UNPINNED |
| otel-collector | open-telemetry/opentelemetry-collector | | UNPINNED |
| grafana | grafana/grafana | | UNPINNED |
| postgresql | bitnami/postgresql | | UNPINNED |
| redis | bitnami/redis | | UNPINNED |
| rabbitmq | bitnami/rabbitmq | | UNPINNED |
| elasticsearch | elastic/elasticsearch | 9.3.0 | PINNED |
| qdrant | qdrant/qdrant | | UNPINNED |
| tika | tika/tika | | UNPINNED |
| open-webui | open-webui/open-webui | | UNPINNED |
| litellm | oci://ghcr.io/berriai/litellm-helm | | UNPINNED |
| spark | spark-kubernetes-operator/spark-kubernetes-operator | | UNPINNED |
| jupyterhub | jupyterhub/jupyterhub | | UNPINNED |
| pgadmin | runix/pgadmin4 | | UNPINNED |
| redisinsight | redisinsight/redisinsight | | UNPINNED |
| openmetadata | open-metadata/openmetadata | 1.12.1 | PINNED |
| mysql | (manifest, no helm) | | N/A |

Summary: 5 pinned, 16 unpinned out of 21 Helm charts (mysql is deployed from a manifest and is not counted).

Container Images — Version Pinning Status

Images explicitly set in manifests or config files:

| Service | Image | Tag | Status |
| --- | --- | --- | --- |
| whoami | traefik/whoami | v1.10.2 | PINNED |
| mongodb | mongo | 8.0.5 | PINNED |
| rabbitmq | bitnamilegacy/rabbitmq | 3.13.7-debian-12-r5 | PINNED |
| tika | apache/tika | 3.0.0.0 | PINNED |
| elasticsearch | docker.elastic.co/elasticsearch/elasticsearch | 9.3.0 | PINNED |
| openmetadata | docker.getcollate.io/openmetadata/server | 1.12.1 | PINNED |
| redis | redis | 7.4 | FLOATING (minor) |
| mysql | mysql | 8.0 | FLOATING (minor) |
| postgresql | ghcr.io/terchris/urbalurba-postgresql | latest | UNPINNED |
| unity-catalog | unitycatalog/unitycatalog | latest | UNPINNED |
| cloudflare-tunnel | cloudflare/cloudflared | latest | UNPINNED |
| pgadmin init | busybox | latest | UNPINNED |
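
The Status column above appears to follow a simple rule based on the shape of the tag. A minimal sketch of that rule, assuming three or more numeric components count as PINNED, exactly two as FLOATING (minor), and anything else (including `latest` or a missing tag) as UNPINNED. The function name `classify_tag` is hypothetical, and the rule is inferred from the table rather than taken from an existing script:

```python
import re


def classify_tag(tag: str) -> str:
    """Classify an image tag as PINNED, FLOATING (minor), or UNPINNED.

    Assumed rule, inferred from the status table:
      - major.minor.patch (optionally with more components or a suffix) is PINNED
      - major.minor alone floats across patch releases: FLOATING (minor)
      - everything else (latest, empty, named tags) is UNPINNED
    """
    # Accept an optional leading "v", as in "v1.10.2".
    version = tag[1:] if tag.startswith("v") else tag
    if re.match(r"\d+\.\d+\.\d+", version):
        return "PINNED"
    if re.fullmatch(r"\d+\.\d+", version):
        return "FLOATING (minor)"
    # An empty tag is treated like "latest", which is what an untagged
    # image reference resolves to.
    return "UNPINNED"


# Tags copied from the table above.
for tag in ["v1.10.2", "3.13.7-debian-12-r5", "3.0.0.0", "7.4", "8.0", "latest"]:
    print(f"{tag:22} {classify_tag(tag)}")
```

Note that even a full version tag can be re-pushed by the publisher; pinning by image digest would be stricter, but that is outside what the table tracks.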

Images controlled by Helm chart (not explicitly set in our config — chart decides):

  • prometheus, grafana, tempo, loki, otel-collector, qdrant, open-webui, litellm, spark, jupyterhub, pgadmin, redisinsight, authentik, argocd

Questions to Investigate

Q1: What is the right pinning strategy?

Options:

  • Pin everything — maximum stability, requires manual updates
  • Pin Helm charts only — charts control image versions, so pinning charts is sufficient
  • Pin charts + explicit images — pin what we control, let pinned charts manage their own images

Q2: Where should versions live?

Options:

  • In each playbook: chart_version parameter in ansible helm tasks (current pattern for argocd/gravitee/authentik)
  • In a central versions file — single file listing all versions, sourced by playbooks
  • In config manifests — alongside other service config in manifests/*-config.yaml
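
To make the "central versions file" option more concrete, here is a rough sketch of what a lookup against such a file could look like. The file layout, the key names, and the helper names `load_chart_versions` and `chart_version` are all assumptions for illustration. JSON is used only so the example runs with the standard library; a YAML vars file would probably be the more natural fit for Ansible playbooks. The versions shown are the ones already pinned in the chart table.

```python
import json

# Hypothetical content of a central versions file. The layout is an assumption.
VERSIONS_JSON = """
{
  "charts": {
    "argocd":     {"chart": "argo/argo-cd",                    "version": "7.8.26"},
    "gravitee":   {"chart": "graviteeio/apim",                 "version": "4.8.4"},
    "authentik":  {"chart": "authentik/authentik",             "version": "2025.8.1"},
    "prometheus": {"chart": "prometheus-community/prometheus", "version": null}
  }
}
"""


def load_chart_versions(text: str) -> dict:
    """Parse the versions document and return its 'charts' mapping."""
    return json.loads(text)["charts"]


def chart_version(charts: dict, service: str) -> str:
    """Return the pinned chart version for a service.

    Fails loudly when the service is missing or has no version, so an
    unpinned chart cannot be deployed by accident.
    """
    entry = charts.get(service)
    if entry is None:
        raise KeyError(f"{service}: not listed in the versions file")
    version = entry.get("version")
    if not version:
        raise ValueError(f"{service}: chart {entry['chart']} has no pinned version")
    return version


charts = load_chart_versions(VERSIONS_JSON)
print(chart_version(charts, "argocd"))
```

The design choice worth noting is the failure mode: with a single file, a missing version can be turned into a hard error in one place instead of being silently resolved to the newest chart.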

Q3: How do we handle updates?

Options:

  • Manual — developer checks for updates periodically, updates versions, tests
  • Automated detection — script/CI that checks for newer versions and reports
  • Dependabot/Renovate — GitHub-native dependency update PRs
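
For the "automated detection" option, a script could compare each pinned version against what the chart repository currently offers. The sketch below assumes the version list comes from `helm search repo <chart> --versions --output json` and that each entry carries `name` and `version` fields; verify that against the installed Helm version before relying on it. The function names are hypothetical, and the sample data is made up for illustration, not real release data.

```python
import json
import re
import subprocess


def parse_version(text: str) -> tuple:
    """Turn '7.8.26' into (7, 8, 26) so versions compare numerically.

    Anything after '-' or '+' (pre-release or build metadata) is ignored.
    """
    core = re.split(r"[-+]", text, maxsplit=1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", core))


def newer_versions(chart: str, pinned: str, helm_search_json: str) -> list:
    """Return stable versions of `chart` newer than `pinned`, newest first."""
    entries = json.loads(helm_search_json)
    candidates = {
        e["version"]
        for e in entries
        # helm search matches by keyword, so keep exact chart names only,
        # and skip pre-releases such as '7.9.1-rc1'.
        if e["name"] == chart and "-" not in e["version"]
    }
    pinned_key = parse_version(pinned)
    newer = [v for v in candidates if parse_version(v) > pinned_key]
    return sorted(newer, key=parse_version, reverse=True)


def fetch_chart_versions(chart: str) -> str:
    """Query the local helm CLI (requires helm and an up-to-date repo)."""
    result = subprocess.run(
        ["helm", "search", "repo", chart, "--versions", "--output", "json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Illustrative sample only; these are not real release numbers.
SAMPLE = json.dumps([
    {"name": "argo/argo-cd", "version": "7.9.1-rc1"},
    {"name": "argo/argo-cd", "version": "7.9.0"},
    {"name": "argo/argo-cd", "version": "7.8.28"},
    {"name": "argo/argo-cd", "version": "7.8.26"},
    {"name": "argo/argocd-apps", "version": "9.0.0"},
])
print(newer_versions("argo/argo-cd", "7.8.26", SAMPLE))
```

In real use the sample string would be replaced by `fetch_chart_versions("argo/argo-cd")`, and the result reported rather than applied automatically, which keeps the decision to upgrade with a developer.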

Q4: Helm repos — RESOLVED

05-install-helm-repos.yml was the original approach. The current pattern is that each playbook manages its own helm repo. The 2 repos still in 05-install-helm-repos.yml (bitnami, runix) are legacy — they should move into the playbooks that use them. No further investigation needed.

Q5: Bitnami subscription changes

Bitnami changed its distribution model (Aug 2025). RabbitMQ already uses the bitnamilegacy image. Are the other Bitnami charts (postgresql, redis) affected? Will future updates break?


Helm Repos Inventory

| Repository | URL | Where Added |
| --- | --- | --- |
| bitnami | https://charts.bitnami.com/bitnami | 05-install-helm-repos.yml |
| runix | https://helm.runix.net | 05-install-helm-repos.yml |
| graviteeio | https://helm.gravitee.io | 090-setup-gravitee.yml |
| prometheus-community | https://prometheus-community.github.io/helm-charts | 030-setup-prometheus.yml |
| grafana | https://grafana.github.io/helm-charts | multiple playbooks |
| open-telemetry | https://open-telemetry.github.io/opentelemetry-helm-charts | 033-setup-otel-collector.yml |
| argo | https://argoproj.github.io/argo-helm | 220-setup-argocd.yml |
| elastic | https://helm.elastic.co | 060-setup-elasticsearch.yml |
| qdrant | https://qdrant.github.io/qdrant-helm | 044-setup-qdrant.yml |
| open-webui | https://helm.openwebui.com/ | 200-setup-open-webui.yml |
| jupyterhub | https://hub.jupyter.org/helm-chart/ | 350-setup-jupyterhub.yml |
| authentik | https://charts.goauthentik.io | 070-setup-authentik.yml |
| redisinsight | https://mrnim94.github.io/redisinsight/ | 651-adm-redisinsight.yml |
| spark-kubernetes-operator | https://apache.github.io/spark-kubernetes-operator | 330-setup-spark.yml |
| open-metadata | https://open-metadata.github.io/openmetadata-helm-charts/ | 340-setup-openmetadata.yml |

Risk Assessment

High risk (unpinned chart + critical service):

  • postgresql (all data services depend on it)
  • redis (authentik depends on it)
  • elasticsearch (critical service; chart already pinned at 9.3.0)

Medium risk (unpinned chart + important service):

  • grafana, prometheus, loki, tempo, otel-collector (observability stack)
  • open-webui, litellm (AI stack)
  • jupyterhub, spark (data science stack)

Lower risk (unpinned chart + admin/utility):

  • pgadmin, redisinsight, qdrant, tika

:latest images (highest breakage risk):

  • postgresql (custom image — we control this)
  • unity-catalog
  • cloudflare-tunnel
  • busybox (pgadmin init container)

Next Step

Investigate the questions above, then create a PLAN with a phased approach to pin versions across all services.