INVESTIGATE: Version Pinning for Helm Charts and Container Images
IMPLEMENTATION RULES: Before implementing this plan, read and follow:
- WORKFLOW.md - The implementation process
- PLANS.md - Plan structure and best practices
Created: 2026-02-27 Status: Backlog
Problem Statement
Everything works today, but 18 of 21 Helm charts and several container images have no version pinning. Any upstream release — intentional or accidental — can break the system without warning. A single ./uis deploy could pull a new chart version with breaking changes.
Current State
Helm Charts — Version Pinning Status
| Service | Chart | Version | Status |
|---|---|---|---|
| argocd | argo/argo-cd | 7.8.26 | PINNED |
| gravitee | graviteeio/apim | 4.8.4 | PINNED |
| authentik | authentik/authentik | 2025.8.1 | PINNED |
| prometheus | prometheus-community/prometheus | — | UNPINNED |
| tempo | grafana/tempo | — | UNPINNED |
| loki | grafana/loki | — | UNPINNED |
| otel-collector | open-telemetry/opentelemetry-collector | — | UNPINNED |
| grafana | grafana/grafana | — | UNPINNED |
| postgresql | bitnami/postgresql | — | UNPINNED |
| redis | bitnami/redis | — | UNPINNED |
| rabbitmq | bitnami/rabbitmq | — | UNPINNED |
| elasticsearch | elastic/elasticsearch | 9.3.0 | PINNED |
| qdrant | qdrant/qdrant | — | UNPINNED |
| tika | tika/tika | — | UNPINNED |
| open-webui | open-webui/open-webui | — | UNPINNED |
| litellm | oci://ghcr.io/berriai/litellm-helm | — | UNPINNED |
| spark | spark-kubernetes-operator/spark-kubernetes-operator | — | UNPINNED |
| jupyterhub | jupyterhub/jupyterhub | — | UNPINNED |
| pgadmin | runix/pgadmin4 | — | UNPINNED |
| redisinsight | redisinsight/redisinsight | — | UNPINNED |
| openmetadata | open-metadata/openmetadata | 1.12.1 | PINNED |
| mysql | (manifest, no helm) | — | N/A |
Summary: 5 pinned, 17 unpinned out of 22 Helm charts.
Container Images — Version Pinning Status
Images explicitly set in manifests or config files:
| Service | Image | Tag | Status |
|---|---|---|---|
| whoami | traefik/whoami | v1.10.2 | PINNED |
| mongodb | mongo | 8.0.5 | PINNED |
| rabbitmq | bitnamilegacy/rabbitmq | 3.13.7-debian-12-r5 | PINNED |
| tika | apache/tika | 3.0.0.0 | PINNED |
| elasticsearch | docker.elastic.co/elasticsearch/elasticsearch | 9.3.0 | PINNED |
| openmetadata | docker.getcollate.io/openmetadata/server | 1.12.1 | PINNED |
| redis | redis | 7.4 | FLOATING (minor) |
| mysql | mysql | 8.0 | FLOATING (minor) |
| postgresql | ghcr.io/terchris/urbalurba-postgresql | latest | UNPINNED |
| unity-catalog | unitycatalog/unitycatalog | latest | UNPINNED |
| cloudflare-tunnel | cloudflare/cloudflared | latest | UNPINNED |
| pgadmin init | busybox | latest | UNPINNED |
Images controlled by Helm chart (not explicitly set in our config — chart decides):
- prometheus, grafana, tempo, loki, otel-collector, qdrant, open-webui, litellm, spark, jupyterhub, pgadmin, redisinsight, authentik, argocd
Questions to Investigate
Q1: What is the right pinning strategy?
Options:
- Pin everything — maximum stability, requires manual updates
- Pin Helm charts only — charts control image versions, so pinning charts is sufficient
- Pin charts + explicit images — pin what we control, let pinned charts manage their own images
Q2: Where should versions live?
Options:
- In each playbook —
chart_versionparameter in ansible helm tasks (current pattern for argocd/gravitee/authentik) - In a central versions file — single file listing all versions, sourced by playbooks
- In config manifests — alongside other service config in
manifests/*-config.yaml
Q3: How do we handle updates?
Options:
- Manual — developer checks for updates periodically, updates versions, tests
- Automated detection — script/CI that checks for newer versions and reports
- Dependabot/Renovate — GitHub-native dependency update PRs
Q4: Helm repos — RESOLVED
05-install-helm-repos.yml was the original approach. The current pattern is that each playbook manages its own helm repo. The 2 repos still in 05-install-helm-repos.yml (bitnami, runix) are legacy — they should move into the playbooks that use them. No further investigation needed.
Q5: Bitnami subscription changes
Bitnami changed their distribution model (Aug 2025). RabbitMQ already uses bitnamilegacy image. Are other Bitnami charts affected? Will future updates break?
Helm Repos Inventory
| Repository | URL | Where Added |
|---|---|---|
| bitnami | https://charts.bitnami.com/bitnami | 05-install-helm-repos.yml |
| runix | https://helm.runix.net | 05-install-helm-repos.yml |
| graviteeio | https://helm.gravitee.io | 090-setup-gravitee.yml |
| prometheus-community | https://prometheus-community.github.io/helm-charts | 030-setup-prometheus.yml |
| grafana | https://grafana.github.io/helm-charts | multiple playbooks |
| open-telemetry | https://open-telemetry.github.io/opentelemetry-helm-charts | 033-setup-otel-collector.yml |
| argo | https://argoproj.github.io/argo-helm | 220-setup-argocd.yml |
| elastic | https://helm.elastic.co | 060-setup-elasticsearch.yml |
| qdrant | https://qdrant.github.io/qdrant-helm | 044-setup-qdrant.yml |
| open-webui | https://helm.openwebui.com/ | 200-setup-open-webui.yml |
| jupyterhub | https://hub.jupyter.org/helm-chart/ | 350-setup-jupyterhub.yml |
| authentik | https://charts.goauthentik.io | 070-setup-authentik.yml |
| redisinsight | https://mrnim94.github.io/redisinsight/ | 651-adm-redisinsight.yml |
| spark-kubernetes-operator | https://apache.github.io/spark-kubernetes-operator | 330-setup-spark.yml |
| open-metadata | https://open-metadata.github.io/openmetadata-helm-charts/ | 340-setup-openmetadata.yml |
Risk Assessment
High risk (unpinned chart + critical service):
- postgresql (all data services depend on it)
- redis (authentik depends on it)
- elasticsearch
Medium risk (unpinned chart + important service):
- grafana, prometheus, loki, tempo, otel-collector (observability stack)
- open-webui, litellm (AI stack)
- jupyterhub, spark (data science stack)
Lower risk (unpinned chart + admin/utility):
- pgadmin, redisinsight, qdrant, tika
:latest images (highest breakage risk):
- postgresql (custom image — we control this)
- unity-catalog
- cloudflare-tunnel
- busybox (pgadmin init container)
Next Step
Investigate the questions above, then create a PLAN with a phased approach to pin versions across all services.