PostgreSQL Container for Urbalurba Infrastructure
File: docs/package-postgresql-container.md
Purpose: Complete documentation for custom PostgreSQL container with AI/ML extensions and CI/CD automation
Target Audience: Infrastructure engineers, DevOps teams, and developers working with PostgreSQL extensions
Last Updated: September 22, 2024
๐ Overviewโ
This document covers the custom PostgreSQL container (containers/postgresql/) that extends the official Bitnami PostgreSQL image with additional extensions required for the Urbalurba platform. This custom container provides enhanced functionality for modern applications requiring vector search, geospatial data, and hierarchical data structures.
The container is automatically built and published via GitHub Actions CI/CD to GitHub Container Registry with full multi-architecture support.
Why Custom Container?โ
The standard Bitnami PostgreSQL image doesn't include certain extensions that are crucial for modern AI and data-intensive applications:
- pgvector: Essential for vector search and AI embeddings
- PostGIS: Required for geospatial data types and queries
- hstore: Enables key-value storage within PostgreSQL columns
- ltree: Supports hierarchical/tree-like data structures
Extensions Includedโ
| Extension | Version | Purpose |
|---|---|---|
| pgvector | Latest | Vector similarity search and AI embeddings |
| PostGIS | 3.x | Geospatial data types, functions, and indexing |
| hstore | Built-in | Key-value pairs in single values |
| ltree | Built-in | Hierarchical tree-like data |
| uuid-ossp | Built-in | UUID generation functions |
| pg_trgm | Built-in | Trigram matching for fuzzy text search |
| btree_gin | Built-in | Additional indexing methods |
| pgcrypto | Built-in | Cryptographic functions |
Container Detailsโ
- Base Image:
bitnami/postgresql:16 - Registry:
ghcr.io/terchris/urbalurba-postgresql - Architectures:
linux/amd64,linux/arm64 - Security: Runs as non-root user (UID 1001)
- Optimization: Multi-stage build for minimal size
- Package Source: PostgreSQL official repository (PGDG) for latest packages
- Build Dependencies: wget, ca-certificates, gnupg, lsb-release (cleaned up after build)
Development Workflowโ
Optimal Multi-Architecture Developmentโ
Take advantage of your local hardware for comprehensive testing:
# 1. Local Development & Testing (Your Mac = Native ARM64)
cd containers/postgresql
./build.sh --single-arch
# โ
Native ARM64 build and testing
# โ
All 8 extensions validated
# โ
Performance testing without emulation
# 2. Commit and Push Changes
git add .
git commit -m "PostgreSQL container improvements"
git push
# 3. CI/CD Automatically Handles:
# โ
AMD64: Full functional testing (GitHub Actions)
# โ
ARM64: Build verification (GitHub Actions)
# โ
Multi-arch: Registry publishing (GitHub Actions)
# โ
Security: Vulnerability scanning (GitHub Actions)
# 4. Result: Both architectures fully validated!
Why This Works Perfectlyโ
- Your Mac: Native ARM64 testing (real performance, all features)
- GitHub Actions: Native AMD64 testing (most common deployment)
- Combined: Complete multi-architecture confidence
- No Emulation: Fast, reliable, production-representative testing
Local Developmentโ
Prerequisitesโ
- Docker Desktop with BuildKit enabled
- For multi-arch builds:
docker buildx create --use - Internet access for PGDG repository during build
- Sufficient disk space for multi-stage build (PostgreSQL 16 + pgvector images)
Build Locallyโ
cd containers/postgresql
# Build single architecture (recommended for local testing)
./build.sh --single-arch
# On Apple Silicon Macs: This builds and tests ARM64 natively!
# On Intel/AMD64: This builds and tests AMD64 natively!
# Build multi-architecture (requires push to registry)
./build.sh --push
# Build specific version
./build.sh --version v1.2.0 --push
# Show all options
./build.sh --help
Build Optionsโ
--single-arch: Build for current architecture only (faster, good for local testing)--push: Push to registry (required for multi-arch builds)--version VERSION: Set custom version tag--platform PLATFORMS: Set target platforms (default: linux/amd64,linux/arm64)--help: Show detailed usage information
Test Locallyโ
# Run the container
docker run -d --name postgres-test \
-e POSTGRESQL_PASSWORD=testpass123 \
-e POSTGRESQL_POSTGRES_PASSWORD=testpass123 \
-e POSTGRESQL_DATABASE=testdb \
-p 5432:5432 \
ghcr.io/terchris/urbalurba-postgresql:latest
# Wait for PostgreSQL to be ready
docker exec postgres-test pg_isready -U postgres
# Test all extensions (with authentication)
docker exec -e PGPASSWORD=testpass123 postgres-test psql -U postgres -d testdb -c "
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS ltree;
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\";
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
"
# Verify extensions are installed
docker exec -e PGPASSWORD=testpass123 postgres-test psql -U postgres -d testdb -c \
"SELECT extname FROM pg_extension WHERE extname NOT IN ('plpgsql') ORDER BY extname;"
# Cleanup
docker stop postgres-test && docker rm postgres-test
Automated Testingโ
The build script includes comprehensive automated testing:
- Verifies all 8 extensions install correctly
- Tests basic functionality of each extension
- Validates vector operations, geospatial queries, and hierarchical data
- Automatically cleans up test containers
Integration with Urbalurba Infrastructureโ
In Kubernetes Manifestsโ
Update your manifests/042-database-postgresql-config.yaml:
spec:
template:
spec:
containers:
- name: postgresql
image: ghcr.io/terchris/urbalurba-postgresql:latest
In Ansible Playbooksโ
Reference in ansible/playbooks/040-database-postgresql.yml:
- name: Deploy custom PostgreSQL
kubernetes.core.k8s:
definition:
spec:
template:
spec:
containers:
- image: ghcr.io/terchris/urbalurba-postgresql:latest
๐ CI/CD Pipeline with GitHub Actionsโ
The container is automatically built and published via GitHub Actions using the workflow at .github/workflows/build-postgresql-container.yml. This provides comprehensive automation for building, testing, and publishing the PostgreSQL container.
Workflow Overviewโ
File: .github/workflows/build-postgresql-container.yml
Purpose: Automated multi-architecture container builds with security scanning and testing
Workflow Triggersโ
The workflow runs automatically on:
- Push to main - When PostgreSQL container files change (
containers/postgresql/**) - Pull requests - For testing changes before merge
- Release tags - For versioned releases
- Manual dispatch - With custom version parameters
Multi-Job Architectureโ
1. Build Job ๐โ
- Platform: Ubuntu (GitHub Actions)
- Duration: ~15 minutes
- Architectures:
linux/amd64,linux/arm64 - Registry: GitHub Container Registry (
ghcr.io)
Key Features:
- Multi-architecture builds using Docker Buildx
- Automatic tagging based on trigger type
- BuildKit caching for faster subsequent builds
- Metadata generation with OCI labels
Generated Tags:
latest- Main branch pushesv1.0.0- Release tagspr-123- Pull request buildsmain-abc1234-20241201- SHA-based unique tags
2. Security Scan Job ๐ก๏ธโ
- Platform: Ubuntu (GitHub Actions)
- Duration: ~5 minutes
- Tool: Trivy vulnerability scanner
- Scope: CRITICAL and HIGH severity issues
Security Features:
- SARIF report generation for GitHub Security tab
- Human-readable vulnerability summaries
- Automatic upload to GitHub Security dashboard
- Failure tolerance (workflow continues even with vulnerabilities)
3. Test Job ๐งชโ
- Platform: Ubuntu (GitHub Actions)
- Duration: ~10 minutes
- Architecture: AMD64 (native testing)
- Extensions Tested: All 8 extensions validated
Testing Strategy:
- AMD64: Full functional testing on native GitHub runners
- PostgreSQL Client: Uses official PostgreSQL client tools
- Extension Validation: Creates and tests all extensions
- Functional Tests: Vector operations, geospatial queries, hierarchical data
- Health Checks: Built-in container health monitoring
Why AMD64 Only for Testing:
- GitHub Actions runners are AMD64-only (no native ARM64)
- QEMU emulation for ARM64 testing is unreliable for complex containers
- ARM64 build verification happens separately (see ARM64 Verification job)
4. ARM64 Verification Job ๐โ
- Platform: Ubuntu (GitHub Actions)
- Duration: ~3 minutes
- Purpose: Verify ARM64 images build and can be pulled
Verification Process:
- Manifest inspection for ARM64 variant
- Image pull test for ARM64 architecture
- Basic image metadata validation
- No emulated runtime testing (by design for reliability)
Workflow Permissionsโ
permissions:
contents: read # Read repository contents
packages: write # Push to GitHub Container Registry
security-events: write # Upload security scan results
Environment Variablesโ
env:
REGISTRY: ghcr.io
IMAGE_NAME: ghcr.io/${{ github.repository_owner }}/urbalurba-postgresql
Manual Workflow Dispatchโ
The workflow supports manual triggering with custom parameters:
# Via GitHub UI or gh CLI:
gh workflow run build-postgresql-container.yml \
-f version=v1.3.0 \
-f push_image=true
Parameters:
version: Custom container version tagpush_image: Whether to push to registry (default: true)
Workflow Outputsโ
Each workflow run generates:
- Build artifacts: Multi-architecture container images
- Security reports: Vulnerability scan results in GitHub Security tab
- Test results: Extension functionality validation
- Workflow summary: Human-readable results with tags, digests, and test status
Registry Integrationโ
- Registry: GitHub Container Registry (
ghcr.io) - Authentication: Automatic via
GITHUB_TOKEN - Visibility: Public (linked to repository)
- Retention: Follows GitHub's container retention policies
CI/CD Best Practices Implementedโ
-
Security First:
- No secrets in code or logs
- Vulnerability scanning on every build
- Read-only permissions where possible
-
Multi-Architecture Support:
- Native AMD64 testing for reliability
- ARM64 build verification for compatibility
- Manifest inspection for architecture validation
-
Caching Strategy:
- GitHub Actions cache for Docker builds
- Layer caching for faster subsequent builds
- Maximum cache mode for optimal performance
-
Error Handling:
- Retry logic for container startup
- Comprehensive debugging output
- Graceful failure handling
-
Observability:
- Detailed workflow summaries
- Test result reporting
- Build artifact tracking
Image Tagsโ
ghcr.io/terchris/urbalurba-postgresql:latest- Latest stable buildghcr.io/terchris/urbalurba-postgresql:v1.0.0- Specific versionghcr.io/terchris/urbalurba-postgresql:pr-123- Pull request builds
Multi-Architecture Supportโ
CI/CD Testing Strategyโ
Our testing approach balances reliability with multi-architecture support:
AMD64 Testing (Full):
- โ Native testing on GitHub Actions runners
- โ Complete functional tests with all 8 extensions
- โ Performance validation and integration testing
ARM64 Verification (Build-Only):
- โ Multi-architecture build verification
- โ Image availability and pull testing
- โ Manifest inspection and architecture validation
- โ ๏ธ No emulated runtime testing (avoided due to QEMU reliability issues)
Why This Approach?โ
GitHub Actions Limitation: Standard GitHub-hosted runners are AMD64-only. ARM64 container testing requires QEMU emulation, which:
- Is significantly slower (5-10x overhead)
- Has reliability issues with complex containers
- Can produce false failures due to timing/emulation problems
- Doesn't represent real ARM64 performance
Production Confidence:
- AMD64 gets full testing (most common deployment target)
- ARM64 build process is verified (image exists and is pullable)
- Multi-architecture manifest is validated
- Production ARM64 deployments can be validated separately
ARM64 Testing (Apple Silicon Mac)โ
If you're on Apple Silicon (M1/M2/M3), you can test ARM64 natively:
# Native ARM64 testing on Apple Silicon
./build.sh --single-arch
# This provides:
# โ
Native ARM64 performance (no emulation)
# โ
Complete functional testing
# โ
All 8 extensions validated
# โ
Real-world ARM64 confidence
ARM64 Production Validationโ
For ARM64 production deployments, validate manually:
# On ARM64 hardware (Apple Silicon, AWS Graviton, etc.)
docker run --rm \
-e POSTGRESQL_PASSWORD=testpass \
-e POSTGRESQL_DATABASE=testdb \
ghcr.io/terchris/urbalurba-postgresql:latest \
psql -U postgres -d testdb -c "CREATE EXTENSION vector; SELECT version();"
Usage Examplesโ
Basic PostgreSQL with Extensionsโ
-- Connect to database
\c your_database
-- Create vector search table
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT,
embedding vector(1536),
metadata hstore
);
-- Create spatial data table
CREATE TABLE locations (
id SERIAL PRIMARY KEY,
name TEXT,
coordinates GEOMETRY(POINT, 4326)
);
-- Create hierarchical data
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
path ltree
);
Performance Indexesโ
-- Vector similarity index
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops);
-- Spatial index
CREATE INDEX ON locations USING gist(coordinates);
-- Hierarchical index
CREATE INDEX ON categories USING gist(path);
Troubleshootingโ
Common Issuesโ
- Extension not available: Verify the container is using the custom image
- Permission denied: Extensions require superuser privileges during installation
- Build failures: Check Docker BuildKit is enabled
- Package not found errors: The build now automatically adds PostgreSQL official repository (PGDG)
- Architecture testing: Use
./build.sh --single-archon your Mac for native ARM64 testing
Fixed Issues (v1.2.0+)โ
PostgreSQL 16 package availability:
- Problem:
postgresql-16-postgis-3andpostgresql-contrib-16packages not found in Debian repositories - Solution: Automatically adds PostgreSQL official repository (PGDG) during build
- Impact: All PostgreSQL 16 packages now available
Authentication in tests:
- Problem: psql commands failing with "no password supplied" error and container startup failing with "POSTGRESQL_PASSWORD environment variable is empty"
- Solution: Added both
POSTGRESQL_PASSWORDandPOSTGRESQL_POSTGRES_PASSWORDenvironment variables, plusPGPASSWORDfor test commands - Impact: Container startup and automated testing now work reliably
SQL syntax in extension verification:
- Problem:
ORDER BYclause error instring_agg()queries - Solution: Moved
ORDER BYinside the aggregate function - Impact: Extension verification queries now work correctly
Container cleanup conflicts:
- Problem: Test containers with same name causing build failures
- Solution: Added pre-test cleanup and robust error handling
- Impact: Builds can run repeatedly without manual cleanup
Debug Commandsโ
# Check running container
kubectl exec -it postgresql-pod -- psql -U postgres -c "SELECT extname, extversion FROM pg_extension;"
# Verify image
kubectl describe pod postgresql-pod | grep Image:
# Check logs
kubectl logs postgresql-pod
# Local debug - check extension files
docker exec postgres-test ls -la /opt/bitnami/postgresql/lib/vector.so
docker exec postgres-test ls -la /opt/bitnami/postgresql/share/extension/vector*
# Local debug - verify PGDG repository was added
docker exec postgres-test cat /etc/apt/sources.list.d/pgdg.list
Maintenanceโ
Updating Base Imageโ
- Update
FROM bitnami/postgresql:16to newer version in Dockerfile - Test locally with
./build.sh - Create pull request
- GitHub Actions will build and test
- Merge triggers automatic deployment
Adding New Extensionsโ
- Add installation commands to Dockerfile
- Update this README documentation
- Test locally
- Submit pull request
Securityโ
- Container runs as non-root user (1001:1001)
- Regular vulnerability scanning via Trivy
- Base image updated automatically via dependabot
- No secrets or credentials in container image
- Network isolation through Kubernetes policies
- PGDG Repository: Uses official PostgreSQL repository with verified GPG signatures
- Build dependencies: Temporary packages (wget, gnupg) removed after build to minimize attack surface
Performance Considerationsโ
- Optimized for development and medium-scale production
- Default settings suitable for 1-4GB RAM
- For high-performance workloads, tune PostgreSQL configuration via ConfigMaps
- Monitor with your existing monitoring stack
Supportโ
For issues related to:
- Container build: Check GitHub Actions logs
- Extension functionality: Refer to upstream documentation
- Urbalurba integration: See main infrastructure documentation
- Performance tuning: Consult PostgreSQL documentation
Version Historyโ
- v1.0.0: Initial release with pgvector, PostGIS, hstore, ltree
- v1.1.0: Added btree_gin and pgcrypto extensions
- v1.2.0: Major improvements and fixes:
- Updated to PostgreSQL 16.x base with PGDG repository
- Fixed package availability issues for PostgreSQL 16
- Enhanced build script with comprehensive testing
- Fixed authentication issues in automated tests
- Added robust container cleanup and error handling
- Improved CI/CD pipeline with realistic multi-architecture testing
- Optimized testing strategy: Native AMD64 (CI) + Native ARM64 (local Mac)
- Added detailed troubleshooting documentation
- Eliminated unreliable emulated testing for faster, more reliable builds