ADR-004: Terraform for Keycloak Configuration

Status: ✅ Adopted
Date: 2025-10
Deciders: DevOps Lead, Backend Team
Related: ADR-001, Infrastructure

Context

Our Keycloak setup relied on manual configuration followed by JSON realm imports. The workflow was:

  1. Manual Setup: Admin logs into Keycloak UI, clicks through settings
  2. JSON Export: Export realm configuration to realm-export.json
  3. Git Commit: Check JSON into version control
  4. Docker Import: Auto-import on container startup

Problems with JSON Imports

  1. Not Repeatable: JSON exports contain IDs that break on fresh imports
  2. Merge Conflicts: Large JSON files conflict on merge and are difficult to review in PRs
  3. No Validation: Typos are only discovered at runtime
  4. State Drift: Production config diverges from git over time
  5. Team Collaboration: Only one person is comfortable editing realm JSON
  6. Documentation: Changes are hidden in massive JSON diffs

Decision

Use Terraform to manage Keycloak configuration as Infrastructure as Code (IaC).

All Keycloak realm configuration (realm, roles, clients, service accounts) will be defined in Terraform and applied programmatically.

Rationale

Alternatives Considered

Option 1: Continue with JSON Imports

Pros:

  • No change needed
  • Works for current setup

Cons:

  • Manual exports after every change
  • Brittle ID references
  • Poor review experience
  • No validation until runtime

Verdict: ❌ Rejected - Doesn't scale with team growth

Option 2: Keycloak Admin CLI (kcadm)

Pros:

  • Official Keycloak tool
  • Scriptable

Cons:

  • Bash scripts hard to maintain
  • No state management
  • No plan/preview capability
  • Imperative (not declarative)

Verdict: ❌ Rejected - Too imperative, no state tracking

Option 3: Ansible

Pros:

  • Declarative configuration
  • Good for infrastructure automation

Cons:

  • Heavier than needed
  • Less precise state management than Terraform
  • Team not familiar with Ansible

Verdict: ❌ Rejected - Overkill for our needs

Option 4: Terraform (SELECTED)

Pros:

  • Declarative - Define desired state, Terraform figures out changes
  • Plan/Apply - Preview changes before applying
  • State Management - Tracks what's deployed
  • Version Control - Configuration is code
  • Validation - Catch errors before deployment
  • Documentation - Config files are self-documenting
  • Team Collaboration - Code review on PRs
  • Mature Provider - mrparkers/keycloak well-maintained
  • Consistent - Use Terraform for all infrastructure

Cons:

  • Initial setup time (~4 hours)
  • Team needs to learn Terraform basics
  • Provider updates required occasionally

Verdict: ✅ Selected

Consequences

Positive

  1. Reproducible Setup: Fresh environment in <5 minutes
  2. Code Review: Changes visible in readable .tf files
  3. Validation: Terraform validates before applying
  4. Documentation: Config files document setup
  5. Team Onboarding: New devs run one script: setup-local-keycloak.sh
  6. No Drift: Production matches git as long as every change goes through Terraform; terraform plan surfaces any manual edits
  7. Safe Changes: terraform plan previews impact

Negative

  1. Learning Curve: Team needs Terraform basics
  2. Provider Dependency: Relies on mrparkers/keycloak provider
  3. State Management: Need to manage terraform.tfstate file
  4. Breaking Changes: Provider updates may require config changes

Mitigations

  1. Documentation: Comprehensive guide in /docs/architecture/infrastructure
  2. Automation: setup-local-keycloak.sh handles everything
  3. Local State: Use local state for dev, remote (GCS) for prod
  4. Version Locking: Pin provider version in config

Implementation

Directory Structure

terraform/keycloak/
├── main.tf                     # Main configuration
├── variables.tf                # Input variables
├── outputs.tf                  # Output values
├── terraform.tfvars.local      # Local dev config
├── terraform.tfvars.production # Production config (not in git)
└── .terraform/                 # Provider cache (gitignored)
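
For illustration, terraform.tfvars.local might look like the sketch below. The variable names come from the var.* references in main.tf, and the values mirror the local URL and admin credentials printed by the setup script; the actual file contents are an assumption.

```hcl
# terraform/keycloak/terraform.tfvars.local (illustrative; local development only)
# This file is passed explicitly with -var-file, because Terraform only
# auto-loads terraform.tfvars and *.auto.tfvars.
keycloak_url            = "http://localhost:8081"
keycloak_admin_user     = "admin"
keycloak_admin_password = "admin"
```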

Configuration Example

hcl
# terraform/keycloak/main.tf

terraform {
  required_providers {
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "~> 4.3.0"
    }
  }
}

provider "keycloak" {
  client_id = "admin-cli"
  username  = var.keycloak_admin_user
  password  = var.keycloak_admin_password
  url       = var.keycloak_url
}

# Realm
resource "keycloak_realm" "noumaris" {
  realm        = "noumaris"
  enabled      = true
  display_name = "Noumaris"

  # Registration settings
  registration_allowed           = true
  registration_email_as_username = true
  reset_password_allowed         = true

  # Token settings
  access_token_lifespan    = "5m"
  sso_session_idle_timeout = "30m"
  sso_session_max_lifespan = "10h"
}

# Roles
resource "keycloak_role" "superadmin" {
  realm_id    = keycloak_realm.noumaris.id
  name        = "superadmin"
  description = "System-wide administrative access"
}

resource "keycloak_role" "institution_admin" {
  realm_id    = keycloak_realm.noumaris.id
  name        = "institution_admin"
  description = "Institution-level administrative access"
}

# Frontend Client
resource "keycloak_openid_client" "fastapi_frontend" {
  realm_id  = keycloak_realm.noumaris.id
  client_id = "fastapi-frontend"

  enabled                      = true
  access_type                  = "PUBLIC"
  standard_flow_enabled        = true
  implicit_flow_enabled        = false
  direct_access_grants_enabled = true

  valid_redirect_uris = [
    "http://localhost:5173/*",
    "https://app.noumaris.com/*"
  ]

  web_origins = ["+"]
}

# Admin Service Client
resource "keycloak_openid_client" "admin_service" {
  realm_id  = keycloak_realm.noumaris.id
  client_id = "noumaris-admin-service"

  enabled                  = true
  access_type              = "CONFIDENTIAL"
  service_accounts_enabled = true
  standard_flow_enabled    = false

  # The service account is intended to hold the superadmin role;
  # the role assignment itself is a separate resource.
}
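
The admin service client above is meant to have the superadmin role on its service account, but no resource in the example performs that assignment. A minimal sketch of one way to do it, using the provider's keycloak_openid_client_service_account_realm_role resource; the resource label admin_service_superadmin is hypothetical, and the project's real configuration may wire this differently.

```hcl
# Sketch: grant the realm-level superadmin role to the service account
# that Keycloak creates for the noumaris-admin-service client.
resource "keycloak_openid_client_service_account_realm_role" "admin_service_superadmin" {
  realm_id                = keycloak_realm.noumaris.id
  service_account_user_id = keycloak_openid_client.admin_service.service_account_user_id
  role                    = keycloak_role.superadmin.name
}
```

Because the resource references the client and the role by attribute, Terraform orders the creation automatically: realm, then role and client, then the assignment.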

Local Development Workflow

bash
# 1. Start Keycloak
docker-compose up -d keycloak

# 2. Run setup script (includes Terraform)
bash scripts/setup-local-keycloak.sh

# Script does:
# - Wait for Keycloak to be ready
# - cd terraform/keycloak/
# - terraform init
# - terraform plan
# - terraform apply -auto-approve

Setup Script

bash
#!/bin/bash
# scripts/setup-local-keycloak.sh

set -e

echo "🔧 Setting up local Keycloak with Terraform..."

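# Wait for Keycloak to be ready (give up after 30 seconds)
echo "⏳ Waiting for Keycloak to start..."
if ! timeout 30s bash -c 'until curl -sf http://localhost:8081/health > /dev/null; do sleep 1; done'; then
  echo "❌ Keycloak did not become healthy within 30 seconds" >&2
  exit 1
fi
echo "✅ Keycloak is ready"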

# Run Terraform
cd terraform/keycloak
echo "🚀 Initializing Terraform..."
terraform init

echo "📋 Planning Terraform changes..."
terraform plan -var-file=terraform.tfvars.local

echo "✨ Applying Terraform configuration..."
terraform apply -auto-approve -var-file=terraform.tfvars.local

echo "✅ Keycloak configuration complete!"
echo "🌐 Access Keycloak at: http://localhost:8081"
echo "👤 Admin credentials: admin / admin"

Production Deployment

bash
# Production uses GCS backend for state
cd terraform/keycloak

# Configure backend
cat > backend.tf <<EOF
terraform {
  backend "gcs" {
    bucket = "noumaris-terraform-state"
    prefix = "keycloak"
  }
}
EOF

# Initialize with backend
terraform init

# Plan and apply
terraform plan -var-file=terraform.tfvars.production
terraform apply -var-file=terraform.tfvars.production

Benefits Realized

Before Terraform

New developer onboarding:

  1. Start Docker Compose
  2. Log into Keycloak admin (admin/admin)
  3. Follow 20-step manual setup guide
  4. Create realm manually
  5. Create 4 roles manually
  6. Create 2 clients manually
  7. Configure redirect URIs
  8. Export realm JSON
  9. Total time: ~45 minutes, error-prone

After Terraform

New developer onboarding:

  1. Start Docker Compose
  2. Run bash scripts/setup-local-keycloak.sh
  3. Total time: ~2 minutes, fully automated

Production Changes

Before: Manual UI changes → hope you got it right → pray it doesn't break
After: Code review → terraform plan → review changes → terraform apply

Real-World Example

Scenario: Add New Role for "Senior Admin"

Before (Manual):

  1. Log into Keycloak admin
  2. Navigate to Roles
  3. Click "Add Role"
  4. Fill in name, description
  5. Save
  6. Update JSON export
  7. Commit to git
  8. Hope production gets updated the same way

After (Terraform):

  1. Edit main.tf:
hcl
resource "keycloak_role" "senior_admin" {
  realm_id    = keycloak_realm.noumaris.id
  name        = "senior_admin"
  description = "Senior institution admin with additional privileges"
}
  2. Run terraform plan - see exactly what will change
  3. Create PR - team reviews in GitHub
  4. Merge PR
  5. terraform apply runs against production (via CI/CD once the automation listed under Future Enhancements is in place)
  6. Production updated reliably
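
If the number of roles keeps growing, one option (not what the configuration in this ADR does) is to declare roles from a map, so that adding a role becomes a one-line diff. The sketch below assumes a hypothetical local value named realm_roles and a hypothetical resource label realm.

```hcl
locals {
  # Hypothetical map of role name => description
  realm_roles = {
    superadmin        = "System-wide administrative access"
    institution_admin = "Institution-level administrative access"
    senior_admin      = "Senior institution admin with additional privileges"
  }
}

# One keycloak_role instance is created per map entry.
resource "keycloak_role" "realm" {
  for_each = local.realm_roles

  realm_id    = keycloak_realm.noumaris.id
  name        = each.key
  description = each.value
}
```

Note that moving existing roles into a for_each resource changes their addresses (for example, keycloak_role.superadmin becomes keycloak_role.realm["superadmin"]), so moved blocks or terraform state mv would be needed to avoid destroying and recreating them.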

Challenges Faced

Challenge 1: Provider Version Compatibility

Issue: Major versions of the Keycloak provider introduce breaking changes.
Solution: Pin the provider version in the required_providers block.

Challenge 2: Existing State

Issue: Importing the existing Keycloak configuration into Terraform state.
Solution: Fresh setup with no import, which was feasible because the configuration is small.
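
Had an import been necessary, Terraform 1.5 and later supports declarative import blocks. A hedged sketch for the realm only, assuming the provider's convention of importing realms by name; other resource types use different ID formats, so the provider documentation should be checked for each one.

```hcl
# Hypothetical: adopt an already existing realm into Terraform state
# instead of recreating it. Requires Terraform >= 1.5.
import {
  to = keycloak_realm.noumaris
  id = "noumaris" # realms are imported by realm name
}
```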

Challenge 3: Secrets Management

Issue: The admin password must not be committed to git.
Solution: Keep it in terraform.tfvars.local (gitignored) for dev, and supply it through environment variables for prod.
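
A minimal sketch of how variables.tf could declare these inputs. The variable names are taken from the provider block in main.tf; the descriptions, types, and the sensitive flag are assumptions rather than the project's actual file.

```hcl
# terraform/keycloak/variables.tf (sketch; descriptions and types are assumptions)

variable "keycloak_url" {
  description = "Base URL of the Keycloak server"
  type        = string
}

variable "keycloak_admin_user" {
  description = "Keycloak admin username used by the provider"
  type        = string
}

variable "keycloak_admin_password" {
  description = "Keycloak admin password; never committed to git"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

In production, Terraform reads any environment variable named TF_VAR_<name>, so exporting TF_VAR_keycloak_admin_password keeps the password out of every file in the repository. The sensitive flag only hides the value in CLI output; it is still stored in the state file, which is one more reason to keep production state in a protected remote backend.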

Future Enhancements

  • [ ] Add Terraform for other infrastructure (Cloud Run, Cloud SQL)
  • [ ] Automate Terraform apply in CI/CD
  • [ ] Add Terraform tests with terraform-compliance
  • [ ] Create Terraform modules for reusable config
  • [ ] Add monitoring and alerting via Terraform

Lessons Learned

  1. Start with IaC Early: Easier to start with Terraform than migrate later
  2. State Management Matters: Remote state essential for production
  3. Plan is Your Friend: Always run terraform plan before apply
  4. Version Control Everything: Even .tfvars files (except secrets)
  5. Documentation as Code: Terraform config is self-documenting

References

Internal documentation for Noumaris platform