VM Auto-Deploy (Stable + Idempotent)

This runbook documents the canonical VM layout, deploy flow, rollback behavior, and bootstrap steps.

Canonical VM layout

/opt/lucille/
  src/            # git checkout
  env/            # backend.env + worker.env (secrets/config)
  run/            # SAFE_COMMIT, deploy.lock
  logs/           # deploy.log
  venv/
    backend/      # uv venv for API
    worker/       # uv venv for worker
  deploy.env      # optional deploy config
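
For orientation, the skeleton above could be created by hand with something like the sketch below. The install script may already do this; create_layout is a hypothetical helper name, and the chmod on env/ is an assumption based on that directory holding secrets.

```shell
# Sketch: create the directory skeleton under a root (default: /opt/lucille).
# create_layout is a hypothetical helper, not a script shipped with the repo.
create_layout() {
  root="${1:-/opt/lucille}"
  mkdir -p \
    "$root/src" \
    "$root/env" \
    "$root/run" \
    "$root/logs" \
    "$root/venv/backend" \
    "$root/venv/worker"
  # Assumption: env/ holds secrets, so restrict it to the owning user.
  chmod 700 "$root/env"
}
```

Running create_layout "$(mktemp -d)/lucille" is a harmless way to try it before pointing it at /opt/lucille.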

Deploy flow (what happens on each deploy)

  1. git fetch origin main
  2. Hard reset to origin/main:
    • git checkout -B main origin/main
    • git reset --hard origin/main
    • git clean -fd
  3. Ensure clean tree (abort if dirty)
  4. Sync dependencies into /opt/lucille/venv/*
  5. Run migrations
  6. Optional smoke test (DEPLOY_RUN_SMOKE=true)
  7. Stamp APP_VERSION with UTC_TIMESTAMP+SHORT_SHA
  8. Restart lucille-backend and lucille-worker
  9. Write SAFE_COMMIT only after success
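
As a rough illustration of steps 1-3 and step 7, the sketch below shows one way they might look in shell. The function names are hypothetical, and the APP_VERSION format (a compact UTC timestamp joined to a 7-character SHA with a literal "+") is an assumption; the real deploy script may format it differently.

```shell
# Steps 1-3: force the checkout to match origin/main, then verify it is clean.
sync_to_origin() {
  repo="$1"
  git -C "$repo" fetch origin main &&
    git -C "$repo" checkout -B main origin/main &&
    git -C "$repo" reset --hard origin/main &&
    git -C "$repo" clean -fd || return 1
  # `git status --porcelain` prints nothing when the tree is clean.
  if [ -n "$(git -C "$repo" status --porcelain)" ]; then
    echo "working tree is dirty; aborting deploy" >&2
    return 1
  fi
}

# Step 7: build an APP_VERSION string from a UTC timestamp and a short SHA.
# Assumption: the two parts are joined with a literal "+".
make_app_version() {
  sha="$1"
  short_sha="$(printf '%.7s' "$sha")"
  printf '%s+%s\n' "$(date -u +%Y%m%dT%H%M%SZ)" "$short_sha"
}
```

For example, make_app_version "$(git -C /opt/lucille/src rev-parse HEAD)" prints a string shaped like 20250101T120000Z+abc1234.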

Rollback:

  • On failure, check out SAFE_COMMIT, clean the tree, and restart the services.
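
The rollback behavior and the "write SAFE_COMMIT only after success" rule could be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, SAFE_COMMIT is assumed to hold a single commit SHA on its first line, and the atomic write via a temporary file is a design choice of this sketch rather than something the runbook specifies.

```shell
# Hypothetical defaults; the real script may take these from deploy.env.
REPO_ROOT="${REPO_ROOT:-/opt/lucille/src}"
SAFE_COMMIT_FILE="${SAFE_COMMIT_FILE:-/opt/lucille/run/SAFE_COMMIT}"

# Record a known-good commit. Writing to a temp file and renaming it avoids
# leaving a half-written SAFE_COMMIT behind if the deploy is interrupted.
record_safe_commit() {
  sha="$1"
  printf '%s\n' "$sha" > "$SAFE_COMMIT_FILE.tmp"
  mv "$SAFE_COMMIT_FILE.tmp" "$SAFE_COMMIT_FILE"
}

# Print the recorded commit, or fail if none has been recorded yet.
read_safe_commit() {
  [ -s "$SAFE_COMMIT_FILE" ] || return 1
  head -n 1 "$SAFE_COMMIT_FILE"
}

# Return the checkout to the last known-good commit and restart services.
rollback() {
  safe="$(read_safe_commit)" || {
    echo "no SAFE_COMMIT recorded; cannot roll back" >&2
    return 1
  }
  git -C "$REPO_ROOT" checkout --force --detach "$safe" &&
    git -C "$REPO_ROOT" clean -fd &&
    systemctl restart lucille-backend lucille-worker
}
```

In a deploy script this might be wired up as run_deploy_steps || rollback, where run_deploy_steps is a placeholder for steps 1-8 and record_safe_commit is called only as the final step.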

Deployment config

/opt/lucille/deploy.env (optional):

ENV_BACKEND_FILE=/opt/lucille/env/backend.env
ENV_WORKER_FILE=/opt/lucille/env/worker.env
DEPLOY_RUN_SMOKE=false
REPO_ROOT=/opt/lucille/src
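
Because deploy.env is optional, a deploy script needs fallbacks for every key. One way to do that is sketched below. It assumes the values listed above are also the defaults, which the runbook implies but does not state; load_deploy_config is a hypothetical name.

```shell
# Source deploy.env if it exists, then fill in defaults for any unset key.
load_deploy_config() {
  config_file="${1:-/opt/lucille/deploy.env}"
  if [ -f "$config_file" ]; then
    # set -a exports every variable assigned while the file is sourced.
    set -a
    . "$config_file"
    set +a
  fi
  # Assumption: these defaults mirror the example values in this runbook.
  : "${ENV_BACKEND_FILE:=/opt/lucille/env/backend.env}"
  : "${ENV_WORKER_FILE:=/opt/lucille/env/worker.env}"
  : "${DEPLOY_RUN_SMOKE:=false}"
  : "${REPO_ROOT:=/opt/lucille/src}"
}
```

Since the file is sourced as shell, it should contain only plain KEY=value lines and be writable only by trusted users.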

Bootstrap a fresh VM

sudo apt-get update
sudo apt-get install -y git python3 python3-venv curl
curl -Ls https://astral.sh/uv/install.sh | sh
source ~/.profile

sudo useradd --system --create-home --shell /bin/bash lucille
sudo mkdir -p /opt/lucille
sudo chown -R lucille:lucille /opt/lucille

sudo -u lucille git clone <YOUR_REPO_URL> /opt/lucille/src

cd /opt/lucille/src/backend
uv sync --dev
cd /opt/lucille/src/worker
uv sync

sudo bash /opt/lucille/src/scripts/install_vm_autodeploy.sh

Populate env files:

  • /opt/lucille/env/backend.env
  • /opt/lucille/env/worker.env

Then restart:

sudo systemctl restart lucille-backend lucille-worker

Operational checks

  • Deploy log: tail -f /opt/lucille/logs/deploy.log
  • Timer status: systemctl status lucille-deploy.timer --no-pager -l
  • Current code:
    • git -C /opt/lucille/src log -1 --oneline
    • git -C /opt/lucille/src rev-parse origin/main
  • Health:
    • curl -s http://127.0.0.1:8000/health
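
Right after a restart the backend may need a few seconds before it answers, so a single curl can give a false alarm. The sketch below polls the health endpoint a few times instead. It assumes /health returns a 2xx status when the service is healthy; wait_for_health and its retry defaults are illustrative.

```shell
# Poll a health URL until it answers with a 2xx status or attempts run out.
wait_for_health() {
  url="${1:-http://127.0.0.1:8000/health}"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses (4xx/5xx).
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts: $url" >&2
  return 1
}
```

Calling wait_for_health with no arguments checks the local backend with the defaults above.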

Common failure modes + fixes

  • Repo dirty:
    • Run git -C /opt/lucille/src status and remove any untracked files it reports.
  • Permission errors in .venv:
    • Use the external venvs at /opt/lucille/venv/*, owned by the lucille user, rather than an in-repo .venv.
  • Deploy loops:
    • Ensure SAFE_COMMIT matches origin/main: cat /opt/lucille/run/SAFE_COMMIT and compare to git rev-parse origin/main.
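
The SAFE_COMMIT comparison can be wrapped in a small helper, sketched below. check_safe_commit is a hypothetical name, and the file is assumed to hold a single SHA on its first line. Note that origin/main here is whatever the last fetch recorded, so run git fetch first if you need the current remote state.

```shell
# Compare the recorded SAFE_COMMIT with the locally known origin/main.
# Returns 0 when they match, 1 on mismatch, 2 if origin/main cannot be resolved.
check_safe_commit() {
  repo="${1:-/opt/lucille/src}"
  safe_file="${2:-/opt/lucille/run/SAFE_COMMIT}"
  safe="$(head -n 1 "$safe_file" 2>/dev/null || true)"
  remote="$(git -C "$repo" rev-parse origin/main)" || return 2
  if [ "$safe" = "$remote" ]; then
    echo "in sync: $remote"
  else
    echo "mismatch: SAFE_COMMIT=${safe:-<empty>} origin/main=$remote"
    return 1
  fi
}
```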

Webhook (optional)

  • Configure a GitHub webhook with the payload URL http://<vm-ip>:9009/webhook/github.
  • Set LUCILLE_WEBHOOK_SECRET in /opt/lucille/env/backend.env.
  • Enable:
    sudo systemctl enable --now lucille-deployhook
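
When a webhook secret is configured, GitHub signs each delivery with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header as sha256=<hex digest>. The sketch below computes that value with openssl so a delivery can be simulated by hand. Whether lucille-deployhook checks this header is an assumption, and both function names are hypothetical.

```shell
# Compute the value GitHub would send in X-Hub-Signature-256 for a payload file.
# Note: passing the secret as an argument exposes it to `ps`; acceptable for a
# quick manual test, not for unattended use.
github_signature() {
  secret="$1"
  payload_file="$2"
  # openssl prints "<label>= <hex>"; the digest is the last field.
  digest="$(openssl dgst -sha256 -hmac "$secret" < "$payload_file" | awk '{print $NF}')"
  printf 'sha256=%s\n' "$digest"
}

# Send a signed test delivery. The URL and event type are placeholders.
send_test_delivery() {
  url="$1"
  secret="$2"
  payload_file="$3"
  curl -sS -X POST "$url" \
    -H "Content-Type: application/json" \
    -H "X-GitHub-Event: push" \
    -H "X-Hub-Signature-256: $(github_signature "$secret" "$payload_file")" \
    --data-binary "@$payload_file"
}
```

The signature covers the exact request body bytes, so the payload file must be sent unmodified, which is why the sketch uses --data-binary.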