Operational Runbooks — Engramia

Quick-reference guides for common production issues on api.engramia.dev (Hetzner CX23, root@engramia-staging).

Runbooks

| Runbook | When to use |
| --- | --- |
| disk-full.md | df -h shows 100%, API returning 500s |
| high-latency.md | p95 latency > 2s, timeouts reported |
| deploy-rollback.md | Deploy to prod, or roll back a bad release |
| database-recovery.md | PostgreSQL unhealthy, data loss suspected |
| api-key-rotation.md | Key compromise, offboarding, 90-day rotation |
| maintenance-mode.md | Planned downtime, schema migrations |
| rate-limit-tuning.md | Legitimate clients hitting 429s, or cost spikes |
| certificate-renewal.md | TLS cert expiry, HTTPS errors |

Common Quick Commands

# Health check
curl https://api.engramia.dev/v1/health

# View live API logs
ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml logs -f engramia-api'

# Restart API (brief interruption; in-flight requests may be dropped)
ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml restart engramia-api'

# Check all container statuses
ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml ps'
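
After a restart it can help to confirm that the API is answering again before closing the incident. The following is a minimal sketch, not part of the existing runbooks: `wait_for_health` is a hypothetical helper name, and the attempt count, delay, and 5-second timeout are assumed defaults to adjust as needed. It only relies on curl's `--fail` flag, which makes curl exit non-zero on HTTP errors such as a 500.

```shell
# Hypothetical helper (not from the runbooks): poll a health URL until it
# responds successfully or the attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url attempts delay i
  url="$1"
  attempts="${2:-10}"   # assumed default: 10 tries
  delay="${3:-3}"       # assumed default: 3 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail: non-zero exit on HTTP errors; --max-time: do not hang forever
    if curl --fail --silent --show-error --max-time 5 "$url" > /dev/null; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after ${attempts} attempt(s)" >&2
  return 1
}
```

For example, run `wait_for_health https://api.engramia.dev/v1/health` right after the restart command. A non-zero exit status suggests moving on to the matching runbook rather than restarting again.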