Runbook: Rate Limit Tuning

Current Defaults

Env var                         Default      Applies to
ENGRAMIA_RATE_LIMIT_DEFAULT     60 req/min   All endpoints
ENGRAMIA_RATE_LIMIT_EXPENSIVE   10 req/min   /evaluate, /compose, /evolve

Symptoms Indicating Limit Too Low

  • Legitimate clients receive 429 Too Many Requests
  • Audit logs show RATE_LIMITED events for known good IPs
  • Users report intermittent failures during batch operations

Symptoms Indicating Limit Too High

  • LLM API costs spike unexpectedly
  • VM CPU/memory saturated during eval bursts

Diagnosing Rate Limit Events

# Find rate-limited IPs and paths in logs
ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml logs engramia-api \
     --since 1h | grep "Rate limit exceeded"'

# Check audit table (applies only in DB auth mode)
# -T disables TTY allocation, which is not available over a non-interactive ssh session
ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml exec -T pgvector \
     psql -U engramia -c "
       SELECT ip, path, count(*) AS hits, max(created_at) AS last_seen
       FROM audit_log
       WHERE event = '"'"'rate_limited'"'"'
       AND created_at > NOW() - INTERVAL '"'"'1 hour'"'"'
       GROUP BY ip, path
       ORDER BY hits DESC
       LIMIT 20;"'
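
When the raw log output is long, it can help to aggregate it per client before looking at individual lines. The following is a minimal sketch, not part of Engramia: `top_rate_limited_ips` is a hypothetical helper name, and it assumes that each "Rate limit exceeded" log line contains the client's IPv4 address (the actual log format may differ, in which case the pattern needs adjusting).

```shell
# Count "Rate limit exceeded" log lines per client IP, busiest first.
# Assumption: each matching line contains the client IPv4 address; if a line
# holds several addresses, only the first one is counted.
# Optional argument: number of rows to print (default 20).
top_rate_limited_ips() {
  awk '/Rate limit exceeded/ {
         if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/))
           print substr($0, RSTART, RLENGTH)
       }' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }' \
    | head -n "${1:-20}"
}

# Usage (on the host):
#   docker compose -f /opt/engramia/docker-compose.prod.yml \
#     logs engramia-api --since 1h | top_rate_limited_ips 10
```

Each output line is a hit count followed by the IP, which can be compared directly against the list of known good clients.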

Adjusting Limits

Temporary (no restart)

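Not supported. The rate limiter reads its env vars only at startup, so a limit change takes effect only after the API container starts again with the new values. Use the Permanent procedure.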

Permanent

ssh root@engramia-staging '
  set -e
  cd /opt/engramia
  # Increase default limit to 120/min, keep expensive at 10/min
  sed -i "s/^ENGRAMIA_RATE_LIMIT_DEFAULT=.*/ENGRAMIA_RATE_LIMIT_DEFAULT=120/" .env
  # Append the variable if .env has no assignment for it yet
  grep -q "^ENGRAMIA_RATE_LIMIT_DEFAULT=" .env || echo "ENGRAMIA_RATE_LIMIT_DEFAULT=120" >> .env
  # Recreate the container: "docker compose restart" does not pick up changed env vars
  docker compose -f docker-compose.prod.yml up -d engramia-api
'
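
The sed/grep pair above can be folded into one idempotent helper that works for either variable. This is a minimal sketch under stated assumptions: `set_env_var` is a hypothetical name (not an Engramia tool), the file is a plain KEY=VALUE env file, and values contain no `|`, `&`, or backslash characters (true for numeric limits).

```shell
# Set KEY=VALUE in an env file: replace an existing assignment, or append one.
# Running it twice with the same arguments leaves the file unchanged.
# Assumption: values are simple (e.g. numbers) with no "|", "&" or backslashes.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    tmp="$(mktemp)"
    # Write through a temp file, then copy back so the original file
    # keeps its owner and permissions.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage (on the host, before recreating the container):
#   set_env_var /opt/engramia/.env ENGRAMIA_RATE_LIMIT_DEFAULT 120
```

Anchoring the match at `^KEY=` means commented-out lines and longer variable names sharing the prefix are left alone.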

Per-client exemption (workaround)

The current rate limiter is per-IP. For trusted batch clients, consider:

  1. Running them via a dedicated IP with a higher limit, or
  2. Moving to the async job API (Prefer: respond-async header) to avoid synchronous rate limiting

Resetting Rate Limit State

The in-memory counter resets on API restart:

ssh root@engramia-staging \
  'docker compose -f /opt/engramia/docker-compose.prod.yml restart engramia-api'

Recommendations

  • Keep ENGRAMIA_RATE_LIMIT_EXPENSIVE at or below 20 req/min to control LLM costs
  • Alert when 429 rate exceeds 5% of total requests (Prometheus)
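  • For multi-instance deployments, replace the in-memory limiter with a Redis-backed one; in-memory counters are per instance, so each instance would otherwise enforce the limit separately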
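
Until a Prometheus alert is in place, the 5% threshold can be spot-checked by hand. The following is a minimal sketch, assuming some source of HTTP status codes with one code per line (for example, extracted from access logs); `check_429_rate` is a hypothetical helper name, and how the status codes are obtained depends on the log format in use.

```shell
# Read one HTTP status code per line on stdin, print the percentage of 429
# responses, and exit non-zero when it exceeds the threshold.
# Optional argument: threshold in percent (default 5).
check_429_rate() {
  awk -v limit="${1:-5}" '
    { total++ }
    $1 == 429 { limited++ }
    END {
      pct = total ? 100 * limited / total : 0
      printf "%.1f\n", pct
      exit (pct > limit ? 1 : 0)
    }'
}

# Usage:
#   <command that prints one status code per line> | check_429_rate 5 \
#     || echo "429 rate above threshold"
```

The non-zero exit status makes the helper usable from cron or a CI job as a stopgap alert.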