Monitoring

  • Set ENABLE_METRICS=true (default) so backend/routes.go mounts /metrics.
  • Restrict access via private network, sidecar auth proxy, or scrape from inside the cluster.

Verify the endpoint locally:

curl -s http://localhost:8080/metrics | grep leaflock_

Encryption & Collaboration

leaflock_active_users, leaflock_collaborations_active, leaflock_websocket_connections, leaflock_notes_total.

HTTP Surface

leaflock_http_requests_total, leaflock_http_request_duration_seconds, leaflock_errors_total.

Persistence

leaflock_db_connections_active, leaflock_db_queries_total, leaflock_redis_operations_total.

Backup Runner

leaflock_backups_total, leaflock_backup_duration_seconds, leaflock_backup_size_bytes.

Use these names directly in Prometheus alert rules or Grafana dashboards.

  • rate(leaflock_errors_total{component="api"}[10m]) > 5 → elevated error rate (more than 5 errors per second, averaged over 10 minutes).
  • leaflock_websocket_connections == 0 while leaflock_active_users > 0 → collaboration outage.
  • histogram_quantile(0.95, rate(leaflock_backup_duration_seconds_bucket[24h])) > 600 → slow backups.
  • increase(leaflock_backups_total{status="success"}[24h]) == 0 → missed backup window.

Tune thresholds to match your user volume, but keep the structure of each rule: every one maps to a real failure mode observed during ops testing.
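
One way to try a threshold before committing it to an alert rule is to run the expression as an instant query against the Prometheus HTTP API (`/api/v1/query`). The sketch below does that for two of the rules above. The `PROMETHEUS_URL` variable, the localhost default, and the `firingSeries` helper are assumptions. Writing "while" as `and` is also only one possible PromQL form, and it may need `on()` if the two gauges carry different labels.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

// queryResponse mirrors only the parts of the query response used here.
type queryResponse struct {
	Status string `json:"status"`
	Error  string `json:"error"`
	Data   struct {
		Result []json.RawMessage `json:"result"`
	} `json:"data"`
}

// firingSeries returns how many series an instant query returned. For a
// comparison such as "x > 5", any returned series means the condition holds.
func firingSeries(body []byte) (int, error) {
	var r queryResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, err
	}
	if r.Status != "success" {
		return 0, fmt.Errorf("query failed: %s", r.Error)
	}
	return len(r.Data.Result), nil
}

func main() {
	base := os.Getenv("PROMETHEUS_URL") // hypothetical variable name
	if base == "" {
		base = "http://localhost:9090" // assumed local Prometheus
	}
	exprs := []string{
		`leaflock_websocket_connections == 0 and leaflock_active_users > 0`,
		`histogram_quantile(0.95, rate(leaflock_backup_duration_seconds_bucket[24h])) > 600`,
	}
	for _, expr := range exprs {
		resp, err := http.Get(base + "/api/v1/query?query=" + url.QueryEscape(expr))
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Println("read failed:", err)
			continue
		}
		n, err := firingSeries(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("firing=%t series=%d expr=%s\n", n > 0, n, expr)
	}
}
```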

Dashboards

  • Import docs/grafana-dashboard.json as a starter panel set.
  • Point the Grafana Prometheus data source at the Prometheus server that scrapes the backend service's /metrics URL.
  • Add log panels using the structured JSON output from utils.InfoLogger (fields: component, request_id, latency).

Troubleshooting

  • Empty /metrics response → ensure ENABLE_METRICS=true and that your reverse proxy forwards the path without stripping headers.
  • Missing WebSocket gauges → confirm the load balancer keeps sticky sessions; otherwise connections move between pods and the per-pod gauges reset.
  • Redis metrics absent → the backend failed to connect to Redis. Check startup logs for Redis errors and verify REDIS_URL.
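
For the REDIS_URL check, a quick syntactic test can rule out a malformed value before digging into network issues. The sketch below assumes the common `redis://` or `rediss://` URI form, which may not match what the backend accepts. `checkRedisURL` is a hypothetical helper and does not connect to Redis.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
)

// checkRedisURL validates the shape of a Redis URL without dialing it.
// Error messages avoid echoing the raw value, which may contain a password.
func checkRedisURL(raw string) error {
	if raw == "" {
		return errors.New("REDIS_URL is empty")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return errors.New("REDIS_URL does not parse as a URL")
	}
	if u.Scheme != "redis" && u.Scheme != "rediss" {
		return fmt.Errorf("unexpected scheme %q (assumed redis:// or rediss://)", u.Scheme)
	}
	if u.Hostname() == "" {
		return errors.New("REDIS_URL has no host")
	}
	return nil
}

func main() {
	if err := checkRedisURL(os.Getenv("REDIS_URL")); err != nil {
		fmt.Println("REDIS_URL problem:", err)
		os.Exit(1)
	}
	fmt.Println("REDIS_URL looks well-formed")
}
```

A passing check only shows the value is well-formed; connection failures still need the startup logs.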

For backup-specific alerts and restore drills, see /operations/backups.