Monitoring
Monitor your Synjar deployment for uptime, performance, and issues.
Health endpoint
curl http://localhost:6200/health
Response:
{
  "status": "ok",
  "timestamp": "2025-01-01T12:00:00Z"
}
Use this endpoint for:
- Load balancer health checks
- Uptime monitoring
- Container orchestration liveness probes
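The uses above can be scripted with a small probe. The following is a minimal sketch, assuming the response shape shown above; `is_healthy` and `check_health` are hypothetical helper names, not part of Synjar, and the default URL is the localhost example from this page.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True when a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(url: str = "http://localhost:6200/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; network or decoding errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
            return response.status == 200 and is_healthy(body)
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses
        return False
```

A cron job or probe wrapper could call `check_health()` and exit non-zero when it returns `False`.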
Metrics
Prometheus metrics
If enabled, metrics are available at /metrics:
curl http://localhost:6200/metrics
Key metrics:
- http_requests_total - Total HTTP requests
- http_request_duration_seconds - Request latency
- db_queries_total - Database queries
- document_processing_duration - Processing time
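For a quick check without a Prometheus server, the text returned by /metrics can be read directly. This is a rough sketch, assuming the standard Prometheus text exposition format; `sum_metric` is a hypothetical helper, and the label names in the usage example are illustrative, not confirmed Synjar labels.

```python
def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of metric_name across all of its label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like: name{label="value"} 123 [timestamp]
        name_part, _, rest = line.partition("{")
        if rest:
            # Labels present: the value follows the last closing brace
            _, _, after_labels = rest.rpartition("}")
            name, fields = name_part, after_labels.split()
        else:
            parts = line.split()
            name, fields = parts[0], parts[1:]
        if name == metric_name and fields:
            total += float(fields[0])
    return total
```

For example, with two `http_requests_total` samples of 120 and 3 under different (hypothetical) `method` and `status` labels, `sum_metric(text, "http_requests_total")` returns their sum.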
Grafana dashboard
Import the Synjar dashboard from our Grafana catalog (ID: 12345).
Logging
Log levels
Set via the LOG_LEVEL environment variable:
- error - Only errors
- warn - Errors and warnings
- info - Standard logging (default)
- debug - Verbose debugging
Log format
Logs are written as JSON for easy parsing:
{
  "level": "info",
  "message": "Document processed",
  "documentId": "abc123",
  "duration": 1234,
  "timestamp": "2025-01-01T12:00:00Z"
}
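Because each log entry is a JSON object, simple filters are easy to write. The sketch below assumes one JSON object per line and the field names from the example above; `slow_events` is a hypothetical helper, and the threshold uses whatever unit the `duration` field is emitted in, which this page does not state.

```python
import json
from typing import Iterable, Iterator


def slow_events(lines: Iterable[str], min_duration: float) -> Iterator[dict]:
    """Yield parsed log entries whose "duration" is at least min_duration."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON lines such as startup banners
        if not isinstance(entry, dict):
            continue
        duration = entry.get("duration")
        if isinstance(duration, (int, float)) and duration >= min_duration:
            yield entry
```

Passing `sys.stdin` as `lines` lets the filter sit at the end of a pipeline that streams container logs.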
Viewing logs
# Docker Compose
docker-compose logs -f synjar
# Docker
docker logs -f synjar
# Kubernetes
kubectl logs -f deployment/synjar
Log aggregation
Send logs to a centralized logging system:
Loki/Grafana:
# docker-compose.override.yml
# Requires the Loki Docker logging driver plugin to be installed on the host
services:
  synjar:
    logging:
      driver: loki
      options:
        loki-url: "http://loki:3100/loki/api/v1/push"
Elasticsearch:
services:
  filebeat:
    image: elastic/filebeat
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
Alerting
Recommended alerts
| Alert | Condition | Severity |
|---|---|---|
| Service down | Health check fails 3x | Critical |
| High latency | p95 > 2s for 5 min | Warning |
| High error rate | >1% errors for 5 min | Warning |
| Database connection | Connection errors | Critical |
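The first and third conditions in the table can be expressed as small predicates. This is a sketch under stated assumptions: "fails 3x" is read as three consecutive failed checks, the error rate is measured over a window counted elsewhere, and both function names are hypothetical.

```python
from typing import Sequence


def service_down(check_results: Sequence[bool], threshold: int = 3) -> bool:
    """True when the most recent `threshold` health checks all failed."""
    if threshold < 1 or len(check_results) < threshold:
        return False
    return not any(check_results[-threshold:])


def error_rate_exceeded(total_requests: int, error_requests: int, limit: float = 0.01) -> bool:
    """True when errors make up more than `limit` of requests in the window."""
    if total_requests <= 0:
        return False  # no traffic, nothing to alert on
    return error_requests / total_requests > limit
```

Requiring consecutive failures avoids paging on a single dropped probe; the 5-minute window from the table would be applied by whatever collects the counts.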
Example Prometheus rules
groups:
  - name: synjar
    rules:
      - alert: SynjarDown
        expr: up{job="synjar"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Synjar is down"
      - alert: HighLatency
        # Assumes http_request_duration_seconds is a histogram exposing _bucket series
        expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High request latency"
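The HighLatency rule relies on histogram_quantile, which estimates a quantile from cumulative histogram buckets by linear interpolation. The sketch below mirrors that idea in simplified form to show how a p95 value comes out of bucket counts; `estimate_quantile` is a hypothetical helper, the bucket boundaries are made up for illustration, and Prometheus handles more edge cases than this does.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) buckets.

    The last bucket must have upper bound math.inf, as Prometheus histograms do.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations in the window
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the overflow bucket: report the highest finite bound
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return math.nan
```

With illustrative buckets `[(0.5, 80), (1.0, 90), (2.0, 96), (5.0, 99), (math.inf, 100)]`, the estimated p95 lands between 1 and 2 seconds, so a 2-second threshold would not fire.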
Uptime monitoring
External monitoring services:
- UptimeRobot
- Pingdom
- StatusCake
- Better Uptime
Monitor:
- /health endpoint
- Main application URL
- Critical API endpoints
Performance optimization
Database
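Check for slow queries (requires the pg_stat_statements extension):
-- On PostgreSQL 13 and later, the column is named mean_exec_time instead of mean_time
SELECT query, calls, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;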
Application
Monitor for:
- Memory usage trends
- CPU utilization
- Request queue depth