# Docker Monitoring and Observability

---
guideline_type: "technology"
scope: "docker-monitoring"
audience: ["developers", "devops", "ai-assistants"]
last_updated: "2025-09-13"
dependencies: ["docker-overview.md", "docker-architecture.md"]
related_files: ["docker-compose.yml", "config/monitoring/*", "config/grafana/*", "config/prometheus/*"]
ai_context: "Monitoring setup, Prometheus metrics, Grafana dashboards, health checks, and log aggregation"
---

## 📊 Monitoring and Observability

### Prometheus Metrics

All services expose standardized metrics:

```yaml
# Service labels for Prometheus autodiscovery
labels:
  - "prometheus.scrape=true"
  - "prometheus.port=8080"
  - "prometheus.path=/actuator/prometheus"
  - "prometheus.service=${SERVICE_NAME}"
```

> **🤖 AI Assistant Note:**
> Monitoring stack access:
> - **Grafana:** http://localhost:3000 (admin/admin)
> - **Prometheus:** http://localhost:9090
> - **Metrics endpoints:** `/actuator/prometheus` for Spring services
> - **Health checks:** `/actuator/health` for readiness probes

### Grafana Dashboards

**Prebuilt dashboards:**

- **Infrastructure Overview**: CPU, memory, disk, network
- **Spring Boot Services**: JVM metrics, HTTP requests, circuit breaker
- **Database Performance**: PostgreSQL connections, query performance
- **Message Queue**: Kafka consumer lag, throughput
- **Business Metrics**: application-specific KPIs

### Health Check Matrix

| Service | Endpoint | Expected | Timeout |
|---------|----------|----------|---------|
| API Gateway | `/actuator/health` | `{"status":"UP"}` | 15s |
| Ping Service | `/actuator/health/readiness` | HTTP 200 | 3s |
| PostgreSQL | `pg_isready` | Connection OK | 5s |
| Redis | `redis-cli ping` | PONG | 5s |
| Keycloak | `/health/ready` | HTTP 200 | 5s |

### Log Aggregation

```bash
# Centralized logging with the ELK stack (optional)
docker-compose -f docker-compose.yml -f docker-compose.logging.yml up -d

# Parse structured (JSON) logs.
# --no-log-prefix drops the "service |" prefix so jq receives plain JSON lines.
docker-compose logs --no-log-prefix --follow \
  --tail=100 api-gateway | jq -r '.message'
```

## 🎯 AI Assistants: Monitoring Quick Reference

### Monitoring URLs

- **Grafana Dashboard:** http://localhost:3000 (admin/admin)
- **Prometheus Targets:** http://localhost:9090/targets
- **Prometheus Metrics:** http://localhost:9090/metrics
- **Service Health:** `http://localhost:<port>/actuator/health`

### Key Metrics

| Metric type | Example | Description |
|-------------|---------|-------------|
| JVM Memory | `jvm_memory_used_bytes` | Memory usage of Java services |
| HTTP Requests | `http_requests_total` | API request counter |
| Database Connections | `hikaricp_connections` | Connections in the pool |
| Kafka Lag | `kafka_consumer_lag` | Consumer lag |
| Custom Business | `meldestelle_registrations_total` | Business KPIs |

### Health Check Commands

```bash
# Check all services
docker-compose ps

# Service-specific health checks
curl -s http://localhost:8082/actuator/health | jq '.status'
curl -s http://localhost:8081/actuator/health | jq '.status'

# Infrastructure health checks
docker-compose exec postgres pg_isready -U meldestelle -d meldestelle
docker-compose exec redis redis-cli ping
curl -s http://localhost:8180/health/ready
```

### Log Analysis

```bash
# Follow service logs in real time
docker-compose logs -f

# Filter error logs
docker-compose logs | grep ERROR

# Show JSON logs in structured form
# (--no-log-prefix drops the "service |" prefix so jq receives plain JSON lines)
docker-compose logs --no-log-prefix api-gateway | jq -r '.
  | select(.level=="ERROR") | .message'

# Analyze performance-related log lines
docker-compose logs api-gateway | grep -i "took\|duration\|time"
```

### Dashboard Setup

#### Infrastructure Dashboard

```json
{
  "dashboard": {
    "title": "Meldestelle Infrastructure",
    "panels": [
      {
        "title": "CPU Usage",
        "targets": [
          { "expr": "rate(container_cpu_usage_seconds_total[5m]) * 100" }
        ]
      },
      {
        "title": "Memory Usage",
        "targets": [
          { "expr": "container_memory_usage_bytes / container_spec_memory_limit_bytes * 100" }
        ]
      }
    ]
  }
}
```

#### Application Dashboard

```json
{
  "dashboard": {
    "title": "Meldestelle Services",
    "panels": [
      {
        "title": "HTTP Requests/sec",
        "targets": [
          { "expr": "rate(http_requests_total[1m])" }
        ]
      },
      {
        "title": "Response Time",
        "targets": [
          { "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))" }
        ]
      }
    ]
  }
}
```

### Alerting Rules

```yaml
# prometheus/alerts.yml
groups:
  - name: meldestelle.rules
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.instance }} is down"

      - alert: HighMemoryUsage
        expr: (container_memory_usage_bytes / container_spec_memory_limit_bytes) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: DatabaseConnectionsFull
        expr: hikaricp_connections_active >= hikaricp_connections_max * 0.8
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Database connection pool nearly exhausted"
```

### Monitoring Maintenance

```bash
# Reload the Prometheus configuration
# (the reload endpoint only works if Prometheus runs with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Export a Grafana dashboard
# (token and UID are elided in the source; GRAFANA_API_TOKEN and DASHBOARD_UID
#  are placeholder variables that must be set before running)
curl -s -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \
  "http://localhost:3000/api/dashboards/uid/${DASHBOARD_UID}" > dashboard_backup.json

# Purge monitoring data (irreversibly deletes the stored Prometheus metrics)
docker-compose exec prometheus rm -rf /prometheus/data
docker-compose restart prometheus

# Truncate log files of the monitoring services
docker-compose exec grafana find /var/log -name "*.log" -exec truncate -s 0 {} \;
```

### Performance Tuning
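
The scrape intervals in this section's configuration trade data freshness against ingestion load, which grows roughly in inverse proportion to the interval. As a rough sketch (not part of the project), the helper below estimates samples ingested per minute for one scrape job. The function name `samples_per_minute` and the series-per-target figures (500 and 200) are hypothetical assumptions; only the target counts and intervals (2 targets at 10s, 2 targets at 30s) come from the configuration.

```bash
#!/usr/bin/env bash
# Rough estimate of Prometheus ingestion load for one scrape job.
# Usage: samples_per_minute <targets> <series_per_target> <interval_seconds>
# All arguments are integers; the result uses integer division.
samples_per_minute() {
  # Reject a missing, zero, or non-numeric interval to avoid division by zero.
  if ! [ "$3" -gt 0 ] 2>/dev/null; then
    echo "interval must be a positive integer" >&2
    return 1
  fi
  echo $(( $1 * $2 * 60 / $3 ))
}

# Hypothetical series counts: 500 per Spring Boot service,
# 200 per infrastructure target. Replace with measured values.
spring=$(samples_per_minute 2 500 10)
infra=$(samples_per_minute 2 200 30)

echo "spring-boot:    ${spring} samples/min"
echo "infrastructure: ${infra} samples/min"
echo "total:          $(( spring + infra )) samples/min"
```

With these assumed series counts the sketch reports 6000, 800, and 6800 samples per minute. Halving the `spring-boot` interval to 5s would double the first figure, which is the kind of cost to weigh before tightening an interval.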
```yaml
# prometheus.yml - optimized configuration
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "/etc/prometheus/alerts.yml"

scrape_configs:
  - job_name: 'spring-boot'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['api-gateway:8080', 'ping-service:8082']
    scrape_interval: 10s

  - job_name: 'infrastructure'
    # Note: PostgreSQL and Redis do not serve Prometheus metrics on their
    # native ports. These targets only work if an exporter answers at these
    # addresses; otherwise point them at the exporter containers.
    static_configs:
      - targets: ['postgres:5432', 'redis:6379']
    scrape_interval: 30s
```

---

**Navigation:**

- [Docker-Overview](./docker-overview.md) - Fundamentals and philosophy
- [Docker-Architecture](./docker-architecture.md) - Container services and structure
- [Docker-Development](./docker-development.md) - Development workflow
- [Docker-Production](./docker-production.md) - Production deployment
- [Docker-Troubleshooting](./docker-troubleshooting.md) - Troubleshooting