Files
meldestelle/infrastructure/monitoring/README-INFRA-MONITORING.md
T
StefanMo b35c4087a2 Fix: Test-Commit für VCS-Integration (MP-8) (#15)
* MP-8 OTHER Implementiere JWT-Authentifizierungs-Filter im Gateway

* Fix(ci): Update upload-artifact action to v4

* Fix(ci): Add start command for Keycloak and failure logs

* Fix(ci): Remove invalid 'command' property from Keycloak service

* Fix(ci): Use KC_DEV_MODE env var to start Keycloak

* Fix(ci): Keycloak service was removed from GitHub Actions services and replaced with a manual docker run step that starts Keycloak with the start-dev command.

* dev(ci): vereinheitliche Keycloak auf 26.4.2; aktiviere Health im CI (MP-8)

* Fix(ci): Stabilize Keycloak startup in integration tests via matrix

- Add `dev-file` Keycloak variant to matrix for stability fallback.
- Improve wait logic and health checks for Keycloak and Postgres.
- Unify Keycloak version to 26.4.2 across codebase.
- Add log dumps on failure.

* Fix(ci): Die betroffene Datei docs/Visionen-Ideen/Infrastruktur-Strategie_DSGVO-Konformität.md endet aktuell mit genau einer leeren Zeile (Zeile 87). Das entspricht der Regel MD047 („Files should end with a single newline character“). Damit ist deine Korrektur korrekt.

* Fix(ci): Repository-wide auto-fix for Markdown files was implemented with a GitHub Actions workflow and a local helper script. EditorConfig and markdownlint ignore files were added to ensure consistent formatting. Instructions for using the auto-fix both via GitHub Actions and locally were provided.

* fix(gradle): build.gradle.kts jsBrowser testTask disabled

* fix(gradle): build.gradle.kts jsBrowser testTask disabled

* Fix(ci): Stabilize integration tests with Keycloak matrix build (MP-8)

Introduces a matrix strategy (`keycloak_db: [postgres, dev-file]`)
in the integration-tests workflow to mitigate flaky Keycloak starts
when using the Postgres service container.

- Adds a `dev-file` Keycloak variant for stability fallback.
- Improves wait logic and health checks for Keycloak/Postgres.
- Unifies Keycloak version to 26.4.2 across codebase (Dockerfile, Compose,
  ADR, README, tests).
- Adds log dumps on failure in CI.
- Ensures `KC_HEALTH_ENABLED=true` is set.
- Updates related documentation (README, Schlachtplan).
- Includes broader Docker SSoT cleanup (versions.toml as source,
  script updates, env file cleanup, validator hardening).

This resolves recurring CI failures related to Keycloak startup and
ensures required checks for PRs (#15) are reliable, while also
improving overall Docker build consistency.

* feat(docs, ci): Implement YouTrack SSoT strategy with Dokka sync (MP-8)

- Add Dokka multi-module Gradle configuration and KDoc style guide.
- Add GitHub Actions workflow (docs-kdoc-sync.yml) and Python script
  (youtrack-sync-kb.py) to sync Dokka GFM output to YouTrack KB.
- Extend front-matter schema (bc, doc_type) and update relevant pages/stubs.
- Adapt CI scripts (validate-frontmatter, check-docs-drift, ci-docs link ignore).
- Update README.md to reference YouTrack KB.

* feat(docs, ci): Implement YouTrack SSoT strategy with Dokka sync (MP-8)

- Add Dokka multi-module Gradle configuration and KDoc style guide.
- Add GitHub Actions workflow (docs-kdoc-sync.yml) and Python script
  (youtrack-sync-kb.py) to sync Dokka GFM output to YouTrack KB.
- Extend front-matter schema (bc, doc_type) and update relevant pages/stubs.
- Adapt CI scripts (validate-frontmatter, check-docs-drift, ci-docs link ignore).
- Update README.md to reference YouTrack KB.

* Fix(ci): Replace OpenAPI validator with Spectral

Replaces the deprecated 'char0n/swagger-editor-validate' action,
which failed due to sandbox issues in GitHub Actions, with the
modern '@stoplight/spectral-cli'.

This ensures robust OpenAPI specification validation without
requiring a headless browser environment. The 'generate-api-docs'
job now depends on the successful completion of the Spectral validation.

Part of resolving CI failures for PR #15 (MP-8).

* Fix(ci): Specify spectral:oas ruleset for OpenAPI validation (MP-8)

* Fix(ci): Remove explicit ruleset argument for Spectral validation (MP-8)

* Fix(ci): Added a .spectral.yaml file to fix Spectral linting errors. Corrected markdown lint issues in two documentation files. Updated README.md with a new guidelines section to fix link validation errors.

* Fix(ci): Markdownlint errors were fixed by adding required blank lines. The Guidelines Validation error was resolved by updating the README.md link. The API Documentation Generator workflow was stabilized by updating paths, tasks, and validation steps.

* Fix(ci): Alle vier fehlerhaften GitHub-Action-Prüfungen wurden behoben. Fehler in der OpenAPI-Spezifikation, Probleme mit der Markdown-Linting-Analyse und Validierungsfehler bei Querverweisen wurden korrigiert. Die README.md enthält nun alle erforderlichen Links zu den Richtlinien.

* Fix(ci): Markdown linting errors in docs/api/README.md were fixed by specifying languages in fenced code blocks. OpenAPI specification errors in documentation.yaml were resolved by correcting example property types to strings. Cross-reference validation errors in README.md were fixed by adding the missing link to project-standards/coding-standards.md.

* Fix(ci): Duplicate heading errors in docs/api/members-api.md were fixed. Cross-reference validation errors for docker-architecture.md were resolved. All originally reported issues passed validation successfully.

* Fix(ci): The markdown heading levels in docs/api/members-api.md were corrected from h5 to h4 to fix linting errors. The missing cross-reference link from technology-guides/docker/docker-development.md to docker-overview.md was added. These fixes resolved the original validation and linting errors causing the process to fail.

* Fix(ci): Duplicate heading warnings in docs/api/members-api.md were resolved. Cross-reference validation for docker-development.md to docker-architecture.md was fixed. A new unrelated warning about docker-production.md was identified but not addressed.

* refactor(ci,docs): Simplify CI pipeline and migrate docs to YouTrack SSoT

BREAKING CHANGE: Documentation structure radically simplified

- Consolidate 9 GitHub Actions workflows into 1 main pipeline (ci-main.yml)
- Remove redundant workflows: ci-docs, markdownlint-autofix, guidelines-validation, api-docs
- Delete documentation migrated to YouTrack: api/, BCs/, Visionen-Ideen/, reference/, now/, overview/
- Keep only ADRs, C4 diagrams, and essential dev guides in repo
- Update README.md with YouTrack KB links
- Create new docs/README.md as documentation gateway
- Relax markdown-lint config for pragmatic developer experience

Kept workflows:
- ssot-guard.yml (Docker SSoT validation)
- docs-kdoc-sync.yml (KDoc → YouTrack sync)
- integration-tests.yml (Integration tests)
- deploy-proxmox.yml (Deployment)
- youtrack-sync.yml (YouTrack integration)

Related: MP-DOCS-001

* refactor(ci,docs): Simplify CI pipeline and migrate docs to YouTrack SSoT

BREAKING CHANGE: Documentation structure radically simplified

- Consolidate 9 GitHub Actions workflows into 1 main pipeline (ci-main.yml)
- Remove redundant workflows: ci-docs, markdownlint-autofix, guidelines-validation, api-docs
- Delete documentation migrated to YouTrack: api/, BCs/, Visionen-Ideen/, reference/, now/, overview/
- Keep only ADRs, C4 diagrams, and essential dev guides in repo
- Update README.md with YouTrack KB links
- Create new docs/README.md as documentation gateway
- Relax markdown-lint config for pragmatic developer experience

Kept workflows:
- ssot-guard.yml (Docker SSoT validation)
- docs-kdoc-sync.yml (KDoc → YouTrack sync)
- integration-tests.yml (Integration tests)
- deploy-proxmox.yml (Deployment)
- youtrack-sync.yml (YouTrack integration)

Related: MP-DOCS-001

* refactor(ci,docs): README.md und einige andere Dokumentationen überarbeitet.
ports-and-urls.md hinzugefügt.
Related: MP-DOCS-001

* refactor(ci,docs): Die Markdownlint-Fehler in README.md und docs/README.md wurden behoben, indem die Überschriftenebenen angepasst, überflüssige Satzzeichen am Ende entfernt und die notwendigen Leerzeilen um Überschriften, Listen, Tabellen und Codeblöcke eingefügt wurden. Das problematische Leerzeichen am Ende in docs/README.md wurde ebenfalls entfernt. Die Dateien entsprechen nun den vorgegebenen Markdownlint-Regeln und sollten die CI-Validierung bestehen.
Related: MP-DOCS-001

* refactor(ci,docs): Docker guideline cross-references were fixed and normalized to lowercase labels. Validation scripts confirmed zero cross-reference warnings and consistent metadata. Documentation was updated with a changelog and enhanced README navigation.
Related: MP-DOCS-001

* refactor(ci,docs): Docker guideline cross-references were fixed and normalized to lowercase labels. Validation scripts confirmed zero cross-reference warnings and consistent metadata. Documentation was updated with a changelog and enhanced README navigation.
Related: MP-DOCS-001

* refactor(ci,docs): Dead links in docs/architecture/adr were fixed by updating URLs to stable sources and adding an ignore pattern for a placeholder link. Specific ADR files had their broken links replaced with valid ones. The markdown-link-check GitHub Action is expected to pass with zero dead links now.
Related: MP-DOCS-001

* refactor(ci,docs): Links in ADR checked
Related: MP-DOCS-001

* refactor(ci,docs): Links in ADR checked
Related: MP-DOCS-001

* refactor(ci,docs): Markdown Regeln ausgebessert
Related: MP-DOCS-001

* refactor(ci,docs): Markdown Regeln ausgebessert
Related: MP-DOCS-001

* refactor(ci,docs): Markdown Regeln ausgebessert
Related: MP-DOCS-001

* Chore: Rerun CI checks with updated branch protection rules
2025-11-07 12:26:33 +01:00

113 lines
5.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Infrastructure/Monitoring Modul Aktuelle Dokumentation (Stand: September 2025)
## Überblick
Das Monitoring-Modul stellt die zentrale Observability-Infrastruktur für alle Services bereit. Es deckt zwei der drei Observability-Säulen ab: Metriken und Distributed Tracing. Ziel ist es, Betriebsdaten konsistent zu erfassen, zu visualisieren und Probleme schnell zu erkennen.
- Metriken: Quantitative Leistungsdaten wie Antwortzeiten, Fehlerraten, JVM- und Systemmetriken.
- Distributed Tracing: Ende-zu-Ende-Verfolgung einer Anfrage über Service-Grenzen hinweg (Trace/Span).
## Aufgabe des Moduls
- Bereitstellung einer einheitlichen Monitoring-Client-Bibliothek für alle Microservices.
- Zentrale Bereitstellung eines Zipkin-Servers zur Sammlung und Visualisierung von Traces.
- Sicherstellung eines konsistenten Prometheus-/Micrometer-Setups über alle Services hinweg.
- Bereitstellung konservativer Default-Properties, die pro Service überschrieben werden können.
## Architektur
Das Modul ist in eine wiederverwendbare Client-Bibliothek und einen zentralen Server aufgeteilt:
```
infrastructure/monitoring/
├── monitoring-client/ # Bibliothek, die jeder Service einbindet
└── monitoring-server/ # Eigenständiger Service, der den Zipkin-Server hostet
```
### monitoring-client
Dies ist eine wiederverwendbare Bibliothek, die von jedem Microservice (z. B. gateway, members, masterdata) eingebunden wird.
- Zweck: Automatische Instrumentierung der Anwendung, Aktivierung von Actuator-/Tracing-Funktionen und Export von Metriken.
- Technologien:
- Spring Boot Actuator: Stellt u. a. den Endpunkt `/actuator/prometheus` bereit.
- Micrometer (Core, Prometheus): Einheitliches Metrik-API, Export in Prometheus-Format.
- Micrometer Tracing (Brave Bridge) + Zipkin Reporter: Erzeugt und sendet Traces (Spans) an Zipkin.
- Vorteile: Keine individuelle Konfiguration in jedem Service nötig; sinnvolle Defaults per AutoConfiguration.
#### AutoConfiguration und Defaults
Die Klasse MonitoringClientAutoConfiguration ist als `@AutoConfiguration` deklariert und aktiviert nur, wenn Actuator/Micrometer am Classpath sind (`@ConditionalOnClass`). Sie lädt `monitoring-defaults.properties` mit niedriger Priorität. Wichtige Defaults:
```
management.endpoints.web.exposure.include=health,info,prometheus
management.tracing.enabled=true
management.tracing.sampling.probability=${TRACING_SAMPLING_PROBABILITY:1.0}
management.observations.http.server.requests.enabled=true
management.info.env.enabled=true
management.zipkin.tracing.endpoint=http://zipkin:9411/api/v2/spans
```
Hinweise:
- Sampling-Rate: In Entwicklung 1.0 (100%). In Produktion sollte dies reduziert werden (z. B. 0.1).
- Endpunkte: Der Prometheus-Scrape-Pfad ist einheitlich `/actuator/prometheus`.
- Überschreibung: Jede Anwendung kann diese Werte über application.yml/-properties anpassen.
### monitoring-server
Eigenständiger Spring-Boot-Service, der den Zipkin-Server hostet. Durch die Zipkin-Server-Abhängigkeit erfolgt die Auto-Konfiguration; eine explizite `@EnableZipkinServer`-Annotation ist nicht erforderlich.
- Zweck: Empfang und UI-Visualisierung von Traces aus allen Services.
- Metriken: Der Server exportiert eigene Metriken (Prometheus-Registry eingebunden), sodass er ebenfalls von Prometheus gescraped werden kann.
## Zusammenspiel im System
1. Jeder Microservice bindet `:infrastructure:monitoring:monitoring-client` ein und exponiert `/actuator/prometheus`; Traces werden an Zipkin gesendet.
2. Der `:infrastructure:monitoring:monitoring-server` empfängt Traces und stellt die Zipkin UI bereit.
3. Prometheus (docker-compose) scraped periodisch alle `/actuator/prometheus`-Endpunkte und speichert Metriken.
4. Grafana (docker-compose) visualisiert Metriken/Dashboards.
Diese Kombination aus Micrometer, Prometheus, Zipkin und Grafana bildet einen gängigen Production-Stack.
## Verwendung in Services
- Abhängigkeit hinzufügen: `implementation(projects.infrastructure.monitoring.monitoringClient)` (über build.gradle.kts der Services).
- Optional: Eigene Tags setzen (z. B. `management.metrics.tags.application`, `environment`).
- Optional: Sampling-Rate via Umgebungsvariable `TRACING_SAMPLING_PROBABILITY` anpassen.
Beispiel (application.yml):
```yaml
management:
metrics:
tags:
application: ${spring.application.name}
environment: ${SPRING_PROFILES_ACTIVE:dev}
tracing:
sampling:
probability: 0.2
```
## Testing-Strategie (Tracer-Bullet-Zyklus)
- Monitoring-Server: Ein grundlegender Smoke-Test prüft erfolgreichen Start der Anwendung.
- Monitoring-Client: Keine dedizierten Unit-Tests; Validierung erfolgt End-to-End durch integrierte Services (Prometheus scrape, Zipkin-Empfang).
## Aktualitäts-Check (Repo-Stand September 2025)
- monitoring-client: AutoConfiguration (Deutsch kommentiert) und Defaults vorhanden; Endpunkt `management.zipkin.tracing.endpoint` verweist auf `zipkin:9411` und entspricht docker-compose.
- monitoring-server: Kotlin-Hauptklasse startet Zipkin-Server; Build-Datei bindet Actuator, Zipkin-Server und Prometheus-Registry ein.
- Gateway application.yml exponiert Actuator-Endpunkte inkl. prometheus; passt zum Monitoring-Ansatz.
## Optimierungen (September 2025)
- Dokumentation präzisiert und vollständig auf Deutsch gebracht; Zweck und Umsetzung klar herausgestellt.
- Kleine Konsistenz-Anpassung: Kommentare im Wurzel-Build-Skript des Monitoring-Moduls ins Deutsche übertragen.
- Empfehlung (optional, nicht zwingend umgesetzt):
- Standard-Tags global setzen (application, environment, instance) via `management.metrics.tags.*`.
- Falls Services eigene MeterRegistry konfigurieren: `@ConditionalOnMissingBean` beachten, um Kollisionen zu vermeiden.
---
Letzte Aktualisierung: 4. September 2025