Monitoring and alerting are essential parts of operating ABAP Cloud applications in production. While ABAP Environment Monitoring covers the basics, this article focuses on advanced techniques: custom metrics, health check endpoints, alert configuration, and the integration of external monitoring tools.
Monitoring Architecture Overview

A complete monitoring solution for ABAP Cloud spans several layers:
```
             ABAP Cloud Monitoring Architecture

┌──────────────────────────────────────────────────────────┐
│                External Monitoring Tools                 │
│          Datadog · Dynatrace · Splunk · Grafana          │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                    SAP Cloud Services                    │
│  SAP Cloud ALM:      Operations, Analytics, Alerting     │
│  SAP Cloud Logging:  Log Storage, Metrics Export,        │
│                      Dashboards                          │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                  ABAP Cloud Application                  │
│  Custom Metrics:     Business KPIs, Performance,         │
│                      Error Rates                         │
│  Health Check APIs:  Liveness, Readiness, Dependency     │
│  Application Logs:   BALI Framework, Structured Logs     │
│  Business Events:    RAP Events, Custom Events           │
└──────────────────────────────────────────────────────────┘
```

Custom Metrics with SAP Cloud Logging
Understanding Metric Types

| Metric Type | Description | Example |
|---|---|---|
| Counter | Counts events | Number of processed orders |
| Gauge | Point-in-time value | Current queue length |
| Histogram | Distribution of values | Response times |
| Timer | Execution duration | Processing time per batch |
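For orientation: if metrics like these are eventually scraped by a Prometheus-compatible backend, the first three types map onto the Prometheus text exposition format roughly as follows. The metric names are illustrative, and a timer is typically exported as a histogram or summary rather than a type of its own:

```text
# TYPE orders_processed_total counter
orders_processed_total{status="success"} 1027

# TYPE orders_pending_queue gauge
orders_pending_queue 42

# TYPE order_processing_duration_ms histogram
order_processing_duration_ms_bucket{le="100"} 930
order_processing_duration_ms_bucket{le="+Inf"} 1024
order_processing_duration_ms_sum 95432
order_processing_duration_ms_count 1024
```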
Collecting and Storing Metrics
```abap
" Custom metrics class for ABAP Cloud
CLASS zcl_custom_metrics DEFINITION PUBLIC FINAL CREATE PRIVATE.
  PUBLIC SECTION.
    CLASS-METHODS get_instance
      RETURNING VALUE(ro_instance) TYPE REF TO zcl_custom_metrics.

    TYPES:
      BEGIN OF ty_metric,
        name       TYPE string,
        value      TYPE decfloat34,
        unit       TYPE string,
        dimensions TYPE string_table,
        timestamp  TYPE timestamp,
      END OF ty_metric,
      ty_metrics TYPE STANDARD TABLE OF ty_metric WITH EMPTY KEY.

    METHODS:
      " Counter: counts events
      increment_counter
        IMPORTING iv_name   TYPE string
                  iv_value  TYPE i DEFAULT 1
                  it_labels TYPE string_table OPTIONAL,
      " Gauge: sets the current value
      set_gauge
        IMPORTING iv_name  TYPE string
                  iv_value TYPE decfloat34
                  iv_unit  TYPE string OPTIONAL,
      " Timer: measures execution duration
      start_timer
        IMPORTING iv_name            TYPE string
        RETURNING VALUE(rv_timer_id) TYPE guid_32,
      stop_timer
        IMPORTING iv_timer_id TYPE guid_32,
      " Export metrics (for external tools)
      export_metrics
        RETURNING VALUE(rv_json) TYPE string,
      " Write metrics to the application log
      flush_to_log
        RAISING cx_bali_runtime.

  PRIVATE SECTION.
    CLASS-DATA go_instance TYPE REF TO zcl_custom_metrics.

    TYPES:
      BEGIN OF ty_timer_entry,
        timer_id TYPE guid_32,
        name     TYPE string,
        start_ts TYPE timestamp,
      END OF ty_timer_entry.

    DATA:
      mt_metrics       TYPE ty_metrics,
      mt_active_timers TYPE HASHED TABLE OF ty_timer_entry WITH UNIQUE KEY timer_id.
ENDCLASS.
```
```abap
CLASS zcl_custom_metrics IMPLEMENTATION.

  METHOD get_instance.
    IF go_instance IS NOT BOUND.
      go_instance = NEW zcl_custom_metrics( ).
    ENDIF.
    ro_instance = go_instance.
  ENDMETHOD.

  METHOD increment_counter.
    GET TIME STAMP FIELD DATA(lv_timestamp).

    " Find an existing counter or create a new one
    DATA(lv_found) = abap_false.
    LOOP AT mt_metrics ASSIGNING FIELD-SYMBOL(<ls_metric>) WHERE name = iv_name.
      <ls_metric>-value     = <ls_metric>-value + iv_value.
      <ls_metric>-timestamp = lv_timestamp.
      lv_found = abap_true.
      EXIT.
    ENDLOOP.

    IF lv_found = abap_false.
      APPEND VALUE #( name       = iv_name
                      value      = iv_value
                      unit       = 'count'
                      dimensions = it_labels
                      timestamp  = lv_timestamp ) TO mt_metrics.
    ENDIF.
  ENDMETHOD.

  METHOD set_gauge.
    GET TIME STAMP FIELD DATA(lv_timestamp).

    " A gauge is always overwritten with the latest value
    READ TABLE mt_metrics ASSIGNING FIELD-SYMBOL(<ls_metric>) WITH KEY name = iv_name.
    IF sy-subrc = 0.
      <ls_metric>-value     = iv_value.
      <ls_metric>-unit      = COND #( WHEN iv_unit IS NOT INITIAL THEN iv_unit
                                      ELSE <ls_metric>-unit ).
      <ls_metric>-timestamp = lv_timestamp.
    ELSE.
      APPEND VALUE #( name      = iv_name
                      value     = iv_value
                      unit      = COND #( WHEN iv_unit IS NOT INITIAL THEN iv_unit
                                          ELSE 'unit' )
                      timestamp = lv_timestamp ) TO mt_metrics.
    ENDIF.
  ENDMETHOD.

  METHOD start_timer.
    " Generate a timer ID
    TRY.
        rv_timer_id = cl_system_uuid=>create_uuid_x16_static( ).
      CATCH cx_uuid_error.
        rv_timer_id = |TIMER_{ sy-datum }{ sy-uzeit }|.
    ENDTRY.

    GET TIME STAMP FIELD DATA(lv_start).

    INSERT VALUE #( timer_id = rv_timer_id
                    name     = iv_name
                    start_ts = lv_start ) INTO TABLE mt_active_timers.
  ENDMETHOD.

  METHOD stop_timer.
    READ TABLE mt_active_timers INTO DATA(ls_timer) WITH TABLE KEY timer_id = iv_timer_id.
    IF sy-subrc = 0.
      GET TIME STAMP FIELD DATA(lv_end).

      " Calculate the duration in milliseconds
      DATA(lv_duration_sec) = cl_abap_tstmp=>subtract( tstmp1 = lv_end
                                                       tstmp2 = ls_timer-start_ts ).
      DATA(lv_duration_ms)  = lv_duration_sec * 1000.

      " Store the duration as a metric
      APPEND VALUE #( name      = |{ ls_timer-name }_duration_ms|
                      value     = lv_duration_ms
                      unit      = 'milliseconds'
                      timestamp = lv_end ) TO mt_metrics.

      DELETE TABLE mt_active_timers WITH TABLE KEY timer_id = iv_timer_id.
    ENDIF.
  ENDMETHOD.

  METHOD export_metrics.
    " Export metrics as JSON (consumable by Prometheus/Datadog adapters)
    DATA(lt_export) = VALUE string_table( ).

    LOOP AT mt_metrics INTO DATA(ls_metric).
      DATA(lv_dimensions) = concat_lines_of( table = ls_metric-dimensions sep = ',' ).

      APPEND |{{ "name": "{ ls_metric-name }", | &&
             |"value": { ls_metric-value }, | &&
             |"unit": "{ ls_metric-unit }", | &&
             |"dimensions": "{ lv_dimensions }", | &&
             |"timestamp": { ls_metric-timestamp } }}| TO lt_export.
    ENDLOOP.

    rv_json = |[{ concat_lines_of( table = lt_export sep = ',' ) }]|.
  ENDMETHOD.

  METHOD flush_to_log.
    " Write metrics to the application log (BALI)
    DATA(lo_log) = cl_bali_log=>create_with_header(
      header = cl_bali_header_setter=>create(
        object      = 'ZMETRICS'
        subobject   = 'CUSTOM'
        external_id = |METRICS_{ sy-datum }_{ sy-uzeit }| ) ).

    LOOP AT mt_metrics INTO DATA(ls_metric).
      lo_log->add_item( cl_bali_message_setter=>create(
        severity = if_bali_constants=>c_severity_status
        text     = |{ ls_metric-name }: { ls_metric-value } { ls_metric-unit }| ) ).
    ENDLOOP.

    cl_bali_log_db=>get_instance( )->save_log( log = lo_log ).

    " Clear the metrics after export
    CLEAR mt_metrics.
  ENDMETHOD.

ENDCLASS.
```

Using the Metrics in Code
```abap
" Example: collecting metrics in a RAP application
CLASS zcl_order_processor DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    METHODS process_orders
      IMPORTING it_orders TYPE ty_order_tab
      RAISING   cx_failed.
ENDCLASS.

CLASS zcl_order_processor IMPLEMENTATION.

  METHOD process_orders.
    DATA(lo_metrics) = zcl_custom_metrics=>get_instance( ).

    " Start the timer
    DATA(lv_timer) = lo_metrics->start_timer( 'order_processing' ).

    " Process the orders
    DATA(lv_success_count) = 0.
    DATA(lv_error_count)   = 0.

    LOOP AT it_orders INTO DATA(ls_order).
      TRY.
          " Process a single order
          process_single_order( ls_order ).
          lv_success_count = lv_success_count + 1.

          " Increment the counter
          lo_metrics->increment_counter(
            iv_name   = 'orders_processed_total'
            it_labels = VALUE #( ( |status:success| )
                                 ( |type:{ ls_order-type }| ) ) ).

        CATCH cx_root INTO DATA(lx_error).
          lv_error_count = lv_error_count + 1.

          lo_metrics->increment_counter(
            iv_name   = 'orders_processed_total'
            it_labels = VALUE #( ( |status:error| )
                                 ( |type:{ ls_order-type }| ) ) ).
      ENDTRY.
    ENDLOOP.

    " Stop the timer
    lo_metrics->stop_timer( lv_timer ).

    " Current queue length as a gauge
    DATA(lv_pending) = get_pending_order_count( ).
    lo_metrics->set_gauge( iv_name  = 'orders_pending_queue'
                           iv_value = lv_pending
                           iv_unit  = 'orders' ).

    " Error rate as a gauge
    DATA(lv_error_rate) = COND decfloat34(
      WHEN lines( it_orders ) > 0 THEN lv_error_count * 100 / lines( it_orders )
      ELSE 0 ).
    lo_metrics->set_gauge( iv_name  = 'orders_error_rate_percent'
                           iv_value = lv_error_rate
                           iv_unit  = 'percent' ).

    " Write the metrics to the log
    TRY.
        lo_metrics->flush_to_log( ).
      CATCH cx_bali_runtime.
        " Ignore logging errors
    ENDTRY.
  ENDMETHOD.

ENDCLASS.
```

Health Check Endpoints
Health checks are essential for container orchestration and load balancers. In ABAP Cloud, they can be implemented as an OData service or an HTTP handler.
Health Check Types

| Check Type | Purpose | Frequency |
|---|---|---|
| Liveness | Application is running | Every 10-30 seconds |
| Readiness | Application can serve requests | Every 5-10 seconds |
| Startup | Application has started | Once at startup |
| Deep | All dependencies available | Every 1-5 minutes |
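The frequencies in the table match what a probing client would typically be configured with. ABAP Cloud itself is not orchestrated by the customer, but an external uptime monitor or a Kubernetes-based gateway in front of the service could consume the endpoints implemented below. As an illustrative sketch (paths match the handler in the next section; port and scheme are assumptions):

```yaml
# Illustrative Kubernetes-style probe configuration for the health endpoints
livenessProbe:
  httpGet:
    path: /health/live
    port: 443
    scheme: HTTPS
  periodSeconds: 30      # liveness: every 10-30 seconds
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 443
    scheme: HTTPS
  periodSeconds: 10      # readiness: every 5-10 seconds
startupProbe:
  httpGet:
    path: /health/ready
    port: 443
    scheme: HTTPS
  periodSeconds: 10
  failureThreshold: 30   # allow up to 30 x 10 s for startup
```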
Implementing a Health Check Service
```abap
" HTTP handler for health checks
CLASS zcl_health_check_handler DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_http_service_extension.

    TYPES:
      BEGIN OF ty_health_status,
        status    TYPE string,
        timestamp TYPE timestamp,
        version   TYPE string,
        checks    TYPE string_table,
      END OF ty_health_status.

    TYPES:
      BEGIN OF ty_check_result,
        name    TYPE string,
        status  TYPE string,
        message TYPE string,
        latency TYPE i,
      END OF ty_check_result,
      ty_check_results TYPE STANDARD TABLE OF ty_check_result WITH EMPTY KEY.

  PRIVATE SECTION.
    METHODS:
      check_liveness
        RETURNING VALUE(rs_result) TYPE ty_check_result,
      check_readiness
        RETURNING VALUE(rs_result) TYPE ty_check_result,
      check_database
        RETURNING VALUE(rs_result) TYPE ty_check_result,
      check_external_service
        IMPORTING iv_service_name  TYPE string
        RETURNING VALUE(rs_result) TYPE ty_check_result,
      run_all_checks
        RETURNING VALUE(rt_results) TYPE ty_check_results,
      build_response
        IMPORTING it_checks      TYPE ty_check_results
        RETURNING VALUE(rv_json) TYPE string.
ENDCLASS.

CLASS zcl_health_check_handler IMPLEMENTATION.

  METHOD if_http_service_extension~handle_request.
    " Evaluate the request path
    DATA(lv_path) = request->get_header_field( '~path_info' ).

    DATA(lt_checks)      = VALUE ty_check_results( ).
    DATA(lv_http_status) = 200.

    CASE lv_path.
      WHEN '/health/live' OR '/health/liveness'.
        " Liveness check only
        APPEND check_liveness( ) TO lt_checks.

      WHEN '/health/ready' OR '/health/readiness'.
        " Readiness: liveness + database
        APPEND check_liveness( ) TO lt_checks.
        APPEND check_database( ) TO lt_checks.

      WHEN '/health/deep' OR '/health'.
        " All checks
        lt_checks = run_all_checks( ).

      WHEN OTHERS.
        " Unknown path
        response->set_status( code = 404 reason = 'Not Found' ).
        response->set_text( '{"error": "Unknown health endpoint"}' ).
        RETURN.
    ENDCASE.

    " Determine the overall status
    LOOP AT lt_checks INTO DATA(ls_check) WHERE status <> 'healthy'.
      lv_http_status = 503. " Service Unavailable
      EXIT.
    ENDLOOP.

    " Build the response
    DATA(lv_json) = build_response( lt_checks ).

    response->set_status( code   = lv_http_status
                          reason = COND #( WHEN lv_http_status = 200 THEN 'OK'
                                           ELSE 'Service Unavailable' ) ).
    response->set_header_field( i_name = 'Content-Type' i_value = 'application/json' ).
    response->set_text( lv_json ).
  ENDMETHOD.

  METHOD check_liveness.
    " Simplest check: the application is running
    rs_result-name    = 'liveness'.
    rs_result-status  = 'healthy'.
    rs_result-message = 'Application is running'.
    rs_result-latency = 0.
  ENDMETHOD.

  METHOD check_readiness.
    " Check whether the application is ready to serve requests
    rs_result-name = 'readiness'.

    TRY.
        " Example: has the configuration been loaded?
        DATA(lv_config) = zcl_config_manager=>get_instance( )->is_initialized( ).

        IF lv_config = abap_true.
          rs_result-status  = 'healthy'.
          rs_result-message = 'Application is ready'.
        ELSE.
          rs_result-status  = 'unhealthy'.
          rs_result-message = 'Configuration not loaded'.
        ENDIF.

      CATCH cx_root INTO DATA(lx_error).
        rs_result-status  = 'unhealthy'.
        rs_result-message = lx_error->get_text( ).
    ENDTRY.
  ENDMETHOD.

  METHOD check_database.
    " Check database connectivity
    rs_result-name = 'database'.

    DATA(lv_start) = sy-uzeit.

    TRY.
        " Run a trivial query (in a strict ABAP Cloud scope,
        " substitute a released or application-owned table for t000)
        SELECT SINGLE @abap_true FROM t000 INTO @DATA(lv_result).

        IF sy-subrc = 0.
          rs_result-status  = 'healthy'.
          rs_result-message = 'Database connection OK'.
        ELSE.
          rs_result-status  = 'unhealthy'.
          rs_result-message = 'Database query returned no result'.
        ENDIF.

      CATCH cx_sy_open_sql_db INTO DATA(lx_db).
        rs_result-status  = 'unhealthy'.
        rs_result-message = |Database error: { lx_db->get_text( ) }|.
    ENDTRY.

    " Calculate the latency (simplified; not robust across midnight)
    DATA(lv_end) = sy-uzeit.
    rs_result-latency = lv_end - lv_start.
  ENDMETHOD.

  METHOD check_external_service.
    " Check an external service
    rs_result-name = iv_service_name.

    TRY.
        " HTTP call to the external service
        DATA(lo_destination) = cl_http_destination_provider=>create_by_cloud_destination(
          i_name = iv_service_name ).
        DATA(lo_client) = cl_web_http_client_manager=>create_by_http_destination(
          i_destination = lo_destination ).

        DATA(lo_request) = lo_client->get_http_request( ).
        lo_request->set_uri_path( '/health' ).

        DATA(lo_response) = lo_client->execute( if_web_http_client=>get ).
        DATA(lv_status)   = lo_response->get_status( )-code.

        lo_client->close( ).

        IF lv_status = 200.
          rs_result-status  = 'healthy'.
          rs_result-message = |{ iv_service_name } is available|.
        ELSE.
          rs_result-status  = 'degraded'.
          rs_result-message = |{ iv_service_name } returned status { lv_status }|.
        ENDIF.

      CATCH cx_root INTO DATA(lx_error).
        rs_result-status  = 'unhealthy'.
        rs_result-message = |{ iv_service_name }: { lx_error->get_text( ) }|.
    ENDTRY.
  ENDMETHOD.

  METHOD run_all_checks.
    " Run all health checks
    APPEND check_liveness( )  TO rt_results.
    APPEND check_readiness( ) TO rt_results.
    APPEND check_database( )  TO rt_results.

    " Check external services (ideally read from configuration)
    DATA(lt_services) = VALUE string_table( ( `S4_BACKEND` ) ( `DOCUMENT_SERVICE` ) ).

    LOOP AT lt_services INTO DATA(lv_service).
      APPEND check_external_service( lv_service ) TO rt_results.
    ENDLOOP.
  ENDMETHOD.

  METHOD build_response.
    " Build the JSON response
    DATA(lv_overall_status) = `healthy`.
    DATA(lt_check_json)     = VALUE string_table( ).

    LOOP AT it_checks INTO DATA(ls_check).
      IF ls_check-status <> 'healthy'.
        lv_overall_status = ls_check-status.
      ENDIF.

      APPEND |{{ "name": "{ ls_check-name }", | &&
             |"status": "{ ls_check-status }", | &&
             |"message": "{ ls_check-message }", | &&
             |"latency_ms": { ls_check-latency } }}| TO lt_check_json.
    ENDLOOP.

    GET TIME STAMP FIELD DATA(lv_timestamp).

    rv_json = |{{ "status": "{ lv_overall_status }", | &&
              |"timestamp": "{ lv_timestamp }", | &&
              |"checks": [{ concat_lines_of( table = lt_check_json sep = ',' ) }] }}|.
  ENDMETHOD.

ENDCLASS.
```

Health Check Response Example
```json
{
  "status": "healthy",
  "timestamp": "20260214143022",
  "checks": [
    { "name": "liveness",   "status": "healthy", "message": "Application is running", "latency_ms": 0 },
    { "name": "database",   "status": "healthy", "message": "Database connection OK", "latency_ms": 5 },
    { "name": "S4_BACKEND", "status": "healthy", "message": "S4_BACKEND is available", "latency_ms": 120 }
  ]
}
```

Configuring Alerts
Alert Concept

Alerts should be classified by urgency:

| Severity | Description | Response Time | Notification |
|---|---|---|---|
| Critical | System unavailable | Immediately | PagerDuty, SMS |
| High | Functionality impaired | < 1 hour | Email + Slack |
| Medium | Performance problems | < 4 hours | Email |
| Low | Informational | Next business day | Ticket |
Implementing an Alert Manager
```abap
" Alert manager for ABAP Cloud
CLASS zcl_alert_manager DEFINITION PUBLIC FINAL CREATE PRIVATE.
  PUBLIC SECTION.
    CLASS-METHODS get_instance
      RETURNING VALUE(ro_instance) TYPE REF TO zcl_alert_manager.

    TYPES:
      BEGIN OF ENUM ty_severity STRUCTURE severity,
        critical,
        high,
        medium,
        low,
      END OF ENUM ty_severity STRUCTURE severity.

    TYPES:
      BEGIN OF ty_alert,
        id          TYPE guid_32,
        name        TYPE string,
        severity    TYPE ty_severity,
        message     TYPE string,
        source      TYPE string,
        timestamp   TYPE timestamp,
        resolved    TYPE abap_bool,
        resolved_at TYPE timestamp,
      END OF ty_alert,
      ty_alerts TYPE STANDARD TABLE OF ty_alert WITH KEY id.

    TYPES:
      BEGIN OF ty_alert_rule,
        name      TYPE string,
        condition TYPE string,
        threshold TYPE decfloat34,
        severity  TYPE ty_severity,
        cooldown  TYPE i, " seconds between alerts
        enabled   TYPE abap_bool,
      END OF ty_alert_rule,
      ty_alert_rules TYPE STANDARD TABLE OF ty_alert_rule WITH KEY name.

    METHODS:
      " Fire an alert
      fire_alert
        IMPORTING iv_name     TYPE string
                  iv_severity TYPE ty_severity
                  iv_message  TYPE string
                  iv_source   TYPE string OPTIONAL,
      " Resolve an alert
      resolve_alert
        IMPORTING iv_alert_id TYPE guid_32,
      " Retrieve the active alerts
      get_active_alerts
        RETURNING VALUE(rt_alerts) TYPE ty_alerts,
      " Check a metric against its rule and fire an alert if needed
      check_threshold
        IMPORTING iv_metric_name TYPE string
                  iv_value       TYPE decfloat34,
      " Register an alert rule
      register_rule
        IMPORTING is_rule TYPE ty_alert_rule.

  PRIVATE SECTION.
    CLASS-DATA go_instance TYPE REF TO zcl_alert_manager.

    TYPES:
      BEGIN OF ty_last_fired,
        name     TYPE string,
        fired_at TYPE timestamp,
      END OF ty_last_fired.

    DATA:
      mt_active_alerts TYPE ty_alerts,
      mt_rules         TYPE ty_alert_rules,
      " Last firing time per rule, used for the cooldown check
      mt_last_fired    TYPE HASHED TABLE OF ty_last_fired WITH UNIQUE KEY name.

    METHODS:
      send_notification
        IMPORTING is_alert TYPE ty_alert,
      log_alert
        IMPORTING is_alert TYPE ty_alert.
ENDCLASS.

CLASS zcl_alert_manager IMPLEMENTATION.

  METHOD get_instance.
    IF go_instance IS NOT BOUND.
      go_instance = NEW zcl_alert_manager( ).
    ENDIF.
    ro_instance = go_instance.
  ENDMETHOD.

  METHOD fire_alert.
    " Generate an alert ID
    DATA lv_alert_id TYPE guid_32.
    TRY.
        lv_alert_id = cl_system_uuid=>create_uuid_x16_static( ).
      CATCH cx_uuid_error.
        lv_alert_id = |ALERT_{ sy-datum }{ sy-uzeit }|.
    ENDTRY.

    GET TIME STAMP FIELD DATA(lv_timestamp).

    DATA(ls_alert) = VALUE ty_alert(
      id        = lv_alert_id
      name      = iv_name
      severity  = iv_severity
      message   = iv_message
      source    = COND #( WHEN iv_source IS NOT INITIAL THEN iv_source ELSE sy-repid )
      timestamp = lv_timestamp
      resolved  = abap_false ).

    " Add to the active alerts
    APPEND ls_alert TO mt_active_alerts.

    " Log the alert
    log_alert( ls_alert ).

    " Send a notification (depending on severity)
    send_notification( ls_alert ).
  ENDMETHOD.

  METHOD resolve_alert.
    READ TABLE mt_active_alerts ASSIGNING FIELD-SYMBOL(<ls_alert>) WITH KEY id = iv_alert_id.
    IF sy-subrc = 0.
      GET TIME STAMP FIELD <ls_alert>-resolved_at.
      <ls_alert>-resolved = abap_true.

      " Optional: notify about the resolution
      DATA(ls_resolved_alert) = <ls_alert>.
      ls_resolved_alert-message = |RESOLVED: { <ls_alert>-message }|.
      send_notification( ls_resolved_alert ).
    ENDIF.
  ENDMETHOD.

  METHOD get_active_alerts.
    LOOP AT mt_active_alerts INTO DATA(ls_alert) WHERE resolved = abap_false.
      APPEND ls_alert TO rt_alerts.
    ENDLOOP.
  ENDMETHOD.

  METHOD check_threshold.
    " Look up the rule for this metric
    READ TABLE mt_rules INTO DATA(ls_rule) WITH KEY name = iv_metric_name.
    IF sy-subrc <> 0 OR ls_rule-enabled = abap_false.
      RETURN.
    ENDIF.

    " Evaluate the threshold
    DATA(lv_exceeded) = abap_false.
    CASE ls_rule-condition.
      WHEN 'GT'. lv_exceeded = xsdbool( iv_value >  ls_rule-threshold ).
      WHEN 'GE'. lv_exceeded = xsdbool( iv_value >= ls_rule-threshold ).
      WHEN 'LT'. lv_exceeded = xsdbool( iv_value <  ls_rule-threshold ).
      WHEN 'LE'. lv_exceeded = xsdbool( iv_value <= ls_rule-threshold ).
      WHEN 'EQ'. lv_exceeded = xsdbool( iv_value =  ls_rule-threshold ).
    ENDCASE.

    IF lv_exceeded = abap_false.
      RETURN.
    ENDIF.

    " Cooldown check: suppress repeated alerts within the cooldown window
    GET TIME STAMP FIELD DATA(lv_now).
    READ TABLE mt_last_fired INTO DATA(ls_last) WITH TABLE KEY name = iv_metric_name.
    IF sy-subrc = 0 AND
       cl_abap_tstmp=>subtract( tstmp1 = lv_now
                                tstmp2 = ls_last-fired_at ) < ls_rule-cooldown.
      RETURN.
    ENDIF.

    fire_alert(
      iv_name     = iv_metric_name
      iv_severity = ls_rule-severity
      iv_message  = |{ iv_metric_name }: { iv_value } { ls_rule-condition } { ls_rule-threshold }| ).

    " Remember when this rule last fired
    DELETE TABLE mt_last_fired WITH TABLE KEY name = iv_metric_name.
    INSERT VALUE #( name = iv_metric_name fired_at = lv_now ) INTO TABLE mt_last_fired.
  ENDMETHOD.

  METHOD register_rule.
    " Update an existing rule or add a new one
    READ TABLE mt_rules ASSIGNING FIELD-SYMBOL(<ls_rule>) WITH KEY name = is_rule-name.
    IF sy-subrc = 0.
      <ls_rule> = is_rule.
    ELSE.
      APPEND is_rule TO mt_rules.
    ENDIF.
  ENDMETHOD.

  METHOD send_notification.
    " Send a notification depending on severity
    CASE is_alert-severity.
      WHEN severity-critical.
        " PagerDuty / SMS
        " HTTP call to the PagerDuty API, e.g.:
        " send_pagerduty_alert( is_alert ).

      WHEN severity-high.
        " Email + Slack
        " send_slack_notification( is_alert ).
        " send_email( is_alert ).

      WHEN severity-medium.
        " Email only
        " send_email( is_alert ).

      WHEN severity-low.
        " Log only (already done in log_alert)
    ENDCASE.

    " Additionally: SAP Cloud ALM integration (if configured)
  ENDMETHOD.

  METHOD log_alert.
    " Write the alert to the application log
    TRY.
        DATA(lo_log) = cl_bali_log=>create_with_header(
          header = cl_bali_header_setter=>create(
            object      = 'ZALERT'
            subobject   = CONV #( is_alert-severity )
            external_id = |ALERT_{ is_alert-id }| ) ).

        DATA(lv_severity) = SWITCH if_bali_constants=>ty_severity( is_alert-severity
          WHEN severity-critical THEN if_bali_constants=>c_severity_error
          WHEN severity-high     THEN if_bali_constants=>c_severity_error
          WHEN severity-medium   THEN if_bali_constants=>c_severity_warning
          WHEN severity-low      THEN if_bali_constants=>c_severity_status ).

        lo_log->add_item( cl_bali_message_setter=>create(
          severity = lv_severity
          text     = |[{ is_alert-name }] { is_alert-message }| ) ).

        cl_bali_log_db=>get_instance( )->save_log( log = lo_log ).

      CATCH cx_bali_runtime.
        " Ignore logging errors
    ENDTRY.
  ENDMETHOD.

ENDCLASS.
```

Registering Alert Rules
```abap
" Register alert rules at system startup
CLASS zcl_alert_config DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    CLASS-METHODS initialize.
ENDCLASS.

CLASS zcl_alert_config IMPLEMENTATION.

  METHOD initialize.
    DATA(lo_alerts) = zcl_alert_manager=>get_instance( ).

    " CPU usage
    lo_alerts->register_rule( VALUE #(
      name      = 'cpu_usage_percent'
      condition = 'GT'
      threshold = 85
      severity  = zcl_alert_manager=>severity-high
      cooldown  = 300 " 5 minutes
      enabled   = abap_true ) ).

    " Memory usage
    lo_alerts->register_rule( VALUE #(
      name      = 'memory_usage_percent'
      condition = 'GT'
      threshold = 90
      severity  = zcl_alert_manager=>severity-critical
      cooldown  = 60 " 1 minute
      enabled   = abap_true ) ).

    " Error rate
    lo_alerts->register_rule( VALUE #(
      name      = 'orders_error_rate_percent'
      condition = 'GT'
      threshold = 5
      severity  = zcl_alert_manager=>severity-medium
      cooldown  = 600 " 10 minutes
      enabled   = abap_true ) ).

    " Queue length
    lo_alerts->register_rule( VALUE #(
      name      = 'orders_pending_queue'
      condition = 'GT'
      threshold = 1000
      severity  = zcl_alert_manager=>severity-high
      cooldown  = 180 " 3 minutes
      enabled   = abap_true ) ).

    " Response time
    lo_alerts->register_rule( VALUE #(
      name      = 'api_response_time_ms'
      condition = 'GT'
      threshold = 2000 " 2 seconds
      severity  = zcl_alert_manager=>severity-medium
      cooldown  = 120 " 2 minutes
      enabled   = abap_true ) ).
  ENDMETHOD.

ENDCLASS.
```

Creating a Dashboard
Monitoring CDS View
```sql
-- Monitoring dashboard view
-- (no sqlViewName annotation: view entities do not use a generated SQL view)
@Analytics.dataCategory: #CUBE
@AccessControl.authorizationCheck: #NOT_REQUIRED
define view entity ZI_MonitoringDashboard
  as select from zbali_log as Log
{
  key Log.log_handle  as LogHandle,
      Log.external_id as ExternalId,
      Log.object      as LogObject,
      Log.subobject   as LogSubobject,

      @Semantics.systemDateTime.createdAt: true
      Log.log_timestamp as LogTimestamp,

      cast( substring( Log.log_timestamp, 1, 8 ) as abap.dats ) as LogDate,
      cast( substring( Log.log_timestamp, 9, 6 ) as abap.tims ) as LogTime,

      case Log.max_severity
        when 'E' then 'Error'
        when 'W' then 'Warning'
        when 'I' then 'Info'
        when 'S' then 'Success'
        else 'Unknown'
      end as SeverityText,

      case Log.max_severity
        when 'E' then 1
        when 'W' then 2
        when 'I' then 3
        when 'S' then 4
        else 5
      end as SeverityOrder,

      -- Aggregations
      @Aggregation.default: #COUNT
      cast( 1 as abap.int4 ) as LogCount,

      @Aggregation.default: #SUM
      case Log.max_severity when 'E' then 1 else 0 end as ErrorCount,

      @Aggregation.default: #SUM
      case Log.max_severity when 'W' then 1 else 0 end as WarningCount
}
where
  -- only logs of the last 7 days (compare on the date part of the timestamp)
  cast( substring( Log.log_timestamp, 1, 8 ) as abap.dats )
    >= dats_add_days( $session.system_date, -7, 'NULL' )
```

Dashboard Report Class
```abap
" Provide dashboard data for the UI
CLASS zcl_monitoring_dashboard DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    TYPES:
      BEGIN OF ty_dashboard_data,
        total_requests  TYPE i,
        error_count     TYPE i,
        warning_count   TYPE i,
        avg_response_ms TYPE decfloat34,
        uptime_percent  TYPE decfloat34,
        active_alerts   TYPE i,
        last_updated    TYPE timestamp,
      END OF ty_dashboard_data.

    TYPES:
      BEGIN OF ty_trend_point,
        timestamp TYPE timestamp,
        value     TYPE decfloat34,
      END OF ty_trend_point,
      ty_trend_data TYPE STANDARD TABLE OF ty_trend_point WITH EMPTY KEY.

    METHODS:
      get_dashboard_data
        RETURNING VALUE(rs_data) TYPE ty_dashboard_data,
      get_error_trend
        IMPORTING iv_hours       TYPE i DEFAULT 24
        RETURNING VALUE(rt_data) TYPE ty_trend_data,
      get_response_time_trend
        IMPORTING iv_hours       TYPE i DEFAULT 24
        RETURNING VALUE(rt_data) TYPE ty_trend_data.
ENDCLASS.

CLASS zcl_monitoring_dashboard IMPLEMENTATION.

  METHOD get_dashboard_data.
    GET TIME STAMP FIELD rs_data-last_updated.

    " Aggregate the logs of the last 24 hours
    " (use subtractsecs - plain arithmetic on packed timestamps is not valid)
    DATA(lv_from) = cl_abap_tstmp=>subtractsecs( tstmp = rs_data-last_updated
                                                 secs  = 86400 ).

    SELECT COUNT(*) AS total,
           SUM( CASE max_severity WHEN 'E' THEN 1 ELSE 0 END ) AS errors,
           SUM( CASE max_severity WHEN 'W' THEN 1 ELSE 0 END ) AS warnings
      FROM zbali_log
      WHERE log_timestamp >= @lv_from
      INTO ( @rs_data-total_requests, @rs_data-error_count, @rs_data-warning_count ).

    " Count the active alerts
    DATA(lt_alerts) = zcl_alert_manager=>get_instance( )->get_active_alerts( ).
    rs_data-active_alerts = lines( lt_alerts ).

    " Calculate uptime (simplified: 100% minus error percentage)
    rs_data-uptime_percent = COND #(
      WHEN rs_data-total_requests > 0
      THEN 100 - ( rs_data-error_count * 100 / rs_data-total_requests )
      ELSE 100 ).

    " Average response time from the collected metrics
    " (implementation depends on how metrics are persisted)
    rs_data-avg_response_ms = 150. " placeholder value
  ENDMETHOD.

  METHOD get_error_trend.
    " Error trend per hour
    GET TIME STAMP FIELD DATA(lv_now).
    DATA(lv_start) = cl_abap_tstmp=>subtractsecs( tstmp = lv_now
                                                  secs  = iv_hours * 3600 ).

    SELECT log_timestamp
      FROM zbali_log
      WHERE log_timestamp >= @lv_start
        AND max_severity = 'E'
      INTO TABLE @DATA(lt_errors).

    " Aggregate per hour by truncating the timestamp to YYYYMMDDHH
    TYPES ty_hour TYPE c LENGTH 10.
    TYPES:
      BEGIN OF ty_hour_count,
        hour_ts TYPE ty_hour,
        count   TYPE i,
      END OF ty_hour_count.
    DATA lt_hourly TYPE SORTED TABLE OF ty_hour_count WITH UNIQUE KEY hour_ts.

    LOOP AT lt_errors INTO DATA(ls_error).
      DATA(lv_hour) = CONV ty_hour( |{ ls_error-log_timestamp }| ).
      READ TABLE lt_hourly ASSIGNING FIELD-SYMBOL(<ls_hour>) WITH TABLE KEY hour_ts = lv_hour.
      IF sy-subrc = 0.
        <ls_hour>-count = <ls_hour>-count + 1.
      ELSE.
        INSERT VALUE #( hour_ts = lv_hour count = 1 ) INTO TABLE lt_hourly.
      ENDIF.
    ENDLOOP.

    LOOP AT lt_hourly INTO DATA(ls_hour).
      APPEND VALUE #( timestamp = CONV timestamp( ls_hour-hour_ts && '0000' )
                      value     = ls_hour-count ) TO rt_data.
    ENDLOOP.
  ENDMETHOD.

  METHOD get_response_time_trend.
    " Response time trend
    " (implementation depends on how metrics are persisted)
    " Example: read the data from a custom metrics table
  ENDMETHOD.

ENDCLASS.
```

Integration with External Tools
Datadog Integration
Datadog is a popular cloud monitoring tool. The integration works via the Datadog API.
```abap
" Datadog integration for ABAP Cloud
CLASS zcl_datadog_exporter DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    METHODS:
      constructor
        IMPORTING iv_api_key TYPE string
                  iv_app_key TYPE string OPTIONAL
                  iv_site    TYPE string DEFAULT 'datadoghq.com',
      send_metric
        IMPORTING iv_name  TYPE string
                  iv_value TYPE decfloat34
                  iv_type  TYPE string DEFAULT 'gauge' " gauge, count, rate
                  it_tags  TYPE string_table OPTIONAL
        RAISING   cx_web_http_client_error,
      send_event
        IMPORTING iv_title TYPE string
                  iv_text  TYPE string
                  iv_type  TYPE string DEFAULT 'info' " info, warning, error
                  it_tags  TYPE string_table OPTIONAL
        RAISING   cx_web_http_client_error,
      send_log
        IMPORTING iv_message TYPE string
                  iv_level   TYPE string DEFAULT 'INFO'
                  iv_service TYPE string DEFAULT 'abap-cloud'
                  it_tags    TYPE string_table OPTIONAL
        RAISING   cx_web_http_client_error.

  PRIVATE SECTION.
    DATA:
      mv_api_key TYPE string,
      mv_app_key TYPE string,
      mv_site    TYPE string.

    METHODS send_request
      IMPORTING iv_endpoint TYPE string
                iv_body     TYPE string
      RAISING   cx_web_http_client_error.
ENDCLASS.

CLASS zcl_datadog_exporter IMPLEMENTATION.

  METHOD constructor.
    mv_api_key = iv_api_key.
    mv_app_key = iv_app_key.
    mv_site    = iv_site.
  ENDMETHOD.

  METHOD send_metric.
    " The Datadog v1 series format expects Unix epoch seconds,
    " so convert the ABAP timestamp (UTC) accordingly
    GET TIME STAMP FIELD DATA(lv_timestamp).
    DATA(lv_unix_ts) = CONV i( cl_abap_tstmp=>subtract(
      tstmp1 = lv_timestamp
      tstmp2 = CONV timestamp( '19700101000000' ) ) ).

    " Format the tags as a JSON string array
    DATA(lv_tags) = concat_lines_of( table = it_tags sep = '","' ).
    IF lv_tags IS NOT INITIAL.
      lv_tags = |"{ lv_tags }"|.
    ENDIF.

    DATA(lv_body) = |{{ "series": [{{ | &&
                    |"metric": "abap.{ iv_name }", | &&
                    |"type": "{ iv_type }", | &&
                    |"points": [[{ lv_unix_ts }, { iv_value }]], | &&
                    |"tags": [{ lv_tags }] }}] }}|.

    send_request( iv_endpoint = '/api/v1/series' iv_body = lv_body ).
  ENDMETHOD.

  METHOD send_event.
    " Datadog Events API
    DATA(lv_tags) = concat_lines_of( table = it_tags sep = '","' ).
    IF lv_tags IS NOT INITIAL.
      lv_tags = |"{ lv_tags }"|.
    ENDIF.

    DATA(lv_body) = |{{ "title": "{ iv_title }", | &&
                    |"text": "{ iv_text }", | &&
                    |"alert_type": "{ iv_type }", | &&
                    |"source_type_name": "abap_cloud", | &&
                    |"tags": [{ lv_tags }] }}|.

    send_request( iv_endpoint = '/api/v1/events' iv_body = lv_body ).
  ENDMETHOD.

  METHOD send_log.
    " Datadog Logs API
    DATA(lv_tags) = concat_lines_of( table = it_tags sep = ',' ).

    DATA(lv_body) = |{{ "message": "{ iv_message }", | &&
                    |"ddsource": "abap", | &&
                    |"ddtags": "{ lv_tags }", | &&
                    |"service": "{ iv_service }", | &&
                    |"status": "{ iv_level }" }}|.

    send_request( iv_endpoint = '/api/v2/logs' iv_body = lv_body ).
  ENDMETHOD.

  METHOD send_request.
    DATA(lv_url) = |https://api.{ mv_site }{ iv_endpoint }|.

    DATA(lo_destination) = cl_http_destination_provider=>create_by_url( i_url = lv_url ).
    DATA(lo_client) = cl_web_http_client_manager=>create_by_http_destination(
      i_destination = lo_destination ).

    DATA(lo_request) = lo_client->get_http_request( ).
    lo_request->set_header_field( i_name = 'Content-Type' i_value = 'application/json' ).
    lo_request->set_header_field( i_name = 'DD-API-KEY' i_value = mv_api_key ).

    IF mv_app_key IS NOT INITIAL.
      lo_request->set_header_field( i_name = 'DD-APPLICATION-KEY' i_value = mv_app_key ).
    ENDIF.

    lo_request->set_text( iv_body ).

    DATA(lo_response) = lo_client->execute( if_web_http_client=>post ).
    DATA(lv_status)   = lo_response->get_status( )-code.

    lo_client->close( ).

    IF lv_status < 200 OR lv_status >= 300.
      " Signal a non-2xx response to the caller
      RAISE EXCEPTION TYPE cx_web_http_client_error.
    ENDIF.
  ENDMETHOD.

ENDCLASS.
```

Using the Datadog Integration
```abap
" Using the Datadog exporter
DATA(lo_datadog) = NEW zcl_datadog_exporter(
  iv_api_key = 'YOUR_API_KEY'
  iv_site    = 'datadoghq.eu' ). " EU region

TRY.
    " Send a metric
    lo_datadog->send_metric(
      iv_name  = 'order_processing_time_ms'
      iv_value = 250
      iv_type  = 'gauge'
      it_tags  = VALUE #( ( `env:production` ) ( `service:orders` ) ) ).

    " Send an event
    lo_datadog->send_event(
      iv_title = 'Deployment completed'
      iv_text  = 'New version deployed to production'
      iv_type  = 'info'
      it_tags  = VALUE #( ( `version:1.2.3` ) ) ).

    " Send a log entry
    lo_datadog->send_log(
      iv_message = 'Order 12345 processed successfully'
      iv_level   = 'INFO'
      iv_service = 'order-service'
      it_tags    = VALUE #( ( `order_id:12345` ) ) ).

  CATCH cx_web_http_client_error INTO DATA(lx_error).
    " Sending failed - log locally instead
ENDTRY.
```

Webhook-Based Alerting
" Generischer Webhook Sender fuer AlertingCLASS zcl_webhook_alerter DEFINITION PUBLIC FINAL CREATE PUBLIC.
PUBLIC SECTION. METHODS: send_slack_alert IMPORTING iv_webhook_url TYPE string iv_message TYPE string iv_severity TYPE string DEFAULT 'warning' RAISING cx_web_http_client_error.
METHODS: send_teams_alert IMPORTING iv_webhook_url TYPE string iv_title TYPE string iv_message TYPE string iv_color TYPE string DEFAULT 'FF9900' RAISING cx_web_http_client_error.
METHODS: send_pagerduty_alert IMPORTING iv_routing_key TYPE string iv_summary TYPE string iv_severity TYPE string DEFAULT 'warning' iv_source TYPE string RAISING cx_web_http_client_error.
ENDCLASS.
CLASS zcl_webhook_alerter IMPLEMENTATION.

  METHOD send_slack_alert.
    DATA(lv_emoji) = SWITCH string( iv_severity
      WHEN 'critical' THEN ':rotating_light:'
      WHEN 'error'    THEN ':x:'
      WHEN 'warning'  THEN ':warning:'
      ELSE ':information_source:' ).

    DATA(lv_body) = |{{ | &&
                    |"text": "{ lv_emoji } *ABAP Cloud Alert*\\n{ iv_message }" | &&
                    |}}|.

    DATA(lo_destination) = cl_http_destination_provider=>create_by_url(
      i_url = iv_webhook_url ).
    DATA(lo_client) = cl_web_http_client_manager=>create_by_http_destination(
      i_destination = lo_destination ).

    DATA(lo_request) = lo_client->get_http_request( ).
    lo_request->set_header_field( i_name  = 'Content-Type'
                                  i_value = 'application/json' ).
    lo_request->set_text( lv_body ).

    DATA(lo_response) = lo_client->execute( if_web_http_client=>post ).
    DATA(lv_status) = lo_response->get_status( )-code.
    lo_client->close( ).

    " Do not discard the response - raise on any non-2xx status
    IF lv_status < 200 OR lv_status >= 300.
      RAISE EXCEPTION TYPE cx_web_http_client_error
        EXPORTING http_status_code = lv_status.
    ENDIF.
  ENDMETHOD.

  METHOD send_teams_alert.
    DATA(lv_body) = |{{ | &&
                    |"@type": "MessageCard", | &&
                    |"@context": "http://schema.org/extensions", | &&
                    |"themeColor": "{ iv_color }", | &&
                    |"summary": "{ iv_title }", | &&
                    |"sections": [{{ | &&
                    |"activityTitle": "{ iv_title }", | &&
                    |"text": "{ iv_message }" | &&
                    |}}] | &&
                    |}}|.

    DATA(lo_destination) = cl_http_destination_provider=>create_by_url(
      i_url = iv_webhook_url ).
    DATA(lo_client) = cl_web_http_client_manager=>create_by_http_destination(
      i_destination = lo_destination ).

    DATA(lo_request) = lo_client->get_http_request( ).
    lo_request->set_header_field( i_name  = 'Content-Type'
                                  i_value = 'application/json' ).
    lo_request->set_text( lv_body ).

    DATA(lo_response) = lo_client->execute( if_web_http_client=>post ).
    DATA(lv_status) = lo_response->get_status( )-code.
    lo_client->close( ).

    IF lv_status < 200 OR lv_status >= 300.
      RAISE EXCEPTION TYPE cx_web_http_client_error
        EXPORTING http_status_code = lv_status.
    ENDIF.
  ENDMETHOD.

  METHOD send_pagerduty_alert.
    DATA(lv_body) = |{{ | &&
                    |"routing_key": "{ iv_routing_key }", | &&
                    |"event_action": "trigger", | &&
                    |"payload": {{ | &&
                    |"summary": "{ iv_summary }", | &&
                    |"severity": "{ iv_severity }", | &&
                    |"source": "{ iv_source }", | &&
                    |"component": "abap-cloud" | &&
                    |}} | &&
                    |}}|.

    DATA(lo_destination) = cl_http_destination_provider=>create_by_url(
      i_url = 'https://events.pagerduty.com/v2/enqueue' ).
    DATA(lo_client) = cl_web_http_client_manager=>create_by_http_destination(
      i_destination = lo_destination ).

    DATA(lo_request) = lo_client->get_http_request( ).
    lo_request->set_header_field( i_name  = 'Content-Type'
                                  i_value = 'application/json' ).
    lo_request->set_text( lv_body ).

    DATA(lo_response) = lo_client->execute( if_web_http_client=>post ).
    DATA(lv_status) = lo_response->get_status( )-code.
    lo_client->close( ).

    IF lv_status < 200 OR lv_status >= 300.
      RAISE EXCEPTION TYPE cx_web_http_client_error
        EXPORTING http_status_code = lv_status.
    ENDIF.
  ENDMETHOD.
ENDCLASS.

Best Practices for SLA Monitoring
SLA Definitions
| Metric | Target | Measurement |
|---|---|---|
| Availability | 99.9% | Uptime / total time |
| Response time | P95 < 2 s | 95th percentile of response times |
| Error rate | < 0.1% | Errors / total requests |
| MTTR | < 1 hour | Mean time to recovery |
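The measurement column can be made concrete. The following Python sketch computes the three ratios from raw samples; the nearest-rank definition of the percentile is one common choice among several:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at
    least p percent of the sorted samples are <= it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

def availability(uptime_s, total_s):
    # Uptime / total time, as a percentage
    return uptime_s / total_s * 100

def error_rate(errors, total_requests):
    # Errors / total requests, as a percentage; guard against /0
    return errors / total_requests * 100 if total_requests else 0.0

# Illustrative response-time samples in milliseconds
response_ms = [120, 180, 200, 250, 300, 320, 400, 450, 500, 2600]
p95 = percentile(response_ms, 95)  # rank ceil(9.5) = 10 -> 2600
```

Note how a single outlier dominates the P95 here while barely moving the mean, which is exactly why the table targets the percentile rather than the average.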
Implementing an SLA Monitor
" SLA Monitoring KlasseCLASS zcl_sla_monitor DEFINITION PUBLIC FINAL CREATE PUBLIC.
PUBLIC SECTION. TYPES: BEGIN OF ty_sla_status, availability_percent TYPE decfloat34, avg_response_ms TYPE decfloat34, p95_response_ms TYPE decfloat34, error_rate_percent TYPE decfloat34, sla_compliant TYPE abap_bool, violations TYPE string_table, END OF ty_sla_status.
METHODS: check_sla_compliance IMPORTING iv_period_hours TYPE i DEFAULT 24 RETURNING VALUE(rs_status) TYPE ty_sla_status.
METHODS: generate_sla_report IMPORTING iv_period_days TYPE i DEFAULT 30 RETURNING VALUE(rv_report) TYPE string.
ENDCLASS.
CLASS zcl_sla_monitor IMPLEMENTATION.
METHOD check_sla_compliance. " SLA-Ziele (normalerweise aus Konfiguration) DATA(lv_availability_target) = CONV decfloat34( '99.9' ). DATA(lv_response_target_ms) = 2000. DATA(lv_error_rate_target) = CONV decfloat34( '0.1' ).
" Metriken sammeln DATA(lo_dashboard) = NEW zcl_monitoring_dashboard( ). DATA(ls_data) = lo_dashboard->get_dashboard_data( ).
" Availability berechnen rs_status-availability_percent = ls_data-uptime_percent.
" Response Time (aus Metriken) rs_status-avg_response_ms = ls_data-avg_response_ms. rs_status-p95_response_ms = ls_data-avg_response_ms * CONV decfloat34( '1.5' ). " Vereinfacht
" Error Rate rs_status-error_rate_percent = COND #( WHEN ls_data-total_requests > 0 THEN ls_data-error_count * 100 / ls_data-total_requests ELSE 0 ).
" SLA-Verletzungen pruefen rs_status-sla_compliant = abap_true.
IF rs_status-availability_percent < lv_availability_target. APPEND |Availability { rs_status-availability_percent }% < { lv_availability_target }%| TO rs_status-violations. rs_status-sla_compliant = abap_false. ENDIF.
IF rs_status-p95_response_ms > lv_response_target_ms. APPEND |P95 Response { rs_status-p95_response_ms }ms > { lv_response_target_ms }ms| TO rs_status-violations. rs_status-sla_compliant = abap_false. ENDIF.
IF rs_status-error_rate_percent > lv_error_rate_target. APPEND |Error Rate { rs_status-error_rate_percent }% > { lv_error_rate_target }%| TO rs_status-violations. rs_status-sla_compliant = abap_false. ENDIF. ENDMETHOD.
METHOD generate_sla_report. DATA(ls_status) = check_sla_compliance( iv_period_hours = iv_period_days * 24 ).
rv_report = |=== SLA Report ===\n|. rv_report = rv_report && |Period: Last { iv_period_days } days\n\n|.
rv_report = rv_report && |Availability: { ls_status-availability_percent }%\n|. rv_report = rv_report && |Avg Response: { ls_status-avg_response_ms }ms\n|. rv_report = rv_report && |P95 Response: { ls_status-p95_response_ms }ms\n|. rv_report = rv_report && |Error Rate: { ls_status-error_rate_percent }%\n\n|.
IF ls_status-sla_compliant = abap_true. rv_report = rv_report && |Status: SLA COMPLIANT\n|. ELSE. rv_report = rv_report && |Status: SLA VIOLATED\n|. rv_report = rv_report && |Violations:\n|. LOOP AT ls_status-violations INTO DATA(lv_violation). rv_report = rv_report && |- { lv_violation }\n|. ENDLOOP. ENDIF. ENDMETHOD.
ENDCLASS.Monitoring Best Practices Checkliste
| Area | Best practice | Priority |
|---|---|---|
| Metrics | Define and measure business KPIs | High |
| Metrics | Technical metrics (CPU, memory, response time) | High |
| Metrics | Attach labels/dimensions to metrics | Medium |
| Health checks | Separate liveness and readiness | High |
| Health checks | Check dependencies | Medium |
| Health checks | Configure timeouts | Medium |
| Alerting | Define severity levels | High |
| Alerting | Cooldown between alerts | Medium |
| Alerting | Document escalation paths | High |
| Dashboards | Overview dashboard with KPIs | High |
| Dashboards | Drill-down capabilities | Medium |
| SLA | Document SLA targets | High |
| SLA | Automated SLA reports | Medium |
| Integration | Connect external tools | Medium |
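The "cooldown between alerts" item is worth spelling out: suppress repeat notifications for the same rule until a quiet period has elapsed, so a flapping metric does not flood the on-call channel. A minimal sketch in Python (the rule name and the 15-minute window are illustrative):

```python
import time

class AlertCooldown:
    """Suppress repeated alerts for the same rule within a cooldown window."""

    def __init__(self, cooldown_seconds=900, clock=time.time):
        self.cooldown = cooldown_seconds
        self.clock = clock        # injectable clock, useful for testing
        self.last_sent = {}       # rule name -> timestamp of last sent alert

    def should_send(self, rule):
        now = self.clock()
        last = self.last_sent.get(rule)
        if last is not None and now - last < self.cooldown:
            return False          # still cooling down - suppress
        self.last_sent[rule] = now
        return True

# Simulated clock: the first alert fires, a repeat 60 s later is
# suppressed, and another after the 900 s window fires again.
t = [0]
cd = AlertCooldown(cooldown_seconds=900, clock=lambda: t[0])
first = cd.should_send("error_rate_high")
t[0] = 60
repeat = cd.should_send("error_rate_high")
t[0] = 1000
later = cd.should_send("error_rate_high")
```

Note that the timestamp is only updated when an alert is actually sent; otherwise a steadily flapping metric would postpone the next notification forever.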
Further Reading
- ABAP Environment Monitoring - basic monitoring with Fiori apps
- Application Logging - detailed logging
- HTTP Client - calling external APIs
Summary
Effective monitoring and alerting for ABAP Cloud requires:
- Custom metrics - define and collect business and technical KPIs
- Health check endpoints - implement liveness, readiness, and deep checks
- Alert rules - configure thresholds and severity levels
- Dashboards - clear visualization of the most important metrics
- External integration - connect tools such as Datadog, Slack, or PagerDuty
- SLA monitoring - continuously verify and document compliance
The combination of SAP Cloud ALM, Application Logging, and external monitoring tools enables comprehensive observability and a fast response to problems.