ObserveNow
Integrations
Infrastructure

Apache Airflow

Metrics instrumentation for official Apache Helm chart

If you are using the official Apache Helm chart (https://github.com/apache/airflow/tree/main/chart), make sure the StatsD exporter is enabled with the proper service annotations. If it is not, you can enable it by adding the following snippet to the values.yaml file of the Airflow deployment:

```yaml
statsd:
  enabled: true
  service:
    extraAnnotations:
      prometheus.io/port: "9102"
      prometheus.io/scrape: "true"
```

You will also need to add the following extraMappings snippet under the statsd configuration of the values.yaml file of the Airflow deployment:

```yaml
statsd:
  extraMappings:
    # === Counters ===
    - match: "(.+)\\.(.+)_start$"
      match_metric_type: counter
      name: "af_agg_job_start"
      match_type: regex
      labels:
        airflow_id: "$1"
        job_name: "$2"
    - match: "(.+)\\.(.+)_end$"
      match_metric_type: counter
      name: "af_agg_job_end"
      match_type: regex
      labels:
        airflow_id: "$1"
        job_name: "$2"
    - match: "(.+)\\.operator_failures_(.+)$"
      match_metric_type: counter
      name: "af_agg_operator_failures"
      match_type: regex
      labels:
        airflow_id: "$1"
        operator_name: "$2"
    - match: "(.+)\\.operator_successes_(.+)$"
      match_metric_type: counter
      name: "af_agg_operator_successes"
      match_type: regex
      labels:
        airflow_id: "$1"
        operator_name: "$2"
    - match: "*.ti_failures"
      match_metric_type: counter
      name: "af_agg_ti_failures"
      labels:
        airflow_id: "$1"
    - match: "*.ti_successes"
      match_metric_type: counter
      name: "af_agg_ti_successes"
      labels:
        airflow_id: "$1"
    - match: "*.zombies_killed"
      match_metric_type: counter
      name: "af_agg_zombies_killed"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler_heartbeat"
      match_metric_type: counter
      name: "af_agg_scheduler_heartbeat"
      labels:
        airflow_id: "$1"
    - match: "*.dag_processing.processes"
      match_metric_type: counter
      name: "af_agg_dag_processing_processes"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.tasks.killed_externally"
      match_metric_type: counter
      name: "af_agg_scheduler_tasks_killed_externally"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.tasks.running"
      match_metric_type: counter
      name: "af_agg_scheduler_tasks_running"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.tasks.starving"
      match_metric_type: counter
      name: "af_agg_scheduler_tasks_starving"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.orphaned_tasks.cleared"
      match_metric_type: counter
      name: "af_agg_scheduler_orphaned_tasks_cleared"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.orphaned_tasks.adopted"
      match_metric_type: counter
      name: "af_agg_scheduler_orphaned_tasks_adopted"
      labels:
        airflow_id: "$1"
    - match: "*.scheduler.critical_section_busy"
      match_metric_type: counter
      name: "af_agg_scheduler_critical_section_busy"
      labels:
        airflow_id: "$1"
    - match: "*.sla_email_notification_failure"
      match_metric_type: counter
      name: "af_agg_sla_email_notification_failure"
      labels:
        airflow_id: "$1"
    - match: "*.ti.start.*.*"
      match_metric_type: counter
      name: "af_agg_ti_start"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
        task_id: "$3"
    - match: "*.ti.finish.*.*.*"
      match_metric_type: counter
      name: "af_agg_ti_finish"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
        task_id: "$3"
        state: "$4"
    - match: "*.dag.callback_exceptions"
      match_metric_type: counter
      name: "af_agg_dag_callback_exceptions"
      labels:
        airflow_id: "$1"
    - match: "*.celery.task_timeout_error"
      match_metric_type: counter
      name: "af_agg_celery_task_timeout_error"
      labels:
        airflow_id: "$1"

    # === Gauges ===
    - match: "*.dagbag_size"
      match_metric_type: gauge
      name: "af_agg_dagbag_size"
      labels:
        airflow_id: "$1"
    - match: "*.dag_processing.import_errors"
      match_metric_type: gauge
      name: "af_agg_dag_processing_import_errors"
      labels:
        airflow_id: "$1"
    - match: "*.dag_processing.total_parse_time"
      match_metric_type: gauge
      name: "af_agg_dag_processing_total_parse_time"
      labels:
        airflow_id: "$1"
    - match: "*.dag_processing.last_runtime.*"
      match_metric_type: gauge
      name: "af_agg_dag_processing_last_runtime"
      labels:
        airflow_id: "$1"
        dag_file: "$2"
    - match: "*.dag_processing.last_run.seconds_ago.*"
      match_metric_type: gauge
      name: "af_agg_dag_processing_last_run_seconds"
      labels:
        airflow_id: "$1"
        dag_file: "$2"
    - match: "*.dag_processing.processor_timeouts"
      match_metric_type: gauge
      name: "af_agg_dag_processing_processor_timeouts"
      labels:
        airflow_id: "$1"
    - match: "*.executor.open_slots"
      match_metric_type: gauge
      name: "af_agg_executor_open_slots"
      labels:
        airflow_id: "$1"
    - match: "*.executor.queued_tasks"
      match_metric_type: gauge
      name: "af_agg_executor_queued_tasks"
      labels:
        airflow_id: "$1"
    - match: "*.executor.running_tasks"
      match_metric_type: gauge
      name: "af_agg_executor_running_tasks"
      labels:
        airflow_id: "$1"
    - match: "*.pool.open_slots.*"
      match_metric_type: gauge
      name: "af_agg_pool_open_slots"
      labels:
        airflow_id: "$1"
        pool_name: "$2"
    - match: "*.pool.queued_slots.*"
      match_metric_type: gauge
      name: "af_agg_pool_queued_slots"
      labels:
        airflow_id: "$1"
        pool_name: "$2"
    - match: "*.pool.running_slots.*"
      match_metric_type: gauge
      name: "af_agg_pool_running_slots"
      labels:
        airflow_id: "$1"
        pool_name: "$2"
    - match: "*.pool.starving_tasks.*"
      match_metric_type: gauge
      name: "af_agg_pool_starving_tasks"
      labels:
        airflow_id: "$1"
        pool_name: "$2"
    - match: "*.smart_sensor_operator.poked_tasks"
      match_metric_type: gauge
      name: "af_agg_smart_sensor_operator_poked_tasks"
      labels:
        airflow_id: "$1"
    - match: "*.smart_sensor_operator.poked_success"
      match_metric_type: gauge
      name: "af_agg_smart_sensor_operator_poked_success"
      labels:
        airflow_id: "$1"
    - match: "*.smart_sensor_operator.poked_exception"
      match_metric_type: gauge
      name: "af_agg_smart_sensor_operator_poked_exception"
      labels:
        airflow_id: "$1"
    - match: "*.smart_sensor_operator.exception_failures"
      match_metric_type: gauge
      name: "af_agg_smart_sensor_operator_exception_failures"
      labels:
        airflow_id: "$1"
    - match: "*.smart_sensor_operator.infra_failures"
      match_metric_type: gauge
      name: "af_agg_smart_sensor_operator_infra_failures"
      labels:
        airflow_id: "$1"

    # === Timers ===
    - match: "*.dagrun.dependency-check.*"
      match_metric_type: observer
      name: "af_agg_dagrun_dependency_check"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
    - match: "*.dag.*.*.duration"
      match_metric_type: observer
      name: "af_agg_dag_task_duration"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
        task_id: "$3"
    - match: "*.dag_processing.last_duration.*"
      match_metric_type: observer
      name: "af_agg_dag_processing_duration"
      labels:
        airflow_id: "$1"
        dag_file: "$2"
    - match: "*.dagrun.duration.success.*"
      match_metric_type: observer
      name: "af_agg_dagrun_duration_success"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
    - match: "*.dagrun.duration.failed.*"
      match_metric_type: observer
      name: "af_agg_dagrun_duration_failed"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
    - match: "*.dagrun.schedule_delay.*"
      match_metric_type: observer
      name: "af_agg_dagrun_schedule_delay"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
    - match: "*.scheduler.critical_section_duration"
      match_metric_type: observer
      name: "af_agg_scheduler_critical_section_duration"
      labels:
        airflow_id: "$1"
    - match: "*.dagrun.*.first_task_scheduling_delay"
      match_metric_type: observer
      name: "af_agg_dagrun_first_task_scheduling_delay"
      labels:
        airflow_id: "$1"
        dag_id: "$2"
```

The Airflow cluster dashboard (https://github.com/databand-ai/airflow-dashboards/blob/main/grafana/cluster-dashboard.json) can be added to your Grafana instance for visualization.
Instrumentation for community Helm chart

If you are using the community Helm chart (https://github.com/airflow-helm/charts/tree/main/charts/airflow), you can enable metrics instrumentation with any of the methods mentioned below.

Configuring airflow-exporter for metrics

Using airflow-exporter, you can enable metrics by setting the following in the values.yaml file of the Airflow deployment:

```yaml
airflow:
  extraPipPackages: ["airflow-exporter"]
```

and

```yaml
web:
  service:
    annotations:
      prometheus.io/path: /admin/metrics
      prometheus.io/port: "8080"
      prometheus.io/scrape: "true"
```

Configuring OpenTelemetry for metrics

You can use OTel to instrument Airflow metrics by setting the following in the values.yaml file of the Airflow deployment:

```yaml
airflow:
  extraPipPackages:
    - "apache-airflow[otel]"
  config:
    AIRFLOW__METRICS__OTEL_ON: "true"
    AIRFLOW__METRICS__OTEL_HOST: "<otel-collector-service-name>.<namespace>.svc.cluster.local"
    AIRFLOW__METRICS__OTEL_PORT: 4318
    AIRFLOW__METRICS__OTEL_PREFIX: "airflow"
```

For more configuration options for metrics, refer to the Airflow OTel metrics documentation. Update your Airflow deployment using the helm upgrade command, and you should see metrics coming into your Grafana.

Traces

OTel instrumentation for community Airflow Helm chart

If you are using the community Helm chart, you can configure your Airflow instance to send traces to your Grafana. Tracing can only be configured on Airflow version 2.10.1 and above; versions below that do not support traces instrumentation.

To configure tracing, add the traces configuration and the apache-airflow[otel] package to your Airflow deployment by updating values.yaml as follows:

```yaml
airflow:
  extraPipPackages:
    - "apache-airflow[otel]"
  config:
    AIRFLOW__TRACES__OTEL_ON: "true"
    AIRFLOW__TRACES__OTEL_HOST: "<otel-collector-service-name>.<namespace>.svc.cluster.local"
    AIRFLOW__TRACES__OTEL_PORT: 4318
    AIRFLOW__TRACES__OTEL_TASK_LOG_EVENT: "true"
    AIRFLOW__TRACES__OTEL_SERVICE: "airflow"
```

For more configuration options for traces, check the Airflow traces documentation. If you set otel_debugging_on to "true", Airflow will print traces to the console instead of sending them to the configured host. Update your Airflow deployment using the helm upgrade command, and you should see traces coming into your Grafana.
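For reference, the upgrade against the community chart typically looks like the sketch below. The airflow-stable repository alias, release name airflow, and namespace airflow are assumptions; substitute whatever you used at install time.

```bash
# Upgrade the community chart (airflow-helm/charts) with the values.yaml that
# carries the OTel metrics/traces configuration shown above.
helm repo add airflow-stable https://airflow-helm.github.io/charts
helm repo update
helm upgrade --install airflow airflow-stable/airflow -n airflow -f values.yaml
```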