Calzone - Data pipeline: Consumer lag
Archived (pre-2022)
Preserved for reference only -- likely outdated. Last updated: June 2019
Link: Grafana - Calzone Data Pipeline

Meaning: this lag is the time between a changed hour (detected from a spike) and the consumer being triggered for it by this Airflow DAG.
Possible causes and solutions
- The problem shown in that graph could be either Airflow not triggering the DAG or the consumer not being able to catch up.
  Airflow DAG Calzone_populate_mysql: Airflow - Tree
  How to troubleshoot:
  - Check that the Airflow scheduler works: Grafana - Airflow Core. Pay attention to the 'Airflow Health Check DAG' graph.
  - Go to the AWS ECS console (Home), then to the scheduler service. There you will find one running task. Kill it; this causes ECS to spawn a new scheduler container for Airflow.
  Config for this DAG: calzone_config.json (Github)
- The problem is in the appender consumer in Marathon. After other applications are started, it sometimes gets stuck and does not create any list of hours to process for the Airflow consumer. In that case the Airflow consumer is working but does not find any hour to process.
  Restarting the appender could solve the issue.
- In addition, try restarting the calzone worker tasks as well.
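
The two restart steps above can also be scripted. The following is a minimal sketch, not the team's actual tooling: it assumes the scheduler runs as an ECS service reachable through a boto3 ECS client, and that the appender is a Marathon app restartable through Marathon's `POST /v2/apps/{appId}/restart` endpoint. The cluster name, service name, Marathon URL and app id in the usage comments are hypothetical placeholders; take the real values from the ECS console and Marathon.

```python
import urllib.request


def restart_ecs_scheduler(ecs, cluster, service):
    """Stop the running task(s) of an ECS service so ECS starts a replacement.

    `ecs` is expected to be a boto3 ECS client, e.g. boto3.client("ecs").
    Returns the list of task ARNs that were stopped.
    """
    # list_tasks returns the ARNs of the tasks currently running for the service.
    task_arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="RUNNING"
    )["taskArns"]
    for arn in task_arns:
        # Once the task stops, the ECS service launches a new one to keep
        # its desired count, which gives a fresh scheduler container.
        ecs.stop_task(cluster=cluster, task=arn, reason="Restart Airflow scheduler")
    return task_arns


def marathon_restart_request(marathon_url, app_id):
    """Build (but do not send) the request that asks Marathon to restart an app."""
    url = f"{marathon_url.rstrip('/')}/v2/apps/{app_id.strip('/')}/restart"
    return urllib.request.Request(url, data=b"", method="POST")


# Usage (all names below are hypothetical placeholders):
#   import boto3
#   restart_ecs_scheduler(boto3.client("ecs"), "airflow-cluster", "scheduler")
#   req = marathon_restart_request("http://marathon.example:8080", "/calzone/appender")
#   urllib.request.urlopen(req)
```

After either restart, check the Grafana graphs linked above to confirm that the lag starts to drop.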
Impact
- calzone drud won't be populated with new hours