Created 08-07-2024 06:45 AM
I need to gather the run duration for a particular Airflow job over the last 3 months.
In our CDE environment we use Airflow to call Spark DBT jobs, and lately the run duration of this job has increased drastically.
I assume there is a way to gather this job's run durations for further analysis. I would appreciate any assistance or guidance in getting this done.
CDE Version - 1.19.3-b29
Thanks
Wert
Created 04-01-2025 07:15 AM
Hello @wert_1311
Greetings from Cloudera Support. Based on the post, your team wishes to capture the Airflow DAG run details for the past 3 months.
This information is stored in the Airflow MySQL DB (for Public Cloud). You can connect to the Airflow MySQL DB [1] and query the metadata tables for DAG runs. Take care when accessing the metadata tables to avoid any unintended changes, which may break the CDE setup.
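As a rough illustration of the metadata-table approach, the sketch below reads finished runs of one DAG from the dag_run table and computes each run's duration. It assumes the stock Airflow schema (dag_run with dag_id, run_id, state, start_date, end_date) and that the CDE metadata DB matches it; conn is any MySQL DB-API connection you have already opened, and the names DAG_RUN_QUERY, run_durations and fetch_run_durations are made up for this example. The query is read-only.

```python
from datetime import datetime, timedelta, timezone

# Read-only query against the Airflow dag_run metadata table.
# %s placeholders follow the paramstyle used by common MySQL DB-API drivers.
DAG_RUN_QUERY = """
SELECT run_id, state, start_date, end_date
FROM dag_run
WHERE dag_id = %s
  AND start_date >= %s
  AND end_date IS NOT NULL
ORDER BY start_date
"""


def run_durations(rows):
    """Turn (run_id, state, start, end) rows into (run_id, state, seconds)."""
    return [
        (run_id, state, (end - start).total_seconds())
        for run_id, state, start, end in rows
    ]


def fetch_run_durations(conn, dag_id, days=90):
    """Fetch durations of finished runs of dag_id from the last `days` days."""
    # Airflow stores timestamps in UTC; use a naive UTC cutoff for the driver.
    since = datetime.now(timezone.utc).replace(tzinfo=None) - timedelta(days=days)
    cur = conn.cursor()
    try:
        cur.execute(DAG_RUN_QUERY, (dag_id, since))
        return run_durations(cur.fetchall())
    finally:
        cur.close()
```

Calling fetch_run_durations(conn, "my_dag") (with "my_dag" as a placeholder DAG id) would return one tuple per finished run, which can then be exported or plotted to see when the durations started to grow.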
Regards, Smarak
Created 08-19-2024 07:28 AM
Hello @wert_1311 Thank you for bringing this to our Community.
I see you have already asked about this here:
[0a]https://community.cloudera.com/t5/Support-Questions/Monitor-alert-long-running-Airflow-jobs/m-p/3883...
Did it not help? If not, please try the following:
Airflow has the following metrics that can be used:
[0b]https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitori...
For the integration, I'd suggest you explore the feasibility based on your use case. For example, the metrics configuration setup for StatsD and OpenTelemetry is given here:
[0c] https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitori...
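If StatsD metrics can be enabled in your setup, Airflow emits a timer per DAG run named dagrun.duration.success.<dag_id> or dagrun.duration.failed.<dag_id>. The sketch below shows how one such StatsD line could be parsed. It assumes the default "airflow" style single-segment prefix and that the timer value is in milliseconds (as the |ms type indicates); the function name parse_dagrun_duration is made up for this example.

```python
import re

# Matches DAG-run duration timers as sent over StatsD, for example:
#   airflow.dagrun.duration.success.my_dag:630000|ms
_DAGRUN_TIMER = re.compile(
    r"^(?P<prefix>[^.]+)\.dagrun\.duration\.(?P<state>success|failed)\."
    r"(?P<dag_id>[^:]+):(?P<value>[0-9.]+)\|ms"
)


def parse_dagrun_duration(line):
    """Return (dag_id, state, seconds) for a DAG-run timer line, else None."""
    match = _DAGRUN_TIMER.match(line.strip())
    if match is None:
        return None
    # Assumption: the timer value is reported in milliseconds.
    return match["dag_id"], match["state"], float(match["value"]) / 1000.0
```

Lines for other metrics (counters, gauges, other timers) return None, so the function can be used as a filter on whatever stream your StatsD receiver exposes.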
Also, here are some third-party apps that may be helpful:
[0d] https://github.com/search?q=airflow+prometheus&type=repositories
Let us know how it goes.
V
Created 04-08-2025 06:38 AM
Hello @wert_1311
We hope the above post answers your question about fetching DAG runs for the past 3 months via the Airflow DB. We shall mark the post as solved, but feel free to update it if you have any further questions.
Regards, Smarak