Created 09-10-2025 01:02 PM
I'm working with Airflow jobs in CDE. In my case, I have a resource that contains two files: one is the DAG file, and the other is a custom Python script that I want to import into the DAG.
According to the documentation, resources in Airflow are mounted under /app/mount/<PREFIX>. However, when a job is created or updated, the DAG is validated before the resources are actually mounted in that folder, which causes the import to fail.
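To make the setup concrete, here is a minimal sketch of the failing DAG (the dag_id and file names are illustrative; myModule.py is the second file in the resource, as in the traceback further down):

# dag.py -- uploaded to the same file resource as myModule.py
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# This module-level import is what fails: CDE validates the DAG before the
# resource is mounted under /app/mount/<PREFIX>, so myModule.py cannot be
# found at validation time.
from myModule import func

with DAG(
    dag_id="module_level_import_example",  # hypothetical dag_id
    start_date=datetime(2025, 9, 1),
    schedule_interval=None,
) as dag:
    PythonOperator(task_id="run_func", python_callable=func)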
If I encapsulate the import inside a PythonOperator, it works without issues, because at that point the resources are already available and the file can be found. The error I get during validation is always something like:
"failed: invalid request: Invalid DAG: dag_import: Traceback (most recent call last): File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed File "/tmp/tmpfx_nrhlx/dag.py", line 6, in <module> from myModule import func ModuleNotFoundError: No module named 'myModule'"
I've tried not only creating the job with the import in place, but also creating an empty job first (so that the /app/mount/ folder is created) and then updating the DAG file, but the error remains the same.
Has anyone faced this issue and found a solution to make the import work from the beginning, without having to encapsulate it inside a function?
Created 09-11-2025 02:37 PM
Hello @ariajesus,
Welcome to our community. Glad to see you here.
How did you create the resource?
As a File Resource or as a Python Environment?
Here are the steps for creating it:
https://docs.cloudera.com/data-engineering/1.5.4/use-resources/topics/cde-create-python-virtual-env....
Created 09-11-2025 03:03 PM
I created the resource as a File Resource, because according to the documentation the python-env resources are specifically for managing Python packages through requirements.txt.
Thanks!