Importing custom files into the DAG file of Airflow jobs in CDE

Explorer

I'm working with Airflow jobs in CDE. In my case, I have a resource that contains two files: one is the DAG file, and the other is a custom Python script that I want to import into the DAG.

According to the documentation, resources in Airflow are mounted under /app/mount/<PREFIX>. However, when a job is created or updated, the DAG is validated before the resources are actually mounted in that folder, which causes the import to fail.

If I encapsulate the import inside a PythonOperator, it works without issues, because at that point the resources are already available and the file can be found. The error I get during validation is always something like:

"failed: invalid request: Invalid DAG: dag_import: Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/tmp/tmpfx_nrhlx/dag.py", line 6, in <module>
    from myModule import func
ModuleNotFoundError: No module named 'myModule'"
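For reference, the workaround described above (deferring the import until the task actually runs) might look roughly like the sketch below. It is only an illustration under assumptions: `my_prefix` is a placeholder for the real resource prefix, `call_func` is a hypothetical helper name, and `func` is assumed to take no arguments. Only `myModule` and `func` come from the traceback.

```python
import importlib
import sys

# Assumption: the file resource is mounted at /app/mount/<PREFIX> at run time.
# "my_prefix" is a placeholder, not a real prefix from this job.
MOUNT_DIR = "/app/mount/my_prefix"


def call_func(mount_dir=MOUNT_DIR):
    """Task callable: import myModule only when the task executes.

    Nothing here runs at DAG validation time, so the module does not
    need to exist until the resource has been mounted.
    """
    # Make the mounted resource directory importable.
    if mount_dir not in sys.path:
        sys.path.insert(0, mount_dir)
    # Deferred import: resolved at task run time, not at DAG parse time.
    module = importlib.import_module("myModule")
    # Assumption: func() takes no arguments.
    return module.func()
```

In the DAG file, `call_func` would then be passed as the `python_callable` of a `PythonOperator`, so the DAG module itself contains no top-level `from myModule import func`.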

I’ve tried not only creating the job with the import, but also creating an empty job first (so that the /app/mount/ folder is created), and then updating the DAG file — but the error remains the same.

Has anyone faced this issue and found a solution to make the import work from the beginning, without having to encapsulate it inside a function?

2 REPLIES

Contributor

Hello @ariajesus

Welcome to our community. Glad to see you here. 

How did you create the resource? 
As a File Resource or as a Python Environment? 


Here are the steps to create a Python Environment resource:
https://docs.cloudera.com/data-engineering/1.5.4/use-resources/topics/cde-create-python-virtual-env.... 


Regards,
Andrés Fallas

Explorer

I created the resource as a File Resource because, according to the documentation, Python Environment resources are specifically for managing the Python packages listed in requirements.txt.

Thanks!