python module not found when running through yarn

Contributor

My cluster nodes have two versions of Python installed: 2.6.6 in /usr/bin and 2.7.12 in /usr/local/bin. I installed some geolocation-related Python modules for the 2.7.12 version. When I run a job locally on one of the nodes, it runs fine. When it is submitted through YARN, I get the following error:

“ImportError: No module named ipaddress”

ipaddress is one of the modules I installed.

I suspect YARN is using the 2.6.6 version of Python. How can I determine whether this is the case and, if it is, how can I configure YARN to use the Python in /usr/local/bin?

Thanks

Re: python module not found when running through yarn

Expert Contributor

@Jon Page

Could you please provide more information about what kind of "job" you are trying to run through YARN? Are you using Spark? A custom native YARN app? Distributed Shell? Knit?

In the meantime, you could run a simple distributed shell app to see which Python version YARN picks up:

yarn jar path/to/hadoop-yarn-applications-distributedshell.jar -jar path/to/hadoop-yarn-applications-distributedshell.jar -shell_command python -shell_args -V

Or you can check the same with the framework you are using.
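
For checking from inside the framework, a small script along these lines might help. It is only a sketch: the file name check_python.py and the function interpreter_report are hypothetical, and it assumes the job can run an arbitrary Python script on the worker nodes. It reports which interpreter is actually executing and whether the ipaddress module can be imported, and it avoids anything that Python 2.6 lacks.

```python
# check_python.py (hypothetical name): report the interpreter running this
# script and whether a given module can be imported from it.
import sys


def interpreter_report(module_name="ipaddress"):
    """Return a dict describing the running interpreter."""
    try:
        # __import__ is used instead of importlib so this also runs on 2.6.
        __import__(module_name)
        importable = True
    except ImportError:
        importable = False
    return {
        "executable": sys.executable,  # path of the Python binary in use
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "module": module_name,
        "importable": importable,
    }


if __name__ == "__main__":
    report = interpreter_report()
    for key in ("executable", "version", "module", "importable"):
        print("%s: %s" % (key, report[key]))
```

If the output from a container shows /usr/bin/python and version 2.6.6 with importable set to False, that would support the suspicion that the job is not using the 2.7.12 installation in /usr/local/bin.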
