
Best Practices to develop with HUE workflow and using PySpark



I'm using HDP 2.4 and HUE 3.12.

I would like to know the best practices for developing HUE workflows that run PySpark code.

I have already tried, without success, to launch a shell action in the workflow (an Oozie shell action) from HUE. For that I used the following shell script:

spark-submit testing.py

But I keep getting errors saying that testing.py was not found, even though I have confirmed that the file is in the container launched by YARN. I also tried copying the file to the same path on all the machines that have the Oozie client, but then the error was "permission denied", even though I have confirmed that the file has all permissions for all users.
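One way to narrow down both errors is to have the shell action report where it runs, which OS user it runs as, and whether it can read the script before calling spark-submit. The following is a minimal diagnostic sketch, not a confirmed fix: `describe_env` is a hypothetical helper name, and it assumes testing.py is expected in the action's working directory. On some clusters the action may run as a different OS user than the one who submitted the workflow, which would show up in the `user=` line.

```shell
#!/bin/bash
# Diagnostic sketch for the shell action. "testing.py" is the script name
# from the post; describe_env is a hypothetical helper, not an Oozie feature.

# Print the OS user, the working directory, and whether the script is readable.
describe_env() {
    local script="$1"
    echo "user=$(whoami)"
    echo "cwd=$(pwd)"
    if [ -r "$script" ]; then
        echo "script=readable"
    else
        echo "script=missing-or-unreadable"
    fi
}

describe_env "testing.py"

# Only attempt the submit when the script is readable and spark-submit is on PATH.
if [ -r "testing.py" ] && command -v spark-submit >/dev/null 2>&1; then
    # Pass an absolute path so the result does not depend on later directory changes.
    spark-submit "$(pwd)/testing.py"
fi
```

The three lines printed by `describe_env` end up in the action's stdout log, so they can be compared against the path and user that were checked by hand.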

Can someone help me?

Thanks


Re: Best Practices to develop with HUE workflow and using PySpark


Hue is not the greatest place to do anything.

I would recommend upgrading to HDP 2.5 or 2.6 and using Apache Zeppelin, which is great for PySpark.

Also try running spark-submit from the command line.
