Best practices for developing HUE workflows with PySpark

I'm using HDP 2.4 and HUE 3.12.

I want to know the best practices for developing HUE workflows that run PySpark code.

So far I have had no success. I tried launching a shell action from the HUE workflow (an Oozie shell action), using the following shell script:


But I keep receiving errors saying the file was not found, even though I have confirmed that the file is in the container launched by YARN. I also tried copying the file to the same path on every machine that has the Oozie client, but then the error changed to permission denied, even though I confirmed that the file grants all permissions to all users.

Can someone help me?



Super Guru

Hue is not the greatest place to do anything.

I would recommend upgrading to HDP 2.5 or 2.6 and using Apache Zeppelin, which is great for PySpark.

Alternatively, try running spark-submit from the command line.
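
For the spark-submit route, here is a rough sketch of how the command line could be put together. It only builds and prints the command; it does not launch anything. The script name `my_job.py`, the argument `/tmp/input`, and the helper `build_spark_submit_command` are made up for illustration, and the choice of `yarn` with `client` deploy mode is an assumption about your cluster, so adjust as needed.

```python
import shlex


def build_spark_submit_command(script_path, master="yarn",
                               deploy_mode="client", app_args=()):
    """Return the spark-submit argument list for a PySpark script.

    script_path and app_args are hypothetical placeholders; --master and
    --deploy-mode are standard spark-submit options.
    """
    command = [
        "spark-submit",
        "--master", master,            # assumed: jobs run on YARN
        "--deploy-mode", deploy_mode,  # "client" or "cluster"
        script_path,
    ]
    # Anything after the script path is passed to the PySpark script itself.
    command.extend(app_args)
    return command


if __name__ == "__main__":
    cmd = build_spark_submit_command("my_job.py", app_args=["/tmp/input"])
    # Print a shell-safe version that can be pasted into a terminal.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the printed command by hand on an edge node is a quick way to check that the PySpark script itself works before wiring it into a workflow.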