Best Practices to develop with HUE workflow and using PySpark

I'm using HDP 2.4 and Hue 3.12.

I would like to know the best practices for developing PySpark code in a Hue workflow.

I have already tried this without success: I launched a shell action in the Hue workflow (an Oozie shell action) with the following shell script:

spark-submit testing.py

But I keep getting an error saying that testing.py was not found, even though I have confirmed that the file is in the container launched by YARN. I also tried copying the file to the same path on every machine that has the Oozie client, but then the error changed to permission denied, even though I have confirmed that the file grants all permissions to all users.

Can someone help me?

Thanks

2 REPLIES

Super Guru

Hue is not the greatest place to do anything.

I would recommend upgrading to HDP 2.5 or 2.6 and using Apache Zeppelin, which is great for PySpark.

Otherwise, try running spark-submit from the command line.
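As a rough sketch of that command-line test: an Oozie shell action runs inside a YARN container on whichever node YARN picks, so its working directory and user can differ from an interactive shell. Printing both before submitting may help narrow down "not found" and "permission denied" errors. The function names `check_script` and `submit_if_present` are made up for illustration, and the `--master yarn --deploy-mode client` options are an assumption about how the job should run, not something stated in the question.

```shell
#!/bin/sh
# Hypothetical helper: report who is running, from where, and whether
# the given script is visible and readable from that location.
check_script() {
    script="$1"
    echo "user: $(id -un)  cwd: $(pwd)"
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    if [ ! -r "$script" ]; then
        echo "not readable: $script" >&2
        return 2
    fi
    return 0
}

# Hypothetical wrapper: submit only when the check passes.
# Assumes spark-submit is on the PATH and the job should run on YARN
# in client mode.
submit_if_present() {
    check_script "$1" || return $?
    spark-submit --master yarn --deploy-mode client "$1"
}
```

Calling `submit_if_present testing.py` from an interactive shell and then from the workflow's shell script should show whether the two environments see the same file as the same user.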