I am currently working through this guide on using ORC-format tables in Hive via Spark: https://www.cloudera.com/tutorials/using-hive-with-orc-in-apache-spark-repl.html
I have had a lot of issues with the VMware HDP Sandbox from the start, which I assume is because I am running it with only 2 CPUs and 22GB of RAM allocated.
The spark-shell command ran to completion once, but it has never done so since. I have tried putting ZooKeeper, Oozie, and Ranger into maintenance mode to free up resources, but this no longer helps.
The spark-shell command now always hangs right after the log level setting message. I have confirmed that YARN is running, but that has not made a difference.
Any help is appreciated.