
SparkSQL with Hive on CDH clusters


I'm submitting SparkSQL jobs to a CDH 5.4 Spark cluster from a remote machine. The jobs work fine as long as I don't use HiveContext.

With HiveContext, Spark looks for the Hive jars, which are not bundled with the Cloudera Spark assembly (/usr/lib/spark/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar).

Since I don't have access to the cluster (I can only submit jobs), I'm looking for a workaround.

 

Can I copy all of the Hive/Hadoop CDH 5.4 jars to my remote machine, add them to the classpath, and then submit the job?
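If that approach is viable, I assume it would look something like the sketch below, using spark-submit's --jars option to ship the jars to the executors and --driver-class-path for the driver side. The jar directory, hive-site.xml path, and job class name here are placeholders, not my actual setup:

```shell
# Sketch: ship locally copied Hive jars with the job.
# All paths and the class name below are placeholders for illustration.

# Build a comma-separated list of the Hive jars copied to the remote machine
HIVE_JARS=$(ls /path/to/hive-jars/*.jar | paste -sd, -)

# --jars distributes the jars to the executors; --driver-class-path puts
# them on the driver's classpath; HiveContext also needs hive-site.xml
# so it can locate the metastore.
spark-submit \
  --master yarn-cluster \
  --jars "$HIVE_JARS" \
  --driver-class-path "/path/to/hive-jars/*" \
  --files /path/to/hive-site.xml \
  --class com.example.MySparkSQLJob \
  my-sparksql-job.jar
```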

Is there another workaround? Maybe a Cloudera assembly that includes the Hive jars?

 

Best Regards,