Spark SQL is not working on CDH 5.3
Created on ‎02-23-2015 10:48 AM - edited ‎09-16-2022 02:22 AM
We upgraded to CDH 5.3 and spark-sql is not working. Could anybody tell us whether we are missing any steps needed to get spark-sql to work? I'm getting the error below.
Steps we took:
1. Copied hive-site.xml to /etc/spark/conf.
2. Tried to start the Thrift server, but got the error below.
/opt/cloudera/parcels/CDH/lib/spark/sbin/start-thriftserver.sh
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /var/log/spark/spark-root-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-rkqcrl-odshaun01.out
failed to launch org.apache.spark.sql.hive.thriftserver.HiveThriftServer2:
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 18 more
bash-4.1$ ./spark-sql
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/cli/CliDriver
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:342)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.cli.CliDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 18 more
Regards,
Venky
Created ‎02-23-2015 10:58 AM
Spark SQL is shipped unchanged from upstream, so it should mostly work as-is. It is not formally supported, since it's still an alpha component. For this error in particular, have a look at other threads on this forum. I think the issue is that Spark SQL is not yet compatible with the later version of Hive in CDH, so it's not built with Hive support. Some of it should still work, but you have to at least add the Hive JARs to the classpath.
Created ‎02-23-2015 11:26 AM
Thanks for the quick reply. I have added the Hive JARs to the path too, but it's still not working. Correct me if I'm wrong or missing something.
#!/bin/bash
export JAVA_HOME=/usr/java/jdk1.7.0_55
SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark/
SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/spark/lib/*.jar
JARS=""
for j in `ls /opt/cloudera/parcels/CDH/lib/hadoop/client/*.jar`
do
JARS=$JARS:$j
JARS1=$j,$JARS1
done
CLI=/opt/cloudera/parcels/CDH/lib/hive/lib/hive-cli-0.13.1-cdh5.3.1.jar:/opt/cloudera/parcels/CDH/lib/hive/lib/hive-common-0.13.1-cdh5.3.1.jar:=/opt/cloudera/parcels/CDH/lib/hive/lib/hive-jdbc-0.13.1-cdh5.3.1.jar:/opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec-0.13.1-cdh5.3.1.jar
$SPARK_CLASSPATH:$JARS:$CLI
$SPARK_HOME/bin/spark-sql --master local
Created ‎02-23-2015 11:28 AM
This is probably beyond my knowledge, but the immediate error is easy enough to understand: it can't find the Hive classes, so something is still wrong there. I see a typo in your path, for example: two of the JARs are separated by ":=". Is it just that?
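One way to catch a typo like the ":=" separator is to split the classpath on ":" and confirm that every entry exists on disk. The following is only a minimal sketch: the function name `check_classpath` is hypothetical and is not part of Spark or CDH.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark or CDH): report classpath entries
# that do not exist. Entries ending in "/*" are checked via their directory.
check_classpath() {
  local entries entry dir status=0
  # read -a splits on ":" without glob-expanding entries such as "/dir/*"
  IFS=':' read -r -a entries <<< "$1"
  for entry in "${entries[@]}"; do
    [ -z "$entry" ] && continue
    if [[ "$entry" == */\* ]]; then
      dir=${entry%/\*}
      [ -d "$dir" ] || { echo "missing directory: $entry"; status=1; }
    elif [ ! -e "$entry" ]; then
      echo "missing: $entry"
      status=1
    fi
  done
  return "$status"
}

# Example call (the variable names come from the script posted above):
# check_classpath "$SPARK_CLASSPATH:$JARS:$CLI"
```

An entry such as "=/opt/.../hive-jdbc-0.13.1-cdh5.3.1.jar" would be reported as missing, because the stray "=" becomes part of the file name.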
Created ‎02-23-2015 12:03 PM
My bad. I fixed the typo, but still no luck. Thanks.
Created ‎03-24-2015 10:52 AM
Has anyone had any luck with this?
Created ‎03-24-2015 11:32 AM
Hi,
There is a bug in the classpath.
You need to add this line to compute-classpath.sh: CLASSPATH="$CLASSPATH:/opt/cloudera/parcels/CDH/lib/hive/lib/*". Then it will work without any issues.
Regards,
Venkat
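For readers applying this fix, here is a rough sketch of how the appended line might be written so that it is not added twice. The helper name `append_wildcard` and the `HIVE_LIB` variable are assumptions made for illustration; only the directory path comes from this thread. The "/*" is kept inside double quotes on purpose: the shell leaves it unexpanded, and the JVM itself expands a trailing "/*" classpath entry to the JAR files in that directory.

```shell
#!/bin/bash
# Assumed location of the Hive JARs, taken from the path quoted in this thread.
HIVE_LIB="${HIVE_LIB:-/opt/cloudera/parcels/CDH/lib/hive/lib}"

# Hypothetical helper: append "<dir>/*" to a classpath string unless that
# exact entry is already present. Prints the resulting classpath.
append_wildcard() {
  local cp="$1" dir="$2"
  case ":$cp:" in
    *":$dir/*:"*) printf '%s\n' "$cp" ;;               # already present
    *)            printf '%s\n' "${cp:+$cp:}$dir/*" ;; # append, no leading ":"
  esac
}

# Equivalent in effect to the line suggested above, but safe to run repeatedly.
CLASSPATH="$(append_wildcard "${CLASSPATH:-}" "$HIVE_LIB")"
```

Running `echo "$CLASSPATH"` afterwards should show the Hive entry ending in a literal `/*`. If it shows a long space-separated list of files instead, the wildcard was expanded by the shell and the quoting needs another look.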
Created ‎03-25-2015 09:35 AM
Created ‎03-25-2015 09:48 AM
To add a little color: yes, you can do that, although the CLASSPATH intentionally does not include Hive, since, as I understand it, Spark doesn't work with the later versions of Hive that CDH 5.3 and beyond use. It may still work well enough to do what you need, so have at it, but you may hit some incompatibilities.
Created on ‎03-25-2015 10:00 AM - edited ‎03-25-2015 12:37 PM
I agree. What is the best solution for this?
