Trouble running Java Spark Hive Example

Explorer

I have the following Java Spark Hive Example, which can be found in the official apache/spark GitHub repository. I have spent a lot of time trying to work out how to run the example in my Hortonworks Hadoop Sandbox, without success.

Currently, I am doing the following:

  • Importing the apache/spark examples as a Maven project. This works fine and I am not getting any issues with dependencies, so I guess there is no problem here.
  • The next step is to prepare the code to run in my Hadoop Sandbox. This is where the issue starts; I am probably setting something wrong to begin with. This is what I am doing:

I set the SparkSession master to local, replace the spark.sql.warehouse.dir setting with hive.metastore.uris, and pass thrift://localhost:9083 (the value I can see in the Hive config in Ambari) in place of warehouseLocation:

SparkSession spark = SparkSession.builder()
    .appName("Java Spark Hive Example")
    .master("local[*]")
    .config("hive.metastore.uris", "thrift://localhost:9083")
    .enableHiveSupport()
    .getOrCreate();

Then I replace

spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src");

with a path to HDFS, where I have uploaded kv1.txt:

spark.sql("LOAD DATA LOCAL INPATH 'hdfs:///tmp/kv1.txt' INTO TABLE src");

The last step is to build the JAR by running mvn package against the pom.xml. It builds without errors and gives me original-spark-examples_2.11-2.3.0-SNAPSHOT.jar

I copy the assembly over to the Hadoop Sandbox:

scp -P 2222 ./target/original-spark-examples_2.11-2.3.0-SNAPSHOT.jar root@sandbox.hortonworks.com:/root

and use spark-submit to run the code:

/usr/hdp/current/spark2-client/bin/spark-submit --class "JavaSparkHiveExample" --master local ./original-spark-examples_2.11-2.3.0-SNAPSHOT.jar

This returns the following error:

[root@sandbox-hdp ~]# /usr/hdp/current/spark2-client/bin/spark-submit --class "JavaSparkHiveExample" --master local ./original-spark-examples_2.11-2.3.0-SNAPSHOT.jar
java.lang.ClassNotFoundException: JavaSparkHiveExample
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:739)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[root@sandbox-hdp ~]#

...and here I am totally stuck. I am probably missing some steps needed to prepare the code to run.

I would be very happy if I could get some help to get this code to run on my Hadoop Sandbox. I was able to run the JavaWordCount.java Spark example just fine but with this one I am totally stuck. Thanks 🙂

Complete JavaSparkHiveExample.java

1 ACCEPTED SOLUTION

Super Collaborator

Hi @Eric H,

Could you please use the complete class name, including the package name:

--class "org.apache.spark.examples.sql.hive.JavaSparkHiveExample"

That class is declared inside a package, so it cannot be referenced by its simple name alone.
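
The same behaviour can be reproduced in plain Java without Spark. The stack trace in the question passes through Class.forName, which only resolves a class by its fully qualified name. Below is a minimal sketch that uses java.util.ArrayList as a stand-in class; the class ClassNameLookup and the helper canLoad are hypothetical names chosen for illustration and are not part of the Spark example.

```java
final class ClassNameLookup {

    // Hypothetical helper: reports whether Class.forName can resolve the given name.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Same exception type that spark-submit reported in the question.
            return false;
        }
    }

    public static void main(String[] args) {
        // Simple name only: nothing called "ArrayList" exists in the default package.
        System.out.println("ArrayList -> " + canLoad("ArrayList"));
        // Package-qualified name: resolves normally.
        System.out.println("java.util.ArrayList -> " + canLoad("java.util.ArrayList"));
    }
}
```

The value given to --class presumably follows the same rule, which would explain why the simple name failed and the package-qualified name works.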

Hope this helps !!

Explorer

Hi @bkosaraju,

That solved the problem. Many thanks for your help!
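
For anyone who hits the same ClassNotFoundException and does not know the package, one option is to list the entries of the JAR and turn the matching path into a dotted class name. The following is a rough sketch using only the JDK's java.util.jar API; the class FindClassInJar and the method findQualifiedNames are made-up names for illustration, and inner classes are deliberately ignored.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class FindClassInJar {

    // Returns the dotted names of all classes in the JAR whose simple name matches.
    static List<String> findQualifiedNames(String jarPath, String simpleName) throws IOException {
        List<String> matches = new ArrayList<>();
        String fileName = simpleName + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String entry = entries.nextElement().getName();
                // Match "some/pkg/Name.class" or a top-level "Name.class".
                if (entry.equals(fileName) || entry.endsWith("/" + fileName)) {
                    String withoutExtension = entry.substring(0, entry.length() - ".class".length());
                    matches.add(withoutExtension.replace('/', '.'));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FindClassInJar <path-to-jar> <SimpleClassName>");
            return;
        }
        for (String name : findQualifiedNames(args[0], args[1])) {
            System.out.println(name);
        }
    }
}
```

Run with the path to the JAR and the simple class name as its two arguments, it prints each package-qualified match, and any of those can then be passed to --class.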