
Error running spark job using Hue


Explorer

I am trying to run a Spark job using Hue.

I get the following error even though the class is present in the jar I supplied (guava-14.0.1.jar).

I thought it was because Hue has loaded a different version of the guava jar. This is what I see in the list of jars loaded by Hue: "/usr/lib/hive/lib/guava-11.0.2.jar"

Is this the reason for the error? If so, what is the resolution?

If not, what else could be causing the NoSuchMethodError?

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [com.xyz.jobs.ReprocessStreamAggregates], main() threw exception, com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
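
A NoSuchMethodError like this usually means an older guava without HashFunction.hashInt (such as the guava-11.0.2 mentioned above) was loaded ahead of guava 14. As a minimal sketch for diagnosing this, the snippet below prints which jar or directory the JVM actually resolved a class from; the class probed here is the snippet's own class so it runs standalone, but on the cluster you would probe the conflicting class instead:

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a marker
    // when the class came from the bootstrap classloader.
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return (src == null) ? "(bootstrap classloader)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // On the cluster, probe the class that conflicts, e.g.:
        //   locationOf(Class.forName("com.google.common.hash.HashFunction"))
        System.out.println(locationOf(WhichJar.class));
    }
}
```

If the printed path points at an older guava jar (e.g. under /usr/lib/hive/lib), the classpath ordering is the problem, not the contents of your own jar.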
5 REPLIES

Re: Error running spark job using Hue

Make sure that the jar is in a 'lib' directory in the HDFS workspace of the workflow.

Romain

Re: Error running spark job using Hue

Explorer

We have created a jar that contains our class files and the dependent libraries using the maven-shade-plugin.

So the contents of guava 14.0.1 are inside our jar, which is uploaded to the lib folder of the workspace.

I still see the error.
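
Shading the guava classes into the jar is often not enough when the cluster's classloader puts its own (older) guava first. Relocating the packages sidesteps the conflict entirely, because the shaded copy then lives under a different name. A sketch of a maven-shade-plugin relocation (coordinates and the shaded prefix are illustrative; adjust to your build):

```
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Move guava into a private namespace so the cluster's
           guava-11 can no longer shadow it. -->
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>shaded.com.google.common</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
    </execution>
  </executions>
</plugin>
```

With the relocation in place, your compiled classes reference shaded.com.google.common.hash.HashFunction, so whatever guava the cluster loads is irrelevant.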

Re: Error running spark job using Hue

New Contributor

Any progress made on this? I get the same problem running a Spark script from the command line.

Re: Error running spark job using Hue

Explorer

No luck


Re: Error running spark job using Hue

New Contributor

When I add the correct version of the guava jar file to the classpath, the Spark job can be executed from the command line:

 

#!/bin/bash

# Directory where this script is stored.
SCRIPT_DIR="$(dirname "$0")"
echo "SCRIPT_DIR=$SCRIPT_DIR"

source "$SCRIPT_DIR/init-spark-env.sh"

#################
SPARK_JAR="/xxxx/spark/dp-spark-0.1-SNAPSHOT.jar"
MAIN_CLASS="be.xxxx.spark.examples.SparkWordCount"
#################
PARAM1="hdfs:///xxxx/input/spark-test/moby_dick.txt"
PARAM2="2"
#################

# App jars: guava 14.0.1 is listed first so it wins over any older guava on the classpath.
CLASSPATH=/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/spark/lib/guava-14.0.1.jar:$CLASSPATH:$SPARK_JAR

#CONFIG_OPTS="-Dspark.master=local -Dspark.jars=/xxxx/spark/dp-spark-0.1-SNAPSHOT.jar"
CONFIG_OPTS="-Dspark.master=spark://cloudera4:7077 -Dspark.jars=$SPARK_JAR"

# -Dspark.master specifies the cluster against which to run the application; local will run all tasks in the same local process.
#  To run against a Spark standalone cluster instead, include a URL containing the master's address (such as spark://masterhost:7077).
#  To run against a YARN cluster, include yarn-client; Spark will determine the YARN ResourceManager's address from the YARN configuration files.

java -cp $CLASSPATH $CONFIG_OPTS $MAIN_CLASS $PARAM1 $PARAM2