
Failing to run a simple Spark application from an sbt project in cloudera-quickstart-vm-5.12

 

Hello, folks

Can someone please help me out? I am new to this and just trying out cloudera-quickstart-vm-5.12.

I installed cloudera-quickstart-vm-5.12.0-0 in VirtualBox, and after that I:
1. upgraded the JDK from jdk1.7.0_67 to jdk1.8.0_152,
2. installed sbt 1.0.2.

Then I created a simple sbt project, shown below. I can run all of the code in spark-shell without creating the sc myself, but when I run it through sbt I get the warnings and errors shown further down.

There is also a warning message that keeps popping up on the screen repeatedly (also included below).

Did I miss or mess up anything?

Your help would be really appreciated.

--- build.sbt
name := "helloSpark"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
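
Since scalaVersion is 2.10.5, the %% above makes sbt resolve the _2.10 artifact, so the dependency is equivalent to the explicit form:

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0"

This matches the spark-core_2.10:1.6.0 entries in the sbt log further down.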

--- ./project/build.properties
sbt.version = 1.0.2

--- helloSpark.scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SparkWordCount {
  def main(args: Array[String]) {
    // create Spark context with Spark configuration
    // val sc = new SparkContext(new SparkConf().setAppName("Hello Spark WordCount"))
    val sc = new SparkContext("local[*]", "Hello Spark WordCount")

    // input file and threshold (hard-coded for now instead of args(0) / args(1).toInt)
    val wordfile = "./countWords.txt" // args(0)
    val threshold = 2 // args(1).toInt

    // read in the text file and split each line into words
    val tokenized = sc.textFile(wordfile).flatMap(_.split(" "))

    // count the occurrences of each word
    val wordCounts = tokenized.map((_, 1)).reduceByKey(_ + _)

    // filter out words with fewer than threshold occurrences
    val filtered = wordCounts.filter(_._2 >= threshold)

    // count characters across the remaining words
    val charCounts = filtered.flatMap(_._1.toCharArray).map((_, 1)).reduceByKey(_ + _)

    System.out.println(charCounts.collect().mkString(", "))
  }
}
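
For reference, what I run in spark-shell is essentially the body of main above, using the sc that the shell already creates (same file path and threshold hard-coded):

--- in spark-shell (sc is pre-created by the shell)
val tokenized = sc.textFile("./countWords.txt").flatMap(_.split(" "))
val wordCounts = tokenized.map((_, 1)).reduceByKey(_ + _)
val filtered = wordCounts.filter(_._2 >= 2)
val charCounts = filtered.flatMap(_._1.toCharArray).map((_, 1)).reduceByKey(_ + _)
charCounts.collect().mkString(", ")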

When I run the same thing from sbt, however, I get the warnings and errors below:

sbt:helloSpark> run
[info] Updating {file:/home/cloudera/workspace/yonga/helloSpark/}hellospark...
[info] Done updating.
[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[warn] * commons-net:commons-net:2.2 is selected over 3.1
[warn] +- org.apache.spark:spark-core_2.10:1.6.0 (depends on 2.2)
[warn] +- org.apache.hadoop:hadoop-common:2.2.0 (depends on 3.1)
[warn] * com.google.guava:guava:14.0.1 is selected over 11.0.2
[warn] +- org.apache.curator:curator-recipes:2.4.0 (depends on 14.0.1)
[warn] +- org.tachyonproject:tachyon-client:0.8.2 (depends on 14.0.1)
[warn] +- org.apache.curator:curator-client:2.4.0 (depends on 14.0.1)
[warn] +- org.tachyonproject:tachyon-underfs-hdfs:0.8.2 (depends on 14.0.1)
[warn] +- org.apache.curator:curator-framework:2.4.0 (depends on 14.0.1)
[warn] +- org.tachyonproject:tachyon-underfs-s3:0.8.2 (depends on 14.0.1)
[warn] +- org.tachyonproject:tachyon-underfs-local:0.8.2 (depends on 14.0.1)
[warn] +- org.apache.hadoop:hadoop-hdfs:2.2.0 (depends on 11.0.2)
[warn] +- org.apache.hadoop:hadoop-common:2.2.0 (depends on 11.0.2)
[warn] * com.google.code.findbugs:jsr305:1.3.9 is selected over 2.0.1
[warn] +- org.apache.spark:spark-network-common_2.10:1.6.0 (depends on 1.3.9)
[warn] +- org.apache.spark:spark-unsafe_2.10:1.6.0 (depends on 1.3.9)
[warn] +- com.google.guava:guava:11.0.2 (depends on 1.3.9)
[warn] +- org.apache.spark:spark-core_2.10:1.6.0 (depends on 1.3.9)
[warn] +- com.fasterxml.jackson.module:jackson-module-scala_2.10:2.4.4 (depends on 2.0.1)
[warn] Run 'evicted' to see detailed eviction warnings
[info] Packaging /home/cloudera/workspace/yonga/helloSpark/target/scala-2.10/hellospark_2.10-1.0.jar ...
[info] Done packaging.
[info] Running SparkWordCount
[debug] Waiting for threads to exit or System.exit to be called.
....
17/10/26 10:17:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.IOException: java.lang.ClassNotFoundException: scala.Some
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1207)
at org.apache.spark.Accumulable.readObject(Accumulators.scala:151)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
....
17/10/26 10:17:34 ERROR TaskResultGetter: Could not deserialize TaskEndReason: ClassNotFound with classloader ClasspathFilter(
parent = URLClassLoader with NativeCopyLoader with RawResources(
urls = Vector(/tmp/sbt_26f6c00a/job-1/target/f4c63b23/hellospark_2.10-1.0.jar,
....
17/10/26 10:17:34 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
....


Also, this warning message keeps popping up on the screen repeatedly:

17/10/26 09:21:32 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@cfb8451,BlockManagerId(driver, localhost, 44419))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:448)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:468)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:468)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 15 more


Re: Failing to run a simple Spark application from an sbt project in cloudera-quickstart-vm-5.12

Hmm,

it actually succeeds and produces the expected result if I spark-submit the jar file.
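
Something roughly like this is what works for me (class name and jar path are taken from the sbt packaging output above; the exact master and options may differ):

spark-submit --class SparkWordCount --master "local[*]" target/scala-2.10/hellospark_2.10-1.0.jar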

So is the problem actually with running it through the sbt console?
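
From what I can tell, sbt's plain run executes the program inside sbt's own JVM behind a filtering classloader (the ClasspathFilter shown in the error above), and that looks like what breaks deserialization of scala.Some on the executor threads. A workaround I have seen suggested, though I have not verified it on this VM yet, is to fork the run into a separate JVM:

--- build.sbt (addition)
// run the main class in a forked JVM instead of inside sbt's own JVM
fork in run := true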
