New Contributor
Posts: 2
Registered: ‎12-30-2015

SparkJobServer not finding the org.apache.spark.sql.SQLContext

Hi, I am trying to use the JobServer to create an SQLContext and share it with Java code to perform further processing and to store an RDD as a table. This table will then be accessed by another job using the same context.

 

But when I use the code below to instantiate it, I get an exception. Can anyone point me to a resolution?

CODE: 

=================================================

 

import com.typesafe.config.Config
import org.apache.spark.SparkContext

object GetOrCreateUsers extends UsersSparkJob {
  override def runJob(sc: SparkContext, config: Config) = {
    // This line throws NoClassDefFoundError at runtime -- see the trace below.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  }
}

 

EXCEPTION: 

=================================================

job-server Starting spark.jobserver.JobServer.main()
job-server[ERROR] Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[success] Total time: 2 s, completed Jan 14, 2016 7:03:49 PM
job-server[ERROR] Uncaught error from thread [JobServer-akka.actor.default-dispatcher-5] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[JobServer]
job-server[ERROR] java.lang.NoClassDefFoundError: org/apache/spark/sql/SQLContext
job-server[ERROR] at sparking.jobserver.GetOrCreateUsers$.runJob(UsersSparkJobs.scala:20)
job-server[ERROR] at sparking.jobserver.GetOrCreateUsers$.runJob(UsersSparkJobs.scala:16)
job-server[ERROR] at spark.jobserver.JobManagerActor$$anonfun$spark$jobserver$JobManagerActor$$getJobFuture$4.apply(JobManagerActor.scala:222)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
job-server[ERROR] at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
job-server[ERROR] at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
job-server[ERROR] Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SQLContext
job-server[ERROR] at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
job-server[ERROR] ... 11 more
job-server ... finished with exit code 255

=================================================

 

 

Configuration: build.sbt

=================================================

name := "spark-jobserver_Jan1416"

version := "1.0.0"

scalacOptions ++= Seq("-deprecation")

lazy val buildSettings = Seq(
  version := "0.1-SNAPSHOT",
  organization := "com.github.fedragon.sparking.jobserver",
  scalaVersion := "2.10.4"
)

resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"

libraryDependencies ++= Seq(
  "joda-time" % "joda-time" % "2.3",
  "org.joda" % "joda-convert" % "1.2",
  ("org.apache.spark" %% "spark-core" % "1.1.1").
    exclude("org.mortbay.jetty", "servlet-api").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("com.esotericsoftware.minlog", "minlog").
    exclude("junit", "junit").
    exclude("org.slf4j", "log4j12"),
  "org.apache.spark" %% "spark-sql" % "1.1.1",
  "ooyala.cnd" % "job-server" % "0.4.0" % "provided"
)

=================================================

 

Configuration: Cluster

CDH 5.4.7, Parcels

Spark: 1.3.0

 

 

It would be a great help if any pointers were available on how to use spark.jobserver.SparkJob and spark.jobserver.NamedRddSupport from Java code.
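
For reference, here is the pattern I am aiming for, sketched in Scala. This is only my reading of the spark-jobserver NamedRddSupport API (namedRdds.update / namedRdds.get); the job names and data are made up, and I assume the Java version would go through the same calls:

=================================================

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

// First job: build an RDD and cache it under a name that other jobs
// running in the same (persistent) context can look up.
object StoreUsers extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    val users = sc.parallelize(Seq("alice", "bob"))  // made-up data
    namedRdds.update("users", users)                 // cache under a shared name
    users.count()
  }
}

// Second job: look the RDD up by name instead of recomputing it.
object ReadUsers extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any =
    namedRdds.get[String]("users").map(_.collect().toList).getOrElse(Nil)
}

=================================================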

 

Thank you for your assistance.

Cloudera Employee
Posts: 322
Registered: ‎01-16-2014

Re: SparkJobServer not finding the org.apache.spark.sql.SQLContext

The Spark job server is not a Cloudera-provided application, so you will need to get support from the team that hosts the code at its main dev branch. That said, I can see one huge problem: you are trying to use job server version 0.4.0, which targets an older release of Spark (1.0.2) than the one shipped in CDH 5.4 (1.3.x). Make sure that you use the matching version and fix your project compilation accordingly.
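
To make that concrete, here is a sketch of how the dependency section might look once the versions line up. It is not something I have tested against your setup: the coordinates (the spark.jobserver group and the 0.5.x line, which per the upstream project's version matrix targets Spark 1.3) should be verified against that project's README:

=================================================

libraryDependencies ++= Seq(
  // Match the cluster: CDH 5.4 ships Spark 1.3.x. "provided" keeps
  // Spark itself out of the job jar; the server supplies it at runtime.
  "org.apache.spark" %% "spark-core" % "1.3.0" % "provided",
  // spark-sql must also be visible at runtime: the NoClassDefFoundError
  // above means it was not, so bundle it in the job jar (compile scope,
  // as here) or add it to the server's classpath.
  "org.apache.spark" %% "spark-sql" % "1.3.0",
  // A job server release built against Spark 1.3, not 0.4.0 (Spark 1.0.2).
  "spark.jobserver" %% "job-server-api" % "0.5.2" % "provided"
)

=================================================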

Also, Spark in CDH starts from a base version of upstream Spark and adds fixes on top of it, so you might need a slightly different version of the job server than you expect. It is entirely up to you to make sure it works for your use case and is stable. We are working on an equivalent job server as part of CDH.
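
If you want to compile against the CDH-patched Spark rather than upstream, a sketch of the coordinates follows. The exact -cdh version string must match your parcel (I am assuming one for CDH 5.4.7 here; verify that it exists in the Cloudera repository):

=================================================

resolvers += "Cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies ++= Seq(
  // CDH builds pair the upstream Spark version with the parcel version;
  // the exact string below is assumed, not verified.
  "org.apache.spark" %% "spark-core" % "1.3.0-cdh5.4.7" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.3.0-cdh5.4.7" % "provided"
)

=================================================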

 

Wilfred