12-07-2014 07:23 AM
Just to follow up: see this thread (http://community.cloudera.com/t5/Apache-Hadoop-Concepts-and/spark-jobserver-and-spark-jobs-in-Hue-on-CDH5-2/m-p/22410#M1369) where I detail rebuilding the spark-jobserver and getting things to work. So the problem I encountered does look like it was caused by the CDH 5.2 QuickStart VM shipping a spark-jobserver compiled against Spark 0.9.1, which is incompatible with the Spark 1.1.0 runtime in the VM. Thanks, Chris
12-05-2014 08:25 AM
I'm not building the spark-jobserver myself; however, it certainly seems to be included in the VM in the /var/lib/cloudera-quickstart/spark-jobserver directory, along with some nohup output indicating it is likely running:

```
[cloudera@quickstart spark-jobserver]$ pwd
/var/lib/cloudera-quickstart/spark-jobserver
[cloudera@quickstart spark-jobserver]$ ls
gc.out  nohup.out  settings.sh
log4j-server.properties  server_start.sh  spark-job-server.jar
```

There is also a .pid file indicating the spark-jobserver is running, and its PID matches what jps reports, so it certainly seems to be running:

```
[cloudera@quickstart cloudera-quickstart]$ tail spark-jobserver.pid
1869
[cloudera@quickstart cloudera-quickstart]$ sudo jps | grep JobServer
1869 JobServer
```
12-05-2014 07:55 AM
I'm just using the CDH 5.2 QuickStart VM. As far as I could see, the spark-jobserver was already included and running. Maybe I'm mistaken? And why do I need to execute with spark-submit? I should just be able to navigate to the Spark Editor in Hue, upload my jar, and execute it there, correct? (Under the covers it is of course doing something.)
12-05-2014 07:52 AM
Couldn't find a way to edit my post, but I meant that I hadn't yet verified which version of Spark the spark-jobserver examples would be built against by default.
12-05-2014 07:49 AM
Yep, I already tried building against the exact CDH 5.2 artifacts. From my pom.xml:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.1.0-cdh5.2.0</version>
  <scope>provided</scope>
</dependency>
```

I had to add the Cloudera repository to the pom.xml to resolve that artifact. After packaging with Maven, I checked the generated jar and verified that the only thing included was my simple test class. I've modified my simple test class (Java) to try two approaches: 1) just a main() method that sets up the SparkContext in such a way that it works fine with spark-submit, and 2) changing the class to implement SparkJob so it should be runnable by the jobserver and be handed the SparkContext. Since I'm just using Java, I'm not referencing any Scala libraries and not using the Scala build tools, but in case that was the problem I set up the Scala build tools, compiled the spark-jobserver tests themselves, and tried to run spark.jobserver.WordCountExample. I didn't get a chance to verify which version of Scala those examples are built against by default, though, so it's possible they were built against an older version.
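For reference, the "throw in the Cloudera repository" step mentioned above would look something like the following in the pom.xml. This is a sketch, not taken from the original post; the repository id is arbitrary, and the URL is Cloudera's public Maven repository as commonly documented for CDH artifacts:

```xml
<!-- Sketch: repository entry needed so Maven can resolve
     1.1.0-cdh5.2.0 artifacts; id is arbitrary, URL assumed. -->
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>
```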
12-04-2014 10:11 PM
I downloaded the QuickStart VM with CDH 5.2.x so that I could try out CDH 5.2 on VirtualBox. I can run a Spark job from the command line, but I've been pretty frustrated trying to get a Spark job to run using the Spark app in Hue (http://quickstart.cloudera:8888/spark). No matter what I try, I get a NoSuchMethodError: "Your application has the following error(s):"

```json
{
  "status": "ERROR",
  "result": {
    "errorClass": "java.lang.RuntimeException",
    "cause": "org.apache.spark.SparkContext$.$lessinit$greater$default$2()Lscala/collection/Map;",
    "stack": [
      "spark.jobserver.JobManagerActor.createContextFromConfig(JobManagerActor.scala:241)",
      "spark.jobserver.JobManagerActor$$anonfun$wrappedReceive$1.applyOrElse(JobManagerActor.scala:94)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)",
      "ooyala.common.akka.ActorStack$$anonfun$receive$1.applyOrElse(ActorStack.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)",
      "ooyala.common.akka.Slf4jLogging$$anonfun$receive$1$$anonfun$applyOrElse$1.apply$mcV$sp(Slf4jLogging.scala:26)",
      "ooyala.common.akka.Slf4jLogging$class.ooyala$common$akka$Slf4jLogging$$withAkkaSourceLogging(Slf4jLogging.scala:35)",
      "ooyala.common.akka.Slf4jLogging$$anonfun$receive$1.applyOrElse(Slf4jLogging.scala:25)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)",
      "scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)",
      "ooyala.common.akka.ActorMetrics$$anonfun$receive$1.applyOrElse(ActorMetrics.scala:24)",
      "akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)",
      "akka.actor.ActorCell.invoke(ActorCell.scala:456)",
      "akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)",
      "akka.dispatch.Mailbox.run(Mailbox.scala:219)",
      "akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)",
      "scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)",
      "scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)",
      "scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)",
      "scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)"
    ],
    "causingClass": "java.lang.NoSuchMethodError",
    "message": "java.lang.NoSuchMethodError: org.apache.spark.SparkContext$.$lessinit$greater$default$2()Lscala/collection/Map;"
  }
}
```

(error 500) I've attempted to build my own apps following some of the spark-jobserver guides, and I've followed the second half of http://gethue.com/get-started-with-spark-deploy-spark-server-and-compute-pi-from-your-web-browser/ to clone the spark-jobserver examples and build them. I've found people reporting something similar in the spark-jobserver issues (https://github.com/ooyala/spark-jobserver/issues/29). Those issues still seem to be open, and the originator indicated the failure started when moving from Spark 0.9.1 to 1.0. The CDH 5.2 QuickStart VM I have DOES seem to have Spark 1.1.0, but I'm giving Cloudera the benefit of the doubt that they didn't simply bump the Spark version in this VM, making it incompatible with the spark-jobserver that Hue uses. I would appreciate any guidance. I'd like to get something working so I can start playing around with kicking off jobs and getting results from an external app over HTTP to the spark-jobserver. Thanks!
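As background for the NoSuchMethodError above (this sketch is not from the original thread): the JVM raises that error when bytecode compiled against one version of a class references a method signature that the class actually loaded at runtime does not provide. Here, the jobserver's bytecode references the Scala-generated default-argument method on Spark 0.9.x's SparkContext$, which no longer exists in Spark 1.1.0. A minimal, Spark-free illustration of the same kind of lookup failure, using reflection and a deliberately hypothetical method name:

```java
// Demonstrates the linkage concept behind NoSuchMethodError: code can refer
// to a method by name and signature that is simply absent from the class
// present at runtime. The method name below is made up on purpose.
public class MissingMethodDemo {
    public static void main(String[] args) {
        try {
            // java.lang.String has never had a method with this name, just as
            // Spark 1.1.0's SparkContext$ lacks the 0.9.x default-arg method.
            String.class.getMethod("lessInitGreaterDefault2");
            System.out.println("method found");
        } catch (NoSuchMethodException e) {
            System.out.println("method missing: " + e.getMessage());
        }
    }
}
```

The fix implied by this is to recompile the caller (the spark-jobserver) against the Spark version actually on the runtime classpath, which matches the resolution described in the follow-up post.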