I am currently testing CDSW 1.8.0 on a Cloudera Express 5.16.2 cluster. Most things seem to work fine, except for the Scala workbench: it starts up, but before the engine is fully loaded it crashes with exit code 1.
The log contains the following (rather puzzling) stack trace:
Exception in thread "main" java.lang.NoSuchMethodError: joptsimple.OptionParser.acceptsAll(Ljava/util/Collection;Ljava/lang/String;)Ljoptsimple/OptionSpecBuilder;
    at org.apache.toree.boot.CommandLineOptions.<init>(CommandLineOptions.scala:37)
    at org.apache.toree.Main$.delayedEndpoint$org$apache$toree$Main$1(Main.scala:25)
    at org.apache.toree.Main$delayedInit$body.apply(Main.scala:24)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at org.apache.toree.Main$.main(Main.scala:24)
    at org.apache.toree.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
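For context, a NoSuchMethodError like this usually means two different versions of a library (here jopt-simple) are on the classpath, and an older one that lacks the `acceptsAll(Collection, String)` overload is shadowing the one Toree was compiled against. A small sketch (assuming a parcel-style layout; the path and helper name are illustrative, not part of any CDSW tooling) to list the candidate jars on a host:

```python
# Hypothetical diagnostic: walk a directory tree and report every
# jopt-simple jar that a Spark job on this host could pick up, so
# version conflicts become visible.
import os

def find_joptsimple_jars(root):
    """Return a sorted list of jopt-simple jar paths found under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("jopt-simple") and name.endswith(".jar"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)

# On a gateway host (path is an assumption for a typical parcel install):
# for jar in find_joptsimple_jars("/opt/cloudera/parcels"):
#     print(jar)
```

If this turns up more than one version, the jar that comes first on the driver classpath is the one that wins.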
Hello, are you still having issues with this?
What engine image are you using?
Can you ensure that there are Spark gateways on your CDSW nodes? The Scala kernel is different from the Python and R kernels in that it makes a connection to Spark at startup instead of waiting until you create the Spark context.
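A quick way to confirm that a Spark gateway's client configuration was actually deployed on a host is to look for the deployed config files. A minimal sketch, assuming the usual CDH client-config location (`/etc/spark/conf`), which can differ between installs:

```python
# Sketch: check whether a Spark client configuration has been deployed
# on this host. The default path is an assumption for a typical CDH
# gateway deployment, not something CDSW itself guarantees.
import os

def has_spark_client_config(conf_dir="/etc/spark/conf"):
    """Return True if a deployed Spark client config is present."""
    return os.path.isfile(os.path.join(conf_dir, "spark-defaults.conf"))
```

If this returns False on a CDSW node, the gateway role (or a "Deploy Client Configuration" action) is likely missing.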
Hi Mike, thanks for replying
Yes, the problem persists. To test whether this was a CDSW 1.8.0 bug, I upgraded to CDSW 1.8.1, so I have now tried two engine-deps images (tags 5eeadfc and 19e3ca6) and engine image 13 (on 1.8.0 I also tried engine 13-python3.7). No luck with any of these.
However, you state that I should have Spark running on the CDSW nodes. I am actually evaluating CDSW for my company on a POC CDH 5.16 installation, in which I could never get the Spark parcels to install. Do you believe that to be the root cause? If so, an error message complaining about a missing joptsimple method seems unclear.
Hey, sorry for the delay in getting back to you. You don't need Spark installed on the CDSW nodes themselves, but you DO need the Spark Gateway roles: that is how CDSW picks up the Spark configuration and knows which jar files to use, etc.

If you have this working with Python, does that mean you can run the "pi.py" example, which uses Spark, from a Python session? If so, then the problem is not the gateways. If Spark is not installed at all, then yes, I think that is probably the problem, since the Scala workbench tries to create a SparkSession object immediately. That might also be why your stack trace has lots of Spark paths in it near the end.
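For anyone following along, a minimal sketch of the "pi.py" smoke test mentioned above: the Monte Carlo logic is plain Python and runs anywhere, while the commented part shows roughly how it would run through Spark in a CDSW Python session (the session config shown is illustrative and assumes a working gateway).

```python
# Monte Carlo pi estimation, in the style of Spark's classic pi.py example.
import random

def inside_unit_circle(_):
    """Return 1 if a random point in the unit square lands inside the quarter circle."""
    x, y = random.random(), random.random()
    return 1 if x * x + y * y <= 1 else 0

def estimate_pi(count, samples):
    """Turn a hit count over `samples` draws into a pi estimate."""
    return 4.0 * count / samples

# With a Spark gateway configured, the distributed version looks like:
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.appName("pi-check").getOrCreate()
# n = 100000
# count = spark.sparkContext.parallelize(range(n)).map(inside_unit_circle).sum()
# print("Pi is roughly %f" % estimate_pi(count, n))
# spark.stop()
```

If the commented Spark portion fails before any work is distributed, that points at the gateway/configuration rather than the kernel.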