Created 05-16-2018 03:10 PM
Hi, I'm following this tutorial:
https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streami...
I have reached this step of the tutorial: "Then use spark-submit to deploy the jar to Spark."
I am trying to submit the assembly jar located in target/main/scala with the following command:
/usr/hdp/current/spark2-client/bin/spark-submit --class "main.scala.Collect" --master local[4] ./SentimentAnalysis-assembly-2.0.0.jar
The submit starts fine, but then it fails with the following errors:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
root
 |-- value: string (nullable = true)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/streaming/Source$class
  at org.apache.spark.sql.kafka010.KafkaSource.<init>(KafkaSource.scala:84)
  at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:152)
  at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:240)
  at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:245)
  at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:241)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:278)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:268)
  at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:241)
  at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:287)
  at main.scala.Collect$.main(Collect.scala:90)
  at main.scala.Collect.main(Collect.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.execution.streaming.Source$class
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 47 more
my build.sbt file is:
name := "SentimentAnalysis"

version := "2.0.0"

scalaVersion := "2.10.5" //"2.10.4"

libraryDependencies ++= {
  val sparkVer = "2.1.0" //"1.6.1"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-mllib" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer withSources(),
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer withSources(),
    "org.apache.kafka" %% "kafka" % "0.10.0" withSources(),
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}

assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("javax", "xml", xs @ _*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
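In case it helps, here is how I can check what the cluster itself is running (paths as in the spark-submit command above; `hdp-select` is assumed to be available on an HDP node):

/usr/hdp/current/spark2-client/bin/spark-submit --version   # prints the Spark build and its Scala version
hdp-select versions                                         # lists the installed HDP stack versions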
If anyone can help me, thank you very much.
Created 05-16-2018 07:18 PM
@David Sandoval Which version of HDP are you running this on? I believe the missing class was only added starting with HDP 2.6.1. I also noticed you are building Spark 2.1 against Scala 2.10; Spark 2.1.0 uses Scala 2.11, so you should change this as well.
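A build.sbt fragment consistent with that advice might look like the sketch below. The exact patch version of Scala (2.11.8 here) is an assumption; it should match whatever Spark build the cluster actually ships:

// Sketch only: align Scala with the Spark 2.1.x binaries (Spark 2.1.0 is built for Scala 2.11)
scalaVersion := "2.11.8"

libraryDependencies ++= {
  val sparkVer = "2.1.0"
  Seq(
    // Cluster-provided modules stay "provided" so they are not bundled into the assembly
    "org.apache.spark" %% "spark-core"      % sparkVer % "provided",
    "org.apache.spark" %% "spark-sql"       % sparkVer % "provided",
    "org.apache.spark" %% "spark-mllib"     % sparkVer % "provided",
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided",
    // Kafka connectors are not on the cluster classpath, so they remain in the assembly
    "org.apache.spark" %% "spark-sql-kafka-0-10"       % sparkVer,
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer
  )
}

Since the `%%` operator resolves the `_2.11` artifacts from `scalaVersion`, a full `sbt clean assembly` rebuild is needed after changing it.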
HTH
Created 05-16-2018 10:34 PM
Hi, @Felix Albani
The HDP version I'm using is HDP-2.5.0.0-1245, the Spark version is 1.6.2, and the Scala version is 2.10.5.
Is it certain that HDP-2.5.0.0-1245 does not include the missing class?