Member since 03-19-2018 · 12 Posts · 0 Kudos Received · 0 Solutions
05-22-2018 02:38 PM
Hi, I'm doing the following tutorial:
https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model
I'm using HDP-2.5.0.0-1245, Spark 1.6.2, and Scala 2.10.5. I have reached the step of the tutorial that says "Then use spark-submit to deploy the jar to Spark." I am trying to submit the assembly jar located in target/main/scala with the following command:
/usr/hdp/current/spark2-client/bin/spark-submit --class "main.scala.Collect" --master local[4] ./SentimentAnalysis-assembly-2.0.0.jar
Everything goes well except for the following errors:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: kafka. Please find packages at http://spark-packages.org
at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:145)
at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:78)
at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:78)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:195)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:79)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:79)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:142)
at main.scala.Collect$.main(Collect.scala:61)
at main.scala.Collect.main(Collect.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: kafka.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:130)
... 18 more
My build.sbt file is:
name := "SentimentAnalysis"

version := "2.0.0"

scalaVersion := "2.10.5"//"2.10.4"//

libraryDependencies ++= {
  val sparkVer = "2.1.0"//"1.6.1"//
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-mllib" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql" % sparkVer withSources(),
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer withSources(),
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer withSources(),
    "org.apache.kafka" %% "kafka" % "0.10.0" withSources(),
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}

assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
  case PathList("javax", "xml", xs @ _*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
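Since the exception says the "kafka" data source cannot be found, one thing I plan to try is passing the Structured Streaming Kafka connector explicitly at submit time instead of relying on the assembly jar. This is only a sketch and assumes a Scala 2.10 build of the connector for Spark 2.1.0 can be resolved from Maven Central:

# Same submit command as above, plus --packages so Spark downloads the
# spark-sql-kafka-0-10 connector and puts it on the driver/executor classpath.
/usr/hdp/current/spark2-client/bin/spark-submit \
  --class "main.scala.Collect" \
  --master local[4] \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.10:2.1.0 \
  ./SentimentAnalysis-assembly-2.0.0.jar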
Labels:
- Apache Kafka
- Apache Spark
05-16-2018 10:34 PM
Hi @Felix Albani, I'm using HDP-2.5.0.0-1245, Spark 1.6.2, and Scala 2.10.5. Could it be that HDP-2.5.0.0-1245 simply does not include the missing class?
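To answer that myself, I am going to check whether the class from the stack trace is actually shipped in the cluster's Spark 2 jars. This is only a sketch and assumes the default HDP spark2-client layout:

# Search every jar under the spark2-client for the missing trait implementation class;
# if nothing is printed, the installed Spark 2 build does not contain it.
for j in /usr/hdp/current/spark2-client/jars/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -Fq 'streaming/Source$class' && echo "$j"
done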
05-16-2018 03:10 PM
Hi, I'm doing the following tutorial:
https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model
I have reached the step of the tutorial that says "Then use spark-submit to deploy the jar to Spark." I am trying to submit the assembly jar located in target/main/scala with the following command:
/usr/hdp/current/spark2-client/bin/spark-submit --class "main.scala.Collect" --master local[4] ./SentimentAnalysis-assembly-2.0.0.jar
Everything goes well except for the following errors:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
root
|-- value: string (nullable = true)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/streaming/Source$class
at org.apache.spark.sql.kafka010.KafkaSource.<init>(KafkaSource.scala:84)
at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:152)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:240)
at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:245)
at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:241)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:278)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:268)
at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:241)
at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:287)
at main.scala.Collect$.main(Collect.scala:90)
at main.scala.Collect.main(Collect.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.execution.streaming.Source$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 47 more
My build.sbt file is:
name := "SentimentAnalysis"

version := "2.0.0"

scalaVersion := "2.10.5"//"2.10.4"//

libraryDependencies ++= {
  val sparkVer = "2.1.0"//"1.6.1"//
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-mllib" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer withSources(),
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer withSources(),
    "org.apache.kafka" %% "kafka" % "0.10.0" withSources(),
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}
assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("javax", "xml", xs @ _*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
If anyone can help me, thank you very much.
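In case the root cause is a mismatch between the Spark version I compile against (2.1.0) and the spark2-client actually installed on the cluster, this is the build.sbt variant I would try next. It is only a sketch: it assumes the cluster's Spark 2 is a 2.0.x build for Scala 2.11, which I have not confirmed, and the assembly merge strategy stays the same as above.

name := "SentimentAnalysis"

version := "2.0.0"

// Assumption: the cluster's spark2-client is built for Scala 2.11.
scalaVersion := "2.11.8"

libraryDependencies ++= {
  // Assumption: set this to the exact version printed by
  // /usr/hdp/current/spark2-client/bin/spark-submit --version
  val sparkVer = "2.0.2"
  Seq(
    "org.apache.spark" %% "spark-core"           % sparkVer % "provided",
    "org.apache.spark" %% "spark-mllib"          % sparkVer % "provided",
    "org.apache.spark" %% "spark-sql"            % sparkVer % "provided",
    "org.apache.spark" %% "spark-streaming"      % sparkVer % "provided",
    // Bundled (not "provided") so the "kafka" source ends up inside the assembly jar.
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer,
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}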
Labels:
- Apache Spark
05-15-2018 03:16 PM
@Wynner If I delete that property from the processor, PublishKafka works fine, but the ConsumeKafka and PutSolrContentStream processors do not receive any messages. The NiFi version I am using is 1.0.0-DEMO.
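To rule out the Kafka side, I also plan to check from the command line whether any messages actually reach the topic while the flow is running. This is only a sketch: the topic name "tweets" and the ZooKeeper address are placeholders for whatever the flow really uses.

# Tail the topic with the console consumer shipped with the HDP Kafka broker;
# if nothing shows up here, the problem is upstream of ConsumeKafka.
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --zookeeper localhost:2181 \
  --topic tweets \
  --from-beginning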
05-15-2018 02:20 PM
Hi @Wynner, I am using version HDP-2.5.0.0-1245. Regarding the processor configuration: the problem is with the configuration of the ack.wait.time parameter.
05-15-2018 02:43 AM
I'm doing the following tutorial: https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model. When configuring NiFi to stream tweets to Kafka, I get the following error while executing the data flow: If anyone can help me, thank you very much.
Labels:
- Apache Kafka
- Apache NiFi