Member since 03-19-2018 · 12 Posts · 0 Kudos Received · 0 Solutions
05-22-2018 02:38 PM
Hi, I'm doing the following tutorial:
https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model
I'm using HDP-2.5.0.0-1245, Spark 1.6.2, and Scala 2.10.5. I have reached the step of the tutorial that says "Then use spark-submit to deploy the jar to Spark." I am trying to submit the assembly jar located in target/main/scala with the following command:
/usr/hdp/current/spark2-client/bin/spark-submit --class "main.scala.Collect" --master local[4] ./SentimentAnalysis-assembly-2.0.0.jar
Everything goes well except for the following errors:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: kafka. Please find packages at http://spark-packages.org
at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:145)
at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:78)
at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:78)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:195)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:79)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:79)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:142)
at main.scala.Collect$.main(Collect.scala:61)
at main.scala.Collect.main(Collect.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: kafka.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:130)
... 18 more
My build.sbt file is:
name := "SentimentAnalysis"

version := "2.0.0"

scalaVersion := "2.10.5"//"2.10.4"//

libraryDependencies ++= {
  val sparkVer = "2.1.0"//"1.6.1"//
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-mllib" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql" % sparkVer withSources(),
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer withSources(),
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer withSources(),
    "org.apache.kafka" %% "kafka" % "0.10.0" withSources(),
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}

assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
  case PathList("javax", "xml", xs @ _*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
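Since the exception says the "kafka" data source cannot be found, one thing I plan to try is passing the Structured Streaming Kafka connector explicitly at submit time instead of relying on the assembly jar. This is only a sketch and assumes a Scala 2.10 build of the connector for Spark 2.1.0 can be resolved from Maven Central:

# Same submit command as above, plus --packages so Spark downloads the
# spark-sql-kafka-0-10 connector and puts it on the driver/executor classpath.
/usr/hdp/current/spark2-client/bin/spark-submit \
  --class "main.scala.Collect" \
  --master local[4] \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.10:2.1.0 \
  ./SentimentAnalysis-assembly-2.0.0.jar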
Labels:
- Apache Kafka
- Apache Spark
05-16-2018 10:34 PM
Hi @Felix Albani, I'm using HDP-2.5.0.0-1245, Spark 1.6.2, and Scala 2.10.5. Could it be that HDP-2.5.0.0-1245 simply does not include the missing class?
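To answer that myself, I am going to check whether the class from the stack trace is actually shipped in the cluster's Spark 2 jars. This is only a sketch and assumes the default HDP spark2-client layout:

# Search every jar under the spark2-client for the missing trait implementation class;
# if nothing is printed, the installed Spark 2 build does not contain it.
for j in /usr/hdp/current/spark2-client/jars/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -Fq 'streaming/Source$class' && echo "$j"
done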
05-16-2018 03:10 PM
Hi, I'm doing the following tutorial:
https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model
I have reached the step of the tutorial that says "Then use spark-submit to deploy the jar to Spark." I am trying to submit the assembly jar located in target/main/scala with the following command:
/usr/hdp/current/spark2-client/bin/spark-submit --class "main.scala.Collect" --master local[4] ./SentimentAnalysis-assembly-2.0.0.jar
Everything goes well except for the following errors:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
root
|-- value: string (nullable = true)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/streaming/Source$class
at org.apache.spark.sql.kafka010.KafkaSource.<init>(KafkaSource.scala:84)
at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:152)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:240)
at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:245)
at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.applyOrElse(StreamingQueryManager.scala:241)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:279)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:278)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:179)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:319)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:284)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:268)
at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:241)
at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:287)
at main.scala.Collect$.main(Collect.scala:90)
at main.scala.Collect.main(Collect.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.execution.streaming.Source$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 47 more
My build.sbt file is:
name := "SentimentAnalysis"

version := "2.0.0"

scalaVersion := "2.10.5"//"2.10.4"//

libraryDependencies ++= {
  val sparkVer = "2.1.0"//"1.6.1"//
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-mllib" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVer withSources(),
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer withSources(),
    "org.apache.kafka" %% "kafka" % "0.10.0" withSources(),
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}
assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("javax", "xml", xs @ _*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
If anyone can help me, thank you very much.
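In case the root cause is a mismatch between the Spark version I compile against (2.1.0) and the spark2-client actually installed on the cluster, this is the build.sbt variant I would try next. It is only a sketch: it assumes the cluster's Spark 2 is a 2.0.x build for Scala 2.11, which I have not confirmed, and the assembly merge strategy stays the same as above.

name := "SentimentAnalysis"

version := "2.0.0"

// Assumption: the cluster's spark2-client is built for Scala 2.11.
scalaVersion := "2.11.8"

libraryDependencies ++= {
  // Assumption: set this to the exact version printed by
  // /usr/hdp/current/spark2-client/bin/spark-submit --version
  val sparkVer = "2.0.2"
  Seq(
    "org.apache.spark" %% "spark-core"           % sparkVer % "provided",
    "org.apache.spark" %% "spark-mllib"          % sparkVer % "provided",
    "org.apache.spark" %% "spark-sql"            % sparkVer % "provided",
    "org.apache.spark" %% "spark-streaming"      % sparkVer % "provided",
    // Bundled (not "provided") so the "kafka" source ends up inside the assembly jar.
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVer,
    "com.typesafe" % "config" % "1.3.1",
    "com.google.code.gson" % "gson" % "2.8.0"
  )
}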
Labels:
- Apache Spark
05-15-2018 03:16 PM
@Wynner If I delete that property from the processor, PublishKafka works fine, but the ConsumeKafka and PutSolrContentStream processors do not receive any messages. The NiFi version I am using is 1.0.0-DEMO.
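To rule out the Kafka side, I also plan to check from the command line whether any messages actually reach the topic while the flow is running. This is only a sketch: the topic name "tweets" and the ZooKeeper address are placeholders for whatever the flow really uses.

# Tail the topic with the console consumer shipped with the HDP Kafka broker;
# if nothing shows up here, the problem is upstream of ConsumeKafka.
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --zookeeper localhost:2181 \
  --topic tweets \
  --from-beginning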
05-15-2018 02:20 PM
Hi @Wynner, I am using version HDP-2.5.0.0-1245. Regarding the processor configuration: the problem is with the configuration of the ack.wait.time parameter.
05-15-2018 02:43 AM
I'm doing the following tutorial: https://es.hortonworks.com/tutorial/deploying-machine-learning-models-using-spark-structured-streaming/#deploying-the-model. When configuring NiFi to stream tweets to Kafka, I get the following error while executing the data flow: If anyone can help me, thank you very much.
Labels:
- Apache Kafka
- Apache NiFi