Kafka with Spark Structured Streaming

Contributor

Hi,

We are trying to use Kafka with Spark Structured Streaming. For that we imported the spark-sql-kafka-0-10_2.11-2.0.2 module in spark-shell, but executing the query fails with the error below:

pplyOrElse(StreamExecution.scala:153)
    at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(TreeNode.scala:306)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
    at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan$lzycompute(StreamExecution.scala:153)
    at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan(StreamExecution.scala:147)
    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$runBatches(StreamExecution.scala:276)
    at org.apache.spark.sql.execution.streaming.StreamExecution$anon$1.run(StreamExecution.scala:206)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 47 more
Exception in thread "main" org.apache.spark.sql.streaming.StreamingQueryException: org/apache/kafka/common/serialization/ByteArrayDeserializer
=== Streaming Query ===
Identifier: [id = f230ea99-b84b-44ef-ba49-e8ecb02ae810, runId = 82357c6a-306d-4e6f-96a8-6238302ff59d]
Current Committed Offsets: {}
Current Available Offsets: {}
Current State: INITIALIZING
Thread State: RUNNABLE
    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$runBatches(StreamExecution.scala:343)
    at org.apache.spark.sql.execution.streaming.StreamExecution$anon$1.run(StreamExecution.scala:206)
Caused by: java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
    at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:74)
    at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:246)

We face the same issue even when running the example with the ./runexample command.

2 REPLIES

New Contributor

@subbiram Padala

Check the history server to confirm that the kafka-clients jar is included in your classpath. If it is not there, that is most likely the cause of your error.

The class named in the error (org.apache.kafka.common.serialization.ByteArrayDeserializer) is contained in that jar, and the error will arise if the jar is not included when starting spark-shell or running spark-submit. You can include it with --driver-class-path; the jar should be located in the libs folder of your Kafka broker install folder.
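
To make that concrete, here is a minimal sketch of how the jar could be located and passed on the command line. The two function names are hypothetical helpers written for this reply, not part of Spark or Kafka, and the libs directory is an assumption you should replace with your own path. The second helper only prints the command, so you can inspect it before running it. It also adds --jars, on the assumption that the executors need the class as well as the driver.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or Kafka): print the path of the
# first kafka-clients jar found in the given libs directory.
find_kafka_clients_jar() {
    for candidate in "$1"/kafka-clients-*.jar; do
        # An unmatched glob stays literal, so check that the file really exists.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no kafka-clients jar found in $1" >&2
    return 1
}

# Hypothetical helper: print (without running) a spark-shell command that puts
# the jar on the driver classpath and also ships it to the executors.
print_spark_shell_command() {
    jar=$(find_kafka_clients_jar "$1") || return 1
    printf 'spark-shell --driver-class-path %s --jars %s\n' "$jar" "$jar"
}

# Example (the directory is an assumption; use your own Kafka libs folder):
#   print_spark_shell_command /usr/hdp/2.6.5.0-292/kafka/libs
```

If the machine can reach a Maven repository, another option that may work is starting spark-shell with --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.0.2, which should resolve the connector together with its kafka-clients dependency.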

Expert Contributor

I tried adding kafka-clients-1.0.0.jar to the /usr/hdp/2.6.5.0-292/kafka/libs folder, but it did not help.