<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Kafka class not found error when running Atlas hook in Oozie in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312722#M225240</link>
    <description>&lt;P&gt;Please let me know if any further help is required on this issue.&lt;/P&gt;</description>
    <pubDate>Wed, 10 Mar 2021 06:32:14 GMT</pubDate>
    <dc:creator>RangaReddy</dc:creator>
    <dc:date>2021-03-10T06:32:14Z</dc:date>
    <item>
      <title>Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312555#M225186</link>
      <description>&lt;P&gt;I am running a Spark action on Oozie in a Cloudera 7.1.4 cluster.&lt;/P&gt;&lt;P&gt;The action itself completes successfully, but the logs contain a stack trace showing that the Atlas hook failed. It looks like the application is looking for the Spark SQL Kafka 0.10 jar.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;&amp;lt;&amp;lt; Invocation of Main class completed &amp;lt;&amp;lt;&amp;lt;

2021-03-08 09:31:00,968 [SparkExecutionPlanProcessor-thread] WARN  com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor  - Caught exception during parsing event
java.lang.ClassNotFoundException: org.apache.spark.sql.kafka010.KafkaRelation
	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:398)
	at com.hortonworks.spark.atlas.utils.ReflectionHelper$.classForName(ReflectionHelper.scala:113)
	at org.apache.spark.sql.kafka010.atlas.ExtractFromDataSource$.isKafkaRelation(ExtractFromDataSource.scala:187)
	at com.hortonworks.spark.atlas.sql.CommandsHarvester$KafkaEntities$.unapply(CommandsHarvester.scala:560)
	at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:260)
	at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:249)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
	at com.hortonworks.spark.atlas.sql.CommandsHarvester$.com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities(CommandsHarvester.scala:249)
	at com.hortonworks.spark.atlas.sql.CommandsHarvester$InsertIntoHadoopFsRelationHarvester$.harvest(CommandsHarvester.scala:76)
	at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:138)
	at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:97)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
	at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:97)
	at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:71)
	at com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:97)
	at com.hortonworks.spark.atlas.AbstractEventProcessor$$anon$1.run(AbstractEventProcessor.scala:46)
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://XXXX:8020/user/XXXX/oozie-oozi/0000007-210307120244183-oozie-oozi-W/XXXX--spark/action-data.seq
2021-03-08 09:31:01,011 [main] INFO  org.apache.hadoop.io.compress.CodecPool  - Got brand-new compressor [.deflate]
Stopping AM
2021-03-08 09:31:01,039 [main] INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl  - Waiting for application to be successfully unregistered.
Callback notification attempts left 0
Callback notification trying http://XXXX/oozie/callback?id=0000007-210307120244183-oozie-oozi-W@XXXX&amp;amp;status=SUCCEEDED
Callback notification to http://XXXX/oozie/callback?id=0000007-210307120244183-oozie-oozi-W@XXXX&amp;amp;status=SUCCEEDED succeeded
Callback notification succeeded
2021-03-08 09:31:01,174 [shutdown-hook-0] INFO  org.apache.spark.SparkContext  - Invoking stop() from shutdown hook
2021-03-08 09:31:01,175 [spark-listener-group-shared] INFO  com.hortonworks.spark.atlas.SparkAtlasEventTracker  - Receiving application end event - shutting down SAC
2021-03-08 09:31:01,179 [spark-listener-group-shared] INFO  com.hortonworks.spark.atlas.SparkAtlasEventTracker  - Done shutting down SAC
2021-03-08 09:31:01,810 [dispatcher-event-loop-4] INFO  org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint  - Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.30.10.77:33544) with ID 3&lt;/LI-CODE&gt;&lt;P&gt;I can see that this Jar exists in the parcel on the Oozie host.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;$ pwd
/opt/cloudera/parcels/CDH/jars

$ ls -l spark*kafka*
-rw-r--r--. 1 root root 538452 Oct  6 07:14 spark-sql-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar
-rw-r--r--. 1 root root 216594 Oct  6 07:15 spark-streaming-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar&lt;/LI-CODE&gt;&lt;P&gt;Looking through the rest of the log I can see that Oozie loads the Spark Streaming Kafka jar, but not the SQL one.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2021-03-08 09:30:42,979 [main] INFO  org.apache.spark.deploy.yarn.Client  - Source and destination file systems are the same. Not copying hdfs://XXXX:8020/user/oozie/share/lib/lib_20210221143618/spark/spark-streaming-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar&lt;/LI-CODE&gt;&lt;P&gt;How can I fix this?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 09:06:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312555#M225186</guid>
      <dc:creator>hindmasj</dc:creator>
      <dc:date>2026-04-21T09:06:25Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312559#M225187</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Have you included the jar in the spark-submit/spark-shell command as below (comma-separated for multiple jars)?&lt;BR /&gt;$ bin/spark-submit --jars &amp;lt;spark-streaming-kafka-0-8-assembly.jar&amp;gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Mar 2021 08:18:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312559#M225187</guid>
      <dc:creator>Nandinin</dc:creator>
      <dc:date>2021-03-08T08:18:49Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312560#M225188</link>
      <description>&lt;P&gt;Hi Nandinin,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is running inside Oozie as a Spark action, so it is not using a spark-submit command.&lt;/P&gt;&lt;P&gt;I am not running the Atlas hook as part of my application; Oozie is running it itself as part of the teardown at the end of the task. So I would expect Oozie to be in control of its own environment.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Steve&lt;/P&gt;</description>
      <pubDate>Mon, 08 Mar 2021 08:26:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312560#M225188</guid>
      <dc:creator>hindmasj</dc:creator>
      <dc:date>2021-03-08T08:26:48Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312644#M225214</link>
      <description>&lt;P&gt;Hello&amp;nbsp;hindmasj,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Could you please check whether the Atlas service is enabled in the Spark service?&lt;/P&gt;&lt;P&gt;Spark --&amp;gt; Configuration --&amp;gt;&amp;nbsp;Atlas Service&amp;nbsp;&lt;/P&gt;&lt;P&gt;If the Atlas service is enabled, then Spark internally requires the&amp;nbsp;spark-sql-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar file.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Copy the jar to HDFS and reference it in the Spark action in Oozie:&lt;/P&gt;&lt;P&gt;&amp;lt;jar&amp;gt;hdfs://host/path/to/spark-sql-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar&amp;lt;/jar&amp;gt;&lt;/P&gt;&lt;P&gt;If the Atlas service is not required, please disable it and run the Spark job.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Mar 2021 06:58:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312644#M225214</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2021-03-09T06:58:20Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312677#M225220</link>
      <description>&lt;P&gt;Hi Ranga,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I checked and the Atlas service is configured for Spark. I think we need it for Ranger? I thought about your suggestion of adding the jar to the workflow, but if that were the case we would need to add it to every workflow, as using the Atlas hook is an administrative decision, not a developer one.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Anyway, I did some reading around and saw that Oozie builds a shared library in oozie/share/lib/lib_&amp;lt;date&amp;gt;, and that all of the Spark jars are under there in a spark directory. The sql-kafka jar is not in there. I tried to rebuild the library with Oozie -&amp;gt; Actions -&amp;gt; Install Oozie Share Lib, but the new library did not contain the required file either.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So I found the right jar in parcels/CDH/jars and copied it to the library. That still did not work, and I noticed that the Spark job was still being built with the old library. At this point I should have restarted Oozie, but I did not know that and Cloudera Manager did not indicate it was required. So I deleted the old library, and then the job was built with the new library. This, however, was a big mistake, as the new job could not register a Spark listener. The error was:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Caused by: org.apache.atlas.AtlasException: Failed to load application properties
	at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:147)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It seems that the recreated share lib did not include the file atlas-application.properties. It took me a few hours and various restarts of services and redeployments of client configs before I discovered this was the root cause. I manually added the file to the share lib and restarted Oozie again. After this I could run my job, and the error about the Atlas hook failing was gone.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I had any recommendations, they would be:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. If you enable the Spark Atlas hook, then Oozie should include the Atlas properties file in the share lib. In fact it should probably do this by default.&lt;BR /&gt;2. Likewise, the share lib should include the spark-sql-kafka jar.&lt;BR /&gt;3. When you rebuild the share lib, Cloudera Manager should flag Oozie as requiring a restart.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;BR /&gt;Steve&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Mar 2021 10:10:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312677#M225220</guid>
      <dc:creator>hindmasj</dc:creator>
      <dc:date>2021-03-09T10:10:59Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312681#M225224</link>
      <description>&lt;P&gt;Hi Steve,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you enable the Atlas service in Spark, then there will be two flows:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;--&amp;gt; Your Application Flow&amp;nbsp;&lt;/P&gt;&lt;P&gt;Spark&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;--&amp;gt; Write Spark events to Kafka --&amp;gt; HBase --&amp;gt; This HBase data is visualised in the Atlas UI.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please check with your admin team whether the Spark Atlas service is required. If it is not required, then disable it in the Spark UI. Then you will not see any issues in Oozie. Meanwhile, have you tried to submit the same job without Oozie?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Mar 2021 11:43:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312681#M225224</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2021-03-09T11:43:31Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312721#M225239</link>
      <description>&lt;P&gt;Hi Ranga,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am pretty sure it is required as part of our data governance policy but thank you for the tip.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Steve&lt;/P&gt;</description>
      <pubDate>Wed, 10 Mar 2021 06:26:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312721#M225239</guid>
      <dc:creator>hindmasj</dc:creator>
      <dc:date>2021-03-10T06:26:01Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312722#M225240</link>
      <description>&lt;P&gt;Please let me know if any further help is required on this issue.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Mar 2021 06:32:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312722#M225240</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2021-03-10T06:32:14Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka class not found error when running Atlas hook in Oozie</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312723#M225241</link>
      <description>&lt;P&gt;Summarising all of the above, assuming you need the Spark Atlas hook in your system, the solution is as follows.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;You need to add the files &lt;EM&gt;atlas-application.properties&lt;/EM&gt; and &lt;EM&gt;spark-sql-kafka-0-10_2.11-&amp;lt;version&amp;gt;.jar&lt;/EM&gt; to the Oozie shared Spark library.&lt;/LI&gt;&lt;LI&gt;The library is located on HDFS at &lt;EM&gt;&amp;lt;home&amp;gt;/oozie/share/lib/lib_&amp;lt;date&amp;gt;/spark&lt;/EM&gt;.&lt;/LI&gt;&lt;LI&gt;The application properties file can be found in various places, such as &lt;EM&gt;/etc/hive/conf.cloudera.hive/&lt;/EM&gt;.&lt;/LI&gt;&lt;LI&gt;The jar file can be found in the parcels directory in &lt;EM&gt;/opt/cloudera/parcels/CDH/jars/&lt;/EM&gt;.&lt;/LI&gt;&lt;LI&gt;When you copy the files, they need to be owned by &lt;EM&gt;oozie.oozie&lt;/EM&gt; and be world-readable.&lt;/LI&gt;&lt;LI&gt;If you make any changes to the shared library, you must then restart the Oozie server before jobs can find the new files.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Wed, 10 Mar 2021 06:43:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-class-not-found-error-when-running-Atlas-hook-in-Oozie/m-p/312723#M225241</guid>
      <dc:creator>hindmasj</dc:creator>
      <dc:date>2021-03-10T06:43:17Z</dc:date>
    </item>
  </channel>
</rss>

