03-09-2021
10:43 PM
Summarising all of the above: assuming you need the Spark Atlas hook in your system, the solution is as follows. You need to add the files atlas-application.properties and spark-sql-kafka-0-10_2.11-<version>.jar to the Oozie shared Spark library, which is located on HDFS at <home>/oozie/share/lib/lib_<date>/spark. The properties file can be found in various places, such as /etc/hive/conf.cloudera.hive/, and the jar can be found in the parcels directory at /opt/cloudera/parcels/CDH/jars/. When you copy the files they need to be owned by oozie:oozie and be world readable. If you make any changes to the shared library you must then restart the Oozie server before jobs can find the new files.
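As a concrete illustration, a minimal shell sketch of those steps might look like the following. It assumes the default sharelib location under /user/oozie/share/lib described above, and should be run as a user with write access to it (e.g. oozie or an HDFS superuser); adjust the source paths to your install.

```bash
# Find the newest sharelib directory on HDFS (the lib_<date> suffix varies per install)
SHARELIB=$(hdfs dfs -ls /user/oozie/share/lib | awk '{print $NF}' | grep 'lib_' | sort | tail -1)

# Copy the missing jar from the local parcel and the Atlas properties from the Hive client config
hdfs dfs -put /opt/cloudera/parcels/CDH/jars/spark-sql-kafka-0-10_2.11-*.jar "$SHARELIB/spark/"
hdfs dfs -put /etc/hive/conf.cloudera.hive/atlas-application.properties "$SHARELIB/spark/"

# Oozie expects sharelib files to be owned by oozie:oozie and world readable
hdfs dfs -chown oozie:oozie "$SHARELIB"/spark/spark-sql-kafka-0-10_2.11-*.jar "$SHARELIB/spark/atlas-application.properties"
hdfs dfs -chmod 644 "$SHARELIB"/spark/spark-sql-kafka-0-10_2.11-*.jar "$SHARELIB/spark/atlas-application.properties"

# Finally, restart the Oozie server (e.g. from Cloudera Manager) so jobs pick up the new files
```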
03-09-2021
10:26 PM
Hi Ranga,

I am pretty sure it is required as part of our data governance policy, but thank you for the tip.

Regards,
Steve
03-09-2021
02:10 AM
Hi Ranga,

I checked, and the Atlas service is configured for Spark; I think we need it for Ranger. I thought about your suggestion of adding it to the workflow, but in that case we would need to add it to every workflow, as using the Atlas hook is an administrative decision, not a developer one.

Anyway, I did some reading around and saw that Oozie builds a shared library in oozie/share/lib/lib_<date>, and that all of the Spark jars live in a spark directory under there. The sql-kafka jar is not in there. I tried to rebuild the library with Oozie -> Actions -> Install Oozie Share Lib, but the new library did not contain the required file either. So I found the right jar in parcels/CDH/jars and copied that to the library. That still did not work, and I noticed that the Spark job was still being built with the old library. At this point I should have restarted Oozie, but I did not know that, and Cloudera Manager did not indicate it was required. So I deleted the old library, and then the job was built with the new library. This however was a big mistake, as the new job could not register a Spark listener. The error was:

Caused by: org.apache.atlas.AtlasException: Failed to load application properties
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:147)

It seems that the recreation of the share lib did not include the file atlas-application.properties. It took me a few hours and various restarts of services and redeployments of client configs before I discovered this was the root cause. I manually added the file to the share lib and restarted Oozie again. After this I could run my job, and the error about the Atlas hook failing was gone.

If I had any recommendations they would be:
1. If you enable the Spark Atlas hook, then Oozie should include the Atlas properties file in the share lib. In fact it should probably do this by default.
2. Likewise, the share lib should include the spark-sql-kafka jar.
3. When you rebuild the share lib, Oozie should be flagged by Cloudera Manager as requiring a restart.

Regards,
Steve
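P.S. For anyone debugging the same thing: the Oozie client can report which sharelib the running server is actually serving, which would have saved me some guesswork. A rough sketch, assuming the oozie CLI is available (the server URL below is a placeholder):

```bash
# Placeholder Oozie server URL
export OOZIE_URL=http://oozie-host.example.com:11000/oozie

# List what the running server has in its spark sharelib
oozie admin -shareliblist spark | grep -i kafka

# Ask the server to re-scan the sharelib after copying new files in;
# in my case a full Oozie restart was still needed on top of this
oozie admin -sharelibupdate
```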
03-08-2021
12:26 AM
Hi Nanindin,

This is running inside Oozie as a Spark action, so it is not using spark-submit. I am not running the Atlas hook as part of my application; Oozie is running it itself as part of the teardown at the end of the task. So I would expect Oozie to be in control of its own environment.

Regards,
Steve
03-07-2021
11:25 PM
I am running a Spark action on Oozie in a Cloudera 7.1.4 cluster. The action itself completes successfully, but in the logs there is a stack trace showing the Atlas hook failed. It looks like the application is looking for the Spark SQL Kafka 0.10 jar.

<<< Invocation of Main class completed <<<
2021-03-08 09:31:00,968 [SparkExecutionPlanProcessor-thread] WARN com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor - Caught exception during parsing event
java.lang.ClassNotFoundException: org.apache.spark.sql.kafka010.KafkaRelation
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:398)
at com.hortonworks.spark.atlas.utils.ReflectionHelper$.classForName(ReflectionHelper.scala:113)
at org.apache.spark.sql.kafka010.atlas.ExtractFromDataSource$.isKafkaRelation(ExtractFromDataSource.scala:187)
at com.hortonworks.spark.atlas.sql.CommandsHarvester$KafkaEntities$.unapply(CommandsHarvester.scala:560)
at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:260)
at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:249)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at com.hortonworks.spark.atlas.sql.CommandsHarvester$.com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities(CommandsHarvester.scala:249)
at com.hortonworks.spark.atlas.sql.CommandsHarvester$InsertIntoHadoopFsRelationHarvester$.harvest(CommandsHarvester.scala:76)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:138)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:97)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:97)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:71)
at com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:97)
at com.hortonworks.spark.atlas.AbstractEventProcessor$$anon$1.run(AbstractEventProcessor.scala:46)
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://XXXX:8020/user/XXXX/oozie-oozi/0000007-210307120244183-oozie-oozi-W/XXXX--spark/action-data.seq
2021-03-08 09:31:01,011 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate]
Stopping AM
2021-03-08 09:31:01,039 [main] INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Waiting for application to be successfully unregistered.
Callback notification attempts left 0
Callback notification trying http://XXXX/oozie/callback?id=0000007-210307120244183-oozie-oozi-W@XXXX&status=SUCCEEDED
Callback notification to http://XXXX/oozie/callback?id=0000007-210307120244183-oozie-oozi-W@XXXX&status=SUCCEEDED succeeded
Callback notification succeeded
2021-03-08 09:31:01,174 [shutdown-hook-0] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
2021-03-08 09:31:01,175 [spark-listener-group-shared] INFO com.hortonworks.spark.atlas.SparkAtlasEventTracker - Receiving application end event - shutting down SAC
2021-03-08 09:31:01,179 [spark-listener-group-shared] INFO com.hortonworks.spark.atlas.SparkAtlasEventTracker - Done shutting down SAC
2021-03-08 09:31:01,810 [dispatcher-event-loop-4] INFO org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint - Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.30.10.77:33544) with ID 3

I can see that this jar exists in the parcel on the Oozie host.

$ pwd
/opt/cloudera/parcels/CDH/jars
$ ls -l spark*kafka*
-rw-r--r--. 1 root root 538452 Oct 6 07:14 spark-sql-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar
-rw-r--r--. 1 root root 216594 Oct 6 07:15 spark-streaming-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar

Looking through the rest of the log I can see that Oozie loads the Spark Streaming Kafka jar, but not the SQL one.

2021-03-08 09:30:42,979 [main] INFO org.apache.spark.deploy.yarn.Client - Source and destination file systems are the same. Not copying hdfs://XXXX:8020/user/oozie/share/lib/lib_20210221143618/spark/spark-streaming-kafka-0-10_2.11-2.4.0.7.1.4.0-203.jar

How can I fix this?
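For reference, a quick way to confirm the gap is to compare what the parcel ships with what the sharelib actually contains; a sketch (the sharelib path is taken from the "Not copying" log line above):

```bash
# Jars shipped in the parcel on the Oozie host
ls /opt/cloudera/parcels/CDH/jars/ | grep 'spark.*kafka'

# Jars actually present in the Oozie spark sharelib on HDFS
hdfs dfs -ls /user/oozie/share/lib/lib_20210221143618/spark | grep -i kafka
```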
01-24-2021
01:33 AM
I eventually found the answer in this document: https://docs.cloudera.com/runtime/7.2.6/kafka-securing/topics/kafka-secure-kerberos-enable.html

The steps you need are:

1: Create a jaas.conf file to describe how you will kerberise. Either interactively with kinit:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true;
};

or non-interactively with a keytab:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/mykafkaclient.keytab"
  principal="mykafkaclient/clients.hostname.com@EXAMPLE.COM";
};

2: Create a client properties file to describe how you will authenticate. Either with TLS:

security.protocol=SASL_SSL
sasl.kerberos.service.name=kafka
ssl.truststore.location=<path to jks file>
ssl.truststore.password=<password for truststore>

or without:

security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka

3: Create the environment variable KAFKA_OPTS to contain the JVM parameter:

export KAFKA_OPTS="-Djava.security.auth.login.config=<path to jaas.conf>"

Then you can run the tool by referencing the Kafka brokers and the client config:

BOOTSTRAP=<kafka brokers URL>
kafka-topics --bootstrap-server $BOOTSTRAP --command-config client.properties --list

You will also need a Ranger policy that covers what you are trying to do.
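Putting the ticket-cache, non-TLS variant together end to end looks roughly like this. It is only a sketch: the principal, broker hostname, and file paths are placeholders, not values from my cluster.

```bash
# Obtain a Kerberos ticket first (principal is a placeholder)
kinit myuser@EXAMPLE.COM

# Point the Kafka command line tools at the jaas.conf from step 1
export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/jaas.conf"

# client.properties contains the SASL_PLAINTEXT settings from step 2
kafka-topics --bootstrap-server broker1.example.com:9092 \
  --command-config /path/to/client.properties \
  --list
```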
01-12-2021
01:08 AM
I have done the following:
1. Created a new group that is visible to Ranger and added my user to it.
2. Added that group to an existing Kafka-Ranger policy that has full permissions on the Kafka cluster.
3. Kinit'd my user when logged onto the Kafka broker.
4. Ran the script again.

I still get the same error message.
01-10-2021
03:44 AM
I am trying to run what in my old cluster was a simple command:

kafka-topics --bootstrap-server mybroker:9092 --list

Now, with a CDP7 cluster and Ranger installed, I get the following error. What essential thing am I missing here? Do I need a certain Ranger policy? Does this user have to be Kerberized? Is it something else? I am trying to reuse old management scripts to create my topics, but they all rely on getting the kafka-topics script to work. I have turned TLS off for the brokers.

21/01/10 13:47:49 INFO utils.Log4jControllerRegistration$: Registered kafka:type=kafka.Log4jController MBean
21/01/10 13:47:49 INFO admin.AdminClientConfig: AdminClientConfig values:
bootstrap.servers = [mybroker:9092]
client.dns.lookup = default
client.id =
connections.max.idle.ms = 300000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 120000
retries = 5
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
21/01/10 13:47:49 INFO utils.AppInfoParser: Kafka version: 2.4.1.7.1.4.0-203
21/01/10 13:47:49 INFO utils.AppInfoParser: Kafka commitId: 79e841231b59b25d
21/01/10 13:47:49 INFO utils.AppInfoParser: Kafka startTimeMs: 1610275669932
21/01/10 13:47:52 INFO internals.AdminMetadataManager: [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.DisconnectException: Cancelled fetchMetadata request with correlation id 11 due to node -1 being disconnected
...
Error while executing topic command : org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
21/01/10 13:49:49 ERROR admin.TopicCommand$: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at kafka.admin.TopicCommand$AdminClientTopicService.getTopics(TopicCommand.scala:313)
at kafka.admin.TopicCommand$AdminClientTopicService.listTopics(TopicCommand.scala:249)
at kafka.admin.TopicCommand$.main(TopicCommand.scala:65)
at kafka.admin.TopicCommand.main(TopicCommand.scala)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
21/01/10 13:49:49 INFO internals.AdminMetadataManager: [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited.
09-24-2016
08:14 AM
Thanks. I followed the solution at Docker Installation On Fedora to install Docker's own yum repo. This allowed me to upgrade to 1.12, and the clusterdock solution worked after that.
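For anyone following along, the repo-based install the linked guide describes looks roughly like this. Treat it as a sketch: the repo URL below is Docker's current Fedora repository, and package names have changed over time (around the 1.12 era the package was named docker-engine rather than docker-ce).

```bash
# Install the plugin that provides "dnf config-manager"
sudo dnf -y install dnf-plugins-core

# Add Docker's own Fedora repository (current URL; an assumption here)
sudo dnf config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo

# Install and start Docker from that repository
sudo dnf -y install docker-ce
sudo systemctl enable --now docker

# Confirm the upgrade took effect
docker --version
```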
09-24-2016
04:19 AM
Even after pulling the image manually, the script tries to pull it again and gets the same error. I am not sure what part of the script is causing this or how to override it.