
Issue with Kafka after Kerberos was enabled: does the data go to trash because the node is unavailable?

Super Guru

Question:

For the issue described below: does the data go to trash because the node is unavailable? What could cause this exception, in the context of the cluster's recent Kerberos enablement?

Issue Description

Here is the issue an organization is facing with Kafka after recently enabling Kerberos in an HDP 2.4.2 cluster.

They are trying to build a pipeline from a data center to HDFS. The data is first mirrored to the cluster using MirrorMaker 0.8, as the data center runs Kafka 0.8. The data is then Avro-serialized by a Flume agent and written to HDFS through the Confluent HDFS connector.
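For reference, mirroring at that version is started with the MirrorMaker tool shipped with Kafka. A minimal sketch of a 0.8-style invocation follows; the property file names and stream count are placeholders, not their actual setup:

    # Hypothetical config names: --consumer.config reads from the data
    # center's 0.8 cluster, --producer.config writes to the HDP cluster.
    bin/kafka-run-class.sh kafka.tools.MirrorMaker \
      --consumer.config dc-source.consumer.properties \
      --producer.config hdp-target.producer.properties \
      --whitelist '.*' \
      --num.streams 2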

However, they notice that MirrorMaker mirrors only about half of the data. Since Kerberos was enabled on their cluster, they have been seeing the following error in the Kafka logs:

[2016-08-29 16:51:28,479] INFO Returning HDFS Filesystem Config: Configuration: core-default.xml, core-site.xml, hdfs-default.xml, hdfs-site.xml (org.apache.ranger.audit.destination.HDFSAuditDestination)
[2016-08-29 16:51:28,496] ERROR Error writing to log file. (org.apache.ranger.audit.provider.BaseAuditHandler)
java.lang.IllegalArgumentException: java.net.UnknownHostException: xyzlphdpd1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:221)
    at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
    at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
    at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
    at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
    at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:356)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
    at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: xyzlphdpd1
    ... 22 more
[2016-08-29 16:51:28,496] ERROR Error sending logs to consumer. provider=kafka.async.summary.multi_dest.batch, consumer=kafka.async.summary.multi_dest.batch.hdfs (org.apache.ranger.audit.queue.AuditFileSpool)

1 ACCEPTED SOLUTION

Super Guru

It seems that the data does not go to trash. A simple restart of the Kafka service addressed the issue. Kerberos was enabled recently, and this service was probably not restarted afterwards.

The symlink suggestion from deepak is an interesting approach which, while not applicable here, is worth remembering for other situations.


4 REPLIES

deepak sharma

It looks like audit to HDFS is enabled for the Ranger Kafka plugin, and that audit to HDFS is failing for Kafka.

If you notice, there is an error at the bottom:

Caused by: java.net.UnknownHostException: xyzlphdpd1 ... 22 more

As it is an HA cluster, I think there is some issue with the configuration of the Ranger HDFS audit on the Kafka side. I remember seeing a similar issue with Test Connection on the Ranger side, and symlinking hdfs-site.xml into /etc/ranger/admin/conf solved it. Can you please try the same with the Kafka conf, i.e., symlink hdfs-site.xml into the Kafka conf directory?
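Something like the following, assuming the default HDP conf locations (a sketch; adjust the paths to your layout):

    # Assumed HDP default paths. Expose the HA-aware hdfs-site.xml to the
    # Kafka broker so the Ranger audit writer can resolve the HDFS
    # nameservice (here, presumably the logical name xyzlphdpd1).
    ln -s /etc/hadoop/conf/hdfs-site.xml /etc/kafka/conf/hdfs-site.xml

    # Sanity check: the nameservice should appear in the output.
    hdfs getconf -confKey dfs.nameservices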

Super Guru

@deepak sharma

Thanks. I am aware that audit to HDFS is enabled for the Ranger Kafka plugin and that the audit to HDFS is failing for Kafka; that is how we extracted the exception from the Kafka logs. Let me check whether symlinking hdfs-site.xml into the Kafka conf does it. Stay tuned.

deepak sharma

Sure, let me know if it works.

Super Guru

@deepak sharma

Crazy enough, I just reached out to this customer, and a simple restart of the Kafka service addressed the issue. Kerberos was enabled recently, and this service was probably not restarted afterwards. Not much to learn.
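For anyone who hits the same thing: on HDP the restart can be done from Ambari, either through the UI or the REST API. A sketch using the REST API, where AMBARI_HOST, CLUSTER, and the admin credentials are placeholders:

    # Stop the Kafka service...
    curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
      -d '{"RequestInfo":{"context":"Stop Kafka"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
      http://AMBARI_HOST:8080/api/v1/clusters/CLUSTER/services/KAFKA

    # ...then start it again.
    curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
      -d '{"RequestInfo":{"context":"Start Kafka"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
      http://AMBARI_HOST:8080/api/v1/clusters/CLUSTER/services/KAFKA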

The symlink suggestion from you is an interesting approach which, while not applicable here, is worth remembering for other situations. Thank you for the suggestion.
