Created 08-31-2016 09:03 PM
For the issue described below: does the data go to trash because the node is unavailable? What could cause this exception in the context of the cluster's recent Kerberos enablement?
Here is the issue an organization is facing with Kafka after recently enabling Kerberos in an HDP 2.4.2 cluster.
They are trying to build a pipeline from a Data Center to HDFS. The data is first mirrored to the cluster using MirrorMaker 0.8, as the Data Center runs Kafka 0.8. The data is then Avro-serialized by a Flume agent and written to HDFS through the Confluent HDFS connector.
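The mirroring leg of a pipeline like this is usually driven by the 0.8 MirrorMaker with an old-consumer config pointing at the source cluster and an old-producer config pointing at the target. A minimal sketch, assuming placeholder hostnames (dc-zk1, hdp-broker1) and the HDP default broker port 6667, none of which appear in the original post:

```shell
# Hedged sketch: Kafka 0.8 MirrorMaker configs (hostnames are assumptions).
# Source side: the Data Center's 0.8 cluster, consumed via its ZooKeeper.
cat > /tmp/mm-consumer.properties <<'EOF'
zookeeper.connect=dc-zk1:2181
group.id=dc-mirror
EOF
# Target side: the HDP cluster's brokers (old 0.8 producer settings).
cat > /tmp/mm-producer.properties <<'EOF'
metadata.broker.list=hdp-broker1:6667
EOF
# The actual run (commented out here; requires a Kafka 0.8 installation):
# kafka-mirror-maker.sh --consumer.config /tmp/mm-consumer.properties \
#   --producer.config /tmp/mm-producer.properties --whitelist '.*'
```

Checking the mirror group's consumer lag per partition is the usual first step when only part of the data appears to cross over.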
However, they notice that MirrorMaker mirrors only about half of the data. Since Kerberos was enabled in their cluster, they have been seeing the following error in the Kafka logs:
[2016-08-29 16:51:28,479] INFO Returning HDFS Filesystem Config: Configuration: core-default.xml, core-site.xml, hdfs-default.xml, hdfs-site.xml (org.apache.ranger.audit.destination.HDFSAuditDestination)
[2016-08-29 16:51:28,496] ERROR Error writing to log file. (org.apache.ranger.audit.provider.BaseAuditHandler)
java.lang.IllegalArgumentException: java.net.UnknownHostException: xyzlphdpd1
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:221)
at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: xyzlphdpd1
... 22 more
[2016-08-29 16:51:28,496] ERROR Error sending logs to consumer. provider=kafka.async.summary.multi_dest.batch, consumer=kafka.async.summary.multi_dest.batch.hdfs (org.apache.ranger.audit.queue.AuditFileSpool)
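An UnknownHostException on a name like xyzlphdpd1 usually means the HDFS client is treating the HA nameservice as a plain hostname, because the hdfs-site.xml with the HA mappings is not on its classpath. A sketch of the properties that file must carry, where the nameservice is taken from the log but the NameNode hostnames (nn1-host, nn2-host) and port are illustrative assumptions:

```xml
<!-- Hedged sketch: HA properties the audit client needs visible.
     "xyzlphdpd1" is from the log; nn1-host/nn2-host:8020 are assumptions. -->
<property><name>dfs.nameservices</name><value>xyzlphdpd1</value></property>
<property><name>dfs.ha.namenodes.xyzlphdpd1</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.xyzlphdpd1.nn1</name><value>nn1-host:8020</value></property>
<property><name>dfs.namenode.rpc-address.xyzlphdpd1.nn2</name><value>nn2-host:8020</value></property>
<property><name>dfs.client.failover.proxy.provider.xyzlphdpd1</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
```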
Created 09-01-2016 06:29 PM
It seems that the data does not go to trash. A simple restart of the Kafka service addressed the issue. Kerberos was enabled recently, and this service was probably not restarted afterward.
The symlink suggestion from deepak is an interesting approach which, while not applicable here, is worth remembering for other situations.
Created 09-01-2016 05:07 PM
It looks like audit to HDFS is enabled for the Ranger Kafka plugin, and the audit to HDFS is failing for Kafka.
If you notice, there is an error at the bottom:
Caused by: java.net.UnknownHostException: xyzlphdpd1... 22 more
As it is an HA cluster, I think there is some issue with the configuration of the Ranger HDFS audit in Kafka.
I remember seeing such an issue in Test Connection on the Ranger side, and symlinking hdfs-site.xml into /etc/ranger/admin/conf solved it. Can you please try the same with the Kafka conf? I mean, symlink hdfs-site.xml into the Kafka conf directory.
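The suggested fix would be a one-line symlink on the broker host, roughly `ln -s /etc/hadoop/conf/hdfs-site.xml /etc/kafka/conf/hdfs-site.xml` (typical HDP paths, assumed rather than confirmed by the thread). A runnable sketch against throwaway paths:

```shell
# Hedged sketch: on a real broker you would link /etc/hadoop/conf/hdfs-site.xml
# into /etc/kafka/conf (assumed HDP paths); demonstrated here in a temp dir.
demo=$(mktemp -d)
mkdir -p "$demo/hadoop-conf" "$demo/kafka-conf"
printf '<configuration/>\n' > "$demo/hadoop-conf/hdfs-site.xml"
# -sfn: symbolic, force-replace any stale link, do not follow an existing one
ln -sfn "$demo/hadoop-conf/hdfs-site.xml" "$demo/kafka-conf/hdfs-site.xml"
ls -l "$demo/kafka-conf/hdfs-site.xml"
```

A symlink (rather than a copy) keeps the Kafka-side file in step with any later changes Ambari makes to the Hadoop client configs.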
Created 09-01-2016 06:18 PM
Thanks. I am aware that audit to HDFS is enabled for the Ranger Kafka plugin and that the audit to HDFS is failing for Kafka; that is how we extracted the exception from the Kafka logs. Let me check whether symlinking hdfs-site.xml into the Kafka conf does it. Stay tuned.
Created 09-01-2016 06:28 PM
Sure, let me know if it works.
Created 09-01-2016 06:32 PM
Crazy enough. I just reached out to this customer, and a simple restart of the Kafka service addressed the issue. Kerberos was enabled recently, and this service was probably not restarted afterward. Not much to learn.
The symlink suggestion from you is an interesting approach which, while not applicable here, is worth remembering for other situations. Thank you for the suggestion.