PutHDFS processor fails to write to kerberised and TLS/SSL enabled HDFS

Explorer

Hello,

I get the error below in my NiFi logs when I try to write a file to HDFS with PutHDFS. Writing a file from within the Hadoop cluster works fine; only the write from NiFi fails.

The NiFi service is started by root. From a local NiFi instance I can write the file to HDFS, whereas from the NiFi cluster I cannot. Any help would be highly appreciated. I am trying another solution, and if it works I will post it here.

2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.c.s.StandardProcessScheduler Starting LogMessage[id=b4bc6d2c-0173-1000-0000-00002905a41b]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.controller.StandardProcessorNode Starting LogMessage[id=b4bc6d2c-0173-1000-0000-00002905a41b]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.c.s.StandardProcessScheduler Starting LogMessage[id=b4bd264b-0173-1000-0000-000018f91304]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.controller.StandardProcessorNode Starting LogMessage[id=b4bd264b-0173-1000-0000-000018f91304]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.c.s.StandardProcessScheduler Starting GetFile[id=b4d14ae8-0173-1000-ffff-ffffe680a6a0]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.controller.StandardProcessorNode Starting GetFile[id=b4d14ae8-0173-1000-ffff-ffffe680a6a0]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.c.s.StandardProcessScheduler Starting PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8]
2020-08-10 13:41:59,519 INFO [NiFi Web Server-32056] o.a.n.controller.StandardProcessorNode Starting PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8]
2020-08-10 13:41:59,519 INFO [Timer-Driven Process Thread-6] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled GetFile[id=b4d14ae8-0173-1000-ffff-ffffe680a6a0] to run with 1 threads
2020-08-10 13:41:59,519 INFO [Timer-Driven Process Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled LogMessage[id=b4bc6d2c-0173-1000-0000-00002905a41b] to run with 1 threads
2020-08-10 13:41:59,519 INFO [Timer-Driven Process Thread-5] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled LogMessage[id=b4bd264b-0173-1000-0000-000018f91304] to run with 1 threads
2020-08-10 13:41:59,543 INFO [Timer-Driven Process Thread-10] o.a.hadoop.security.UserGroupInformation Login successful for user abc@UX.xyzCORP.NET using keytab file /home/abc/confFiles/abc.keytab
2020-08-10 13:41:59,544 INFO [Timer-Driven Process Thread-10] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8] to run with 1 threads
2020-08-10 13:41:59,595 INFO [Thread-9481] o.a.h.h.p.d.sasl.SaslDataTransferClient SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-10 13:41:59,599 INFO [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Exception in createBlockOutputStream blk_1075334640_1594409
java.io.EOFException: null
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:203)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:193)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1731)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-10 13:41:59,599 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Abandoning BP-1824237254-0.00.64.55-1545405130172:blk_1075334640_1594409
2020-08-10 13:41:59,601 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Excluding datanode DatanodeInfoWithStorage[0.00.64.57:50010,DS-d6f56418-6e18-4317-a8ec-4a5b15757728,DISK]
2020-08-10 13:41:59,605 INFO [Thread-9481] o.a.h.h.p.d.sasl.SaslDataTransferClient SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-10 13:41:59,606 INFO [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Exception in createBlockOutputStream blk_1075334641_1594410
java.io.EOFException: null
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:203)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:193)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1731)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-10 13:41:59,606 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Abandoning BP-1824237254-0.00.64.55-1545405130172:blk_1075334641_1594410
2020-08-10 13:41:59,608 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Excluding datanode DatanodeInfoWithStorage[0.00.64.56:50010,DS-286b28e8-d035-4b8c-a2dd-aabb08666234,DISK]
2020-08-10 13:41:59,612 INFO [Thread-9481] o.a.h.h.p.d.sasl.SaslDataTransferClient SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-10 13:41:59,612 INFO [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Exception in createBlockOutputStream blk_1075334642_1594411
java.io.EOFException: null
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:203)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:193)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1731)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-10 13:41:59,612 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Abandoning BP-1824237254-0.00.64.55-1545405130172:blk_1075334642_1594411
2020-08-10 13:41:59,614 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Excluding datanode DatanodeInfoWithStorage[0.00.64.58:50010,DS-53536364-33f4-40d6-85c2-508abf7ff023,DISK]
2020-08-10 13:41:59,618 INFO [Thread-9481] o.a.h.h.p.d.sasl.SaslDataTransferClient SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-10 13:41:59,619 INFO [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Exception in createBlockOutputStream blk_1075334643_1594412
java.io.EOFException: null
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:203)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:193)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1731)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-10 13:41:59,619 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Abandoning BP-1824237254-0.00.64.55-1545405130172:blk_1075334643_1594412
2020-08-10 13:41:59,621 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Excluding datanode DatanodeInfoWithStorage[0.00.64.84:50010,DS-abba7d97-925a-4299-af86-b58fef9aaa12,DISK]
2020-08-10 13:41:59,621 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer DataStreamer Exception
java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1694)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-10 13:41:59,621 WARN [Thread-9481] org.apache.hadoop.hdfs.DataStreamer Could not get block locations. Source file "/user/abc/puthdfs_test/.test.txt" - Aborting...block==null
2020-08-10 13:41:59,626 ERROR [Timer-Driven Process Thread-2] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8] Failed to write to HDFS due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8]: java.io.IOException: Could not get block locations. Source file "/user/abc/puthdfs_test/.test.txt" - Aborting...block==null: org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8]: java.io.IOException: Could not get block locations. Source file "/user/abc/puthdfs_test/.test.txt" - Aborting...block==null
org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=4d34342b-2901-125d-917f-567e466964c8]: java.io.IOException: Could not get block locations. Source file "/user/abc/puthdfs_test/.test.txt" - Aborting...block==null
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2347)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2292)
        at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:320)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
        at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:250)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1176)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not get block locations. Source file "/user/abc/puthdfs_test/.test.txt" - Aborting...block==null

 

  

1 ACCEPTED SOLUTION

Explorer

Hi,

NiFi v1.11.4 uses Hadoop client v3.2.1. There is a known issue with that client that causes an EOFException when connecting to a Cloudera cluster based on Hadoop v2.x: https://issues.apache.org/jira/browse/HDFS-15191

Reverting to NiFi v1.11.3, which is based on Hadoop client v3.2.0, avoids the issue.

The latest release, NiFi v1.12.0, also uses Hadoop client v3.2.1, so it has the same issue as NiFi v1.11.4.
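
To check whether a NiFi log shows the same symptom as the stack trace in the question, a small script can look for the pattern seen there: a `createBlockOutputStream` failure, followed by `java.io.EOFException`, with `BlockTokenIdentifier.readFieldsLegacy` in the trace. The sketch below is only an illustration; the function name `find_block_token_eof` is made up for this example, and a match suggests, but does not prove, that HDFS-15191 is the cause.

```python
import re

# Matches the DataStreamer line that names the block being created.
BLOCK_LINE = re.compile(r"Exception in createBlockOutputStream (blk_\d+_\d+)")


def find_block_token_eof(log_text):
    """Return the block IDs whose createBlockOutputStream failure is an
    EOFException raised while decoding the block token.

    Hypothetical helper: the pattern is taken from the log in the question.
    """
    hits = []
    current_block = None  # block named by the most recent DataStreamer line
    saw_eof = False       # whether an EOFException followed that line
    for line in log_text.splitlines():
        match = BLOCK_LINE.search(line)
        if match:
            current_block = match.group(1)
            saw_eof = False
        elif current_block and line.startswith("java.io.EOFException"):
            saw_eof = True
        elif (current_block and saw_eof
              and "BlockTokenIdentifier.readFieldsLegacy" in line):
            hits.append(current_block)
            current_block = None  # report each block only once
    return hits
```

Called on the contents of the NiFi application log, it returns one entry per failed block. An empty list means this particular signature was not found, so the write is probably failing for a different reason.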

 

22 REPLIES

Explorer

Go for NiFi v1.11.3 for now.

NiFi v1.12.0 has the same issue.
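
The version advice in the accepted solution can be written down as a small lookup. This is a sketch built only from the versions reported there, not an official compatibility matrix, and the names `BUNDLED_HADOOP_CLIENT` and `is_reported_affected` are invented for the example.

```python
# NiFi version -> Hadoop client version, as reported in the accepted solution.
# Assumption: these pairings are taken on trust from that post.
BUNDLED_HADOOP_CLIENT = {
    "1.11.3": "3.2.0",
    "1.11.4": "3.2.1",
    "1.12.0": "3.2.1",
}

# Hadoop client version the accepted solution associates with HDFS-15191.
AFFECTED_CLIENT = "3.2.1"


def is_reported_affected(nifi_version):
    """Return True or False for versions named in the accepted solution,
    or None when that post does not say which client the version uses."""
    client = BUNDLED_HADOOP_CLIENT.get(nifi_version)
    if client is None:
        return None
    return client == AFFECTED_CLIENT
```

Returning `None` for unlisted versions keeps "not affected" separate from "not covered by the accepted solution".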

Explorer

Hi @Debangshu

It worked with NiFi 1.10.0 and 1.11.3. Thanks, mate, for the resolution.

Thanks

David

Explorer

Hi,

If "root" were the user trying to write the file to HDFS, a privilege error should show up, since the HDFS ACLs sync with Sentry privileges.

Thanks

David