
NiFi - PutHDFS - DataStreamer Could not get block locations


Hi all,

 

I am having trouble getting a NiFi cluster to write to a kerberized Hadoop cluster.

Kerberos authentication via keytab works fine; however, PutHDFS cannot place files on the storage.

 

  • Hadoop 2.6.0-cdh5.16.2
  • NiFi 1.12.1
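
For context, the Kerberos-related properties on the PutHDFS processor are configured roughly like this (the principal, keytab path, and configuration file locations shown here are placeholders, not my real values):

    Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
    Kerberos Principal             : nifi@EXAMPLE.COM
    Kerberos Keytab                : /opt/nifi/conf/nifi.keytab
    Directory                      : /path/in/hdfs
    Conflict Resolution Strategy   : replace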

Permissions are set correctly, as I am able to access this HDFS path manually, and PutHDFS does not fail during authentication either. The logs are below. Is there any fix for this?
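
By "manual access" I mean roughly the following kind of standalone Hadoop client write, run from a NiFi node with the same keytab and client configuration; it succeeds without errors (the principal, keytab path, and file name below are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

import java.nio.charset.StandardCharsets;

public class HdfsKerberosWriteCheck {
    public static void main(String[] args) throws Exception {
        // Load the same client configuration files that PutHDFS is pointed at
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

        // Authenticate with the same principal and keytab that NiFi uses
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("nifi@EXAMPLE.COM", "/opt/nifi/conf/nifi.keytab");

        // Write a small test file into the target directory
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/path/in/hdfs/put-test.txt"))) {
            out.write("connectivity test".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Write succeeded");
    }
}

Since a write like this goes through, I believe the keytab, permissions, and basic connectivity are fine; only PutHDFS fails.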

 

Exception java.io.IOException: Unable to create new block.

java.io.IOException: Could not get block locations. Source file "/path/in/hdfs" - aborting...block==null

 

 

2021-01-19 17:40:02,735 INFO [Thread-2023] org.apache.hadoop.hdfs.DataStreamer Exception in createBlockOutputStream blk_2498633572_1425307774
java.io.EOFException: null
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
        at org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:203)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:193)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1731)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2021-01-19 17:40:02,735 WARN [Thread-2023] org.apache.hadoop.hdfs.DataStreamer Abandoning BP-X:blk_2498633572_1425307774
2021-01-19 17:40:02,946 WARN [Thread-2023] org.apache.hadoop.hdfs.DataStreamer Excluding datanode DatanodeInfoWithStorage[X:50010,DS-6fcb90be-f108-4f35-bf1d-ac69f7bdc5f0,DISK]
2021-01-19 17:40:02,947 WARN [Thread-2023] org.apache.hadoop.hdfs.DataStreamer DataStreamer Exception
java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1694)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2021-01-19 17:40:02,947 WARN [Thread-2023] org.apache.hadoop.hdfs.DataStreamer Could not get block locations. Source file "/path/in/hdfs" - Aborting...block==null
2021-01-19 17:40:03,159 ERROR [Timer-Driven Process Thread-2] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=5bcc38be-a9b5-1ee2-943f-fa891ce9efdc] Failed to write to HDFS due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=5bcc38be-a9b5-1ee2-943f-fa891ce9efdc]: java.io.IOException: Could not get block locations. Source file "/path/in/hdfs" - Aborting...block==null: org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=5bcc38be-a9b5-1ee2-943f-fa891ce9efdc]: java.io.IOException: Could not get block locations. Source file "/path/in/hdfs" - Aborting...block==null
org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=5bcc38be-a9b5-1ee2-943f-fa891ce9efdc]: java.io.IOException: Could not get block locations. Source file "/path/in/hdfs" - Aborting...block==null
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2347)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2292)
        at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:320)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
        at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:250)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not get block locations. Source file "/path/in/hdfs" - Aborting...block==null
        at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1477)
        at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)