NiFi data from Kafka to HDFS

New Contributor

I use GetKafka -> UpdateAttribute -> PutHDFS, and the Conflict Resolution Strategy on PutHDFS is set to append.

However, something goes wrong once the file in HDFS reaches about 20 MB.

The log indicates a problem with the data blocks.

I also added a MergeContent processor before PutHDFS, but the problem persists.

I also found that data is sometimes lost when the error occurs, instead of being routed to the failure relationship.

Any help is much appreciated!! Thank you.

2 Replies

Rising Star

@marson chu Can you post the log details? That would be helpful.

New Contributor

2017-03-22 19:31:59,706 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@7bc3c59f checkpointed with 14 Records and 0 Swap Files in 7 milliseconds (Stop-the-world time = 1 milliseconds, Clear Edit Logs time = 0 millis), max Transaction ID 55

2017-03-22 19:32:03,351 INFO [Thread-51849] org.apache.hadoop.hdfs.DFSClient Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1343) [hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184) [hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) [hadoop-hdfs-2.7.3.jar:na]

2017-03-22 19:32:03,352 WARN [Thread-51849] org.apache.hadoop.hdfs.DFSClient Error Recovery for block BP-1541383466-192.168.78.84-1489658920621:blk_1073772950_293619 in pipeline DatanodeInfoWithStorage[192.168.78.84:50010,DS-c9f30077-6122-48c1-bd02-9226498edacd,DISK], DatanodeInfoWithStorage[192.168.78.87:50010,DS-410c9e77-803d-43ad-ae83-b38d50842f96,DISK], DatanodeInfoWithStorage[192.168.78.86:50010,DS-5f1f2258-1e87-4f52-ac46-a0fece7c24bb,DISK]: bad datanode DatanodeInfoWithStorage[192.168.78.84:50010,DS-c9f30077-6122-48c1-bd02-9226498edacd,DISK]

2017-03-22 19:32:03,390 INFO [Thread-51849] org.apache.hadoop.hdfs.DFSClient Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as 192.168.78.85:50010
    at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1359) [hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184) [hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) [hadoop-hdfs-2.7.3.jar:na]

2017-03-22 19:32:03,390 WARN [Thread-51849] org.apache.hadoop.hdfs.DFSClient Error Recovery for block BP-1541383466-192.168.78.84-1489658920621:blk_1073772950_293619 in pipeline DatanodeInfoWithStorage[192.168.78.87:50010,DS-410c9e77-803d-43ad-ae83-b38d50842f96,DISK], DatanodeInfoWithStorage[192.168.78.86:50010,DS-5f1f2258-1e87-4f52-ac46-a0fece7c24bb,DISK], DatanodeInfoWithStorage[192.168.78.85:50010,DS-f33972da-8d93-4edd-9c14-6a956973b7a2,DISK]: bad datanode DatanodeInfoWithStorage[192.168.78.85:50010,DS-f33972da-8d93-4edd-9c14-6a956973b7a2,DISK]

2017-03-22 19:32:03,392 WARN [Thread-51849] org.apache.hadoop.hdfs.DFSClient DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[192.168.78.87:50010,DS-410c9e77-803d-43ad-ae83-b38d50842f96,DISK], DatanodeInfoWithStorage[192.168.78.86:50010,DS-5f1f2258-1e87-4f52-ac46-a0fece7c24bb,DISK]], original=[DatanodeInfoWithStorage[192.168.78.87:50010,DS-410c9e77-803d-43ad-ae83-b38d50842f96,DISK], DatanodeInfoWithStorage[192.168.78.86:50010,DS-5f1f2258-1e87-4f52-ac46-a0fece7c24bb,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:925) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) ~[hadoop-hdfs-2.7.3.jar:na]

2017-03-22 19:32:03,437 ERROR [Timer-Driven Process Thread-5] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=f5aaac3a-015a-1000-4930-d89685499d91] Failed to write to HDFS due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from PutHDFS[id=f5aaac3a-015a-1000-4930-d89685499d91]: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /user/root/pstest/hdfs-1/322 for DFSClient_NONMAPREDUCE_1319066147_88 for client 192.168.78.87 because current leaseholder is trying to recreate file.
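
For reference, the DataStreamer warning above names the client-side setting involved, 'dfs.client.block.write.replace-datanode-on-failure.policy'. Below is a minimal, illustrative hdfs-site.xml sketch showing where that property and its companion settings would live; in a NiFi flow this would be the hdfs-site.xml referenced by PutHDFS's Hadoop Configuration Resources property. The values shown are the Hadoop defaults, not a recommended fix for this cluster.

<!-- Illustrative sketch only: HDFS client-side pipeline-recovery settings
     named in the warning above, shown with their default values. -->
<configuration>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <!-- Allowed values: DEFAULT, ALWAYS, NEVER -->
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>DEFAULT</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>false</value>
  </property>
</configuration>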
