GetHDFS Missing block exception

Super Collaborator

Hi,

I am trying to connect to our Hadoop cluster from a new HDF server that sits outside the cluster. I want to run some simple tests before I move all my flows from the old server onto the new one.

I am running into issues when I try to access a file using GetHDFS. I copied the config files onto the HDF server. Both servers use the same Kerberos KDC, so I am using the same keytabs. The error message from the app log is below.

I was told that all the ports for HDFS, Hive, etc. are open for communication between the two servers. Do I need to change anything in the config files?

2017-04-03 13:14:36,648 ERROR [Timer-Driven Process Thread-4] o.apache.nifi.processors.hadoop.GetHDFS org.apache.nifi.processor.exception.FlowFileAccessException: Failed to import data from org.apache.hadoop.hdfs.client.HdfsDataInputStream@389fa2c0 for StandardFlowFileRecord[uuid=ee9aa74d-21eb-45cd-b6d5-6440c1a95093,claim=,offset=0,name=7412824162032320,size=0] due to org.apache.nifi.processor.exception.FlowFileAccessException: Unable to create ContentClaim due to org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-306710789-172.16.3.5-1445707884245:blk_1075144155_1404007 file=/user/putarapasa/OCA_Nestac_XRef_Old.xlsx
    at org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2690) ~[na:na]
    at org.apache.nifi.processors.hadoop.GetHDFS.processBatchOfFiles(GetHDFS.java:369) [nifi-hdfs-processors-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.processors.hadoop.GetHDFS.onTrigger(GetHDFS.java:315) [nifi-hdfs-processors-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_112]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112]
Caused by: org.apache.nifi.processor.exception.FlowFileAccessException: Unable to create ContentClaim due to org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-306710789-172.16.3.5-1445707884245:blk_1075144155_1404007 file=/user/putarapasa/OCA_Nestac_XRef_Old.xlsx
    at org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2683) ~[na:na]
    ... 14 common frames omitted
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-306710789-172.16.3.5-1445707884245:blk_1075144155_1404007 file=/user/putarapasa/OCA_Nestac_XRef_Old.xlsx
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:984) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:882) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934) ~[hadoop-hdfs-2.7.3.jar:na]
    at java.io.DataInputStream.read(DataInputStream.java:100) ~[na:1.8.0_112]
    at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35) ~[nifi-utils-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:700) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2680) ~[na:na]

4 REPLIES

Re: GetHDFS Missing block exception

Taking HDF out of the picture for a second, are you able to retrieve the file (/user/putarapasa/OCA_Nestac_XRef_Old.xlsx) from HDFS using the command line?

Re: GetHDFS Missing block exception

Super Collaborator

@Bryan Bende

I haven't tried from the command line yet, but the file is accessible from Hue. I will try from the command line.

Re: GetHDFS Missing block exception

OK, the best test would be to set up the HDFS client on one of the HDF servers and then retrieve the file using the command line there. If that doesn't work, it narrows the problem down to something outside of NiFi. If it does work, then we need to think more about why NiFi can't retrieve it :)
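
For reference, the command-line test could be scripted roughly like this. It is only a sketch: it assumes the `kinit` and `hdfs` clients are installed on the HDF server and that the copied config files are on the client's configuration path. The keytab path, principal, and local directory are hypothetical placeholders, not values from this thread.

```python
import shlex
import subprocess
from typing import List


def kinit_command(keytab: str, principal: str) -> List[str]:
    """Build the command that obtains a Kerberos ticket from a keytab."""
    return ["kinit", "-kt", keytab, principal]


def hdfs_get_command(hdfs_path: str, local_dir: str) -> List[str]:
    """Build the command that copies one file from HDFS to a local directory."""
    return ["hdfs", "dfs", "-get", hdfs_path, local_dir]


def run(cmd: List[str]) -> int:
    """Echo the command, run it, and return its exit code."""
    print("+ " + " ".join(shlex.quote(part) for part in cmd))
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Hypothetical keytab and principal; substitute the ones NiFi is configured with.
    keytab = "/etc/security/keytabs/nifi.keytab"
    principal = "nifi@EXAMPLE.COM"
    if run(kinit_command(keytab, principal)) == 0:
        # Same file that GetHDFS failed on in the log above.
        run(hdfs_get_command("/user/putarapasa/OCA_Nestac_XRef_Old.xlsx", "/tmp"))
```

If the `hdfs dfs -get` step fails with the same BlockMissingException, the cause is most likely outside NiFi.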

Re: GetHDFS Missing block exception

Super Collaborator

It looks like only the NameNodes are open for connectivity to and from the HDF server; the DataNodes are not. We are working on a fix and will test again after that.
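
In case it helps anyone else, here is a minimal sketch of how the connectivity could be checked from the HDF server. It only tests whether a TCP connection can be opened; it does not exercise Kerberos or HDFS itself. The hostnames are hypothetical, and the ports (8020 for NameNode RPC, 50010 for DataNode data transfer) are common Hadoop 2.x defaults that I am assuming here; the real values are in core-site.xml and hdfs-site.xml.

```python
import socket
from typing import Dict, Iterable, Tuple

Endpoint = Tuple[str, int]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_endpoints(endpoints: Iterable[Endpoint], timeout: float = 3.0) -> Dict[Endpoint, bool]:
    """Check each (host, port) pair and map it to its reachability."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}


if __name__ == "__main__":
    # Hypothetical hosts; ports are assumed defaults, not taken from this cluster.
    endpoints = [
        ("namenode1.example.com", 8020),
        ("datanode1.example.com", 50010),
        ("datanode2.example.com", 50010),
    ]
    for (host, port), ok in check_endpoints(endpoints).items():
        print(f"{host}:{port} {'reachable' if ok else 'NOT reachable'}")
```

A NameNode that is reachable while every DataNode is not would match the symptom above: the client can look up block locations but cannot read the block itself.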
