Created 06-02-2016 09:56 PM
When I run "hdfs dfs -cat file1" from the command line, I get an exception saying "Cannot obtain block length for LocatedBlock". How should we handle this case?
Created 06-02-2016 10:09 PM
Usually when you see "Cannot obtain block length for LocatedBlock", it means the file is still in the being-written state, i.e., it has not been closed yet, and the reader cannot determine its current length by communicating with the corresponding DataNodes. There are multiple possibilities here, e.g., there may be a temporary network connection issue between the reader and the DataNodes, or the original write failed a while ago and the under-construction replicas are somehow missing.
In general, you can run the fsck command to get more information about the file. You can also trigger lease recovery for further debugging. Run this command:
hdfs debug recoverLease -path <path-of-the-file> -retries <retry times>
This command asks the NameNode to try to recover the lease for the file. Based on the NameNode log, you can then track down the specific DataNodes and understand the state of the replicas. The command may successfully close the file if there are still healthy replicas; otherwise, you get more internal detail about the file/block state.
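If you would rather trigger lease recovery from application code instead of the CLI, here is a minimal sketch using the public DistributedFileSystem.recoverLease() API (essentially what the debug command calls). It assumes the HDFS client configuration (core-site.xml/hdfs-site.xml) is on the classpath; the path is a placeholder.

// Minimal sketch: ask the NameNode to recover the lease on a stuck file.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverLeaseExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path file = new Path("/path/of/the/file");  // placeholder path

        FileSystem fs = FileSystem.get(conf);
        if (!(fs instanceof DistributedFileSystem)) {
            throw new IllegalStateException("Not an HDFS filesystem: " + fs.getUri());
        }
        DistributedFileSystem dfs = (DistributedFileSystem) fs;

        // true means the file is already closed (recovery complete);
        // false means recovery was started but has not finished yet.
        boolean closed = dfs.recoverLease(file);
        System.out.println("recoverLease returned " + closed);
    }
}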
Created 06-02-2016 10:04 PM
You can try running "hdfs fsck <path>/file1" to check whether there is corruption. The next step would be to try to recover the blocks or, if they are no longer needed, remove them.
Created 06-27-2020 07:24 AM
"Cannot obtain block length for LocatedBlock" error comes because of file is still in being-written state. Run the fsck command to get more information about the error file.
$ hdfs fsck -blocks /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz
Connecting to namenode via ugi=hdfs&blocks=1&path=%2Fuser%2Fbdas%2Fwarehouse%2Fqueue%2Finput%2Fapollo_log%2Fds%3D20200626%2Fhr%3D21%2FDN27.Apollo.1593219600494.txt.gz
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.40.29.101 for path /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz at Sat Jun 27 08:33:46 EDT 2020
Status: HEALTHY
Total size: 0 B (Total open files size: 21599 B)
Total dirs: 0
Total files: 0
Total symlinks: 0 (Files currently being written: 1)
Total blocks (validated): 0 (Total open file blocks (not validated): 1)
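The report above counts the file under "Files currently being written", i.e., it is still open. If you prefer to check this from code, DistributedFileSystem.isFileClosed() gives the same information. A minimal sketch, assuming the client configuration is on the classpath:

// Minimal sketch: check whether a file is still open for write.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class IsFileClosedExample {
    public static void main(String[] args) throws Exception {
        DistributedFileSystem dfs =
            (DistributedFileSystem) FileSystem.get(new Configuration());
        Path file = new Path("/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz");
        // false here corresponds to fsck reporting the file as OPENFORWRITE.
        System.out.println("closed = " + dfs.isFileClosed(file));
    }
}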
We can run fsck on the full directory to check the status of all the files:
$ hdfs fsck /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/ -files -openforwrite
Connecting to namenode via ugi=hdfs&files=1&openforwrite=1&path=%2Fuser%2Fbdas%2Fwarehouse%2Fqueue%2Finput%2Fapollo_log%2Fds%3D20200626%2Fhr%3D21
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.47.27.101 for path /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21 at Sat Jun 27 08:47:32 EDT 2020
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21 <dir>
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz 21599 bytes, 1 block(s), OPENFORWRITE: OK
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593220237244.txt.gz 20661944 bytes, 1 block(s): OK
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593220269292.txt.gz 20857646 bytes, 1 block(s): OK
The above output shows that only one file has the issue, so run the below command to recover its lease:
$ hdfs debug recoverLease -path /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz -retries 3
Once it succeeds, we verify again with the fsck command:
$ hdfs fsck /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/ -files -openforwrite
Connecting to namenode via ugi=hdfs&files=1&openforwrite=1&path=%2Fuser%2Fbdas%2Fwarehouse%2Fqueue%2Finput%2Fapollo_log%2Fds%3D20200626%2Fhr%3D21
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.40.29.101 for path /user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21 at Sat Jun 27 08:49:09 EDT 2020
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21 <dir>
/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz 3528409 bytes, 1 block(s): OK
Now the error is resolved.
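As an extra sanity check after recovery, you can also read the file end to end from a client. A minimal sketch, with the same classpath assumption as above (the gzip content is read as raw bytes here):

// Minimal sketch: read the recovered file end to end and count the bytes.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadCheck {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/bdas/warehouse/queue/input/apollo_log/ds=20200626/hr=21/DN27.Apollo.1593219600494.txt.gz");
        long total = 0;
        byte[] buf = new byte[8192];
        try (FSDataInputStream in = fs.open(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        // Should match the length reported by fsck after recovery.
        System.out.println("read " + total + " bytes");
    }
}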
Created 09-06-2020 06:49 AM
Thanks for the solution.
The job works fine now.
Created 07-06-2021 02:03 AM
Hi DarveshK,
do you know what to do if hdfs debug recoverLease returns
recoverLease returned false.
Giving up on recoverLease
Does this mean that the file is lost?
Thank you!
Created 07-06-2021 10:26 AM
@diplompils A false result from the recoverLease command does not necessarily mean the file is lost. The file itself is not deleted by a failed lease recovery; it is only gone if it was explicitly deleted, e.g. with the rm command.
You can try the below:
hdfs debug recoverLease -path <file> -retries 10
Or you may check - https://issues.apache.org/jira/browse/HDFS-8576
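If repeated CLI retries still return false, one option is to poll recoverLease() from code with a pause between attempts, since replica recovery on the DataNodes is asynchronous and can take some time. This is only a sketch, not an official recipe; the retry count and sleep interval are arbitrary choices:

// Minimal sketch: poll recoverLease() until the file closes or we give up.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverLeaseRetry {
    public static void main(String[] args) throws Exception {
        DistributedFileSystem dfs =
            (DistributedFileSystem) FileSystem.get(new Configuration());
        Path file = new Path(args[0]);  // file to recover, passed on the command line

        // recoverLease returns true once the file is closed; retry with a
        // pause in between, because block recovery happens in the background.
        for (int attempt = 1; attempt <= 10; attempt++) {
            if (dfs.recoverLease(file)) {
                System.out.println("Lease recovered on attempt " + attempt);
                return;
            }
            Thread.sleep(5000L);  // wait before re-checking
        }
        System.out.println("Giving up; inspect the NameNode/DataNode logs (see HDFS-8576).");
    }
}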