Member since 04-26-2016 · 19 Posts · 6 Kudos Received · 0 Solutions
11-03-2017
11:49 AM
1 Kudo
I have a Java application that appends recorded video to an HDFS file. Occasionally, after writing a batch of video frames, closing the FSDataOutputStream fails with the following error:

Unable to close file because the last block does not have enough number of replicas

In that case I sleep for 100 ms and retry, and the close succeeds. However, the next time I try to open the file for append, I get this error:

Failed to APPEND_FILE /PSG/20171102.idx for DFSClient_NONMAPREDUCE_1265824578_479 on 192.168.3.224 because DFSClient_NONMAPREDUCE_1265824578_479 is already the current lease holder.

What is the proper way of handling a failed close attempt? Any ideas on how to handle such situations? Thanks, David
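The sleep-and-retry you describe can be factored into a bounded retry helper. This is a minimal sketch, not a Hadoop API: `closeWithRetries` is a hypothetical helper name, and the `Closeable` here stands in for your FSDataOutputStream. The "not enough replicas" condition usually clears once the DataNodes report the last block, so a short backoff between a few attempts is the common pattern:

```java
import java.io.Closeable;
import java.io.IOException;

public class RetryClose {

    // Hypothetical helper: retry close() a bounded number of times with a
    // fixed backoff, rethrowing the last IOException if all attempts fail.
    static int closeWithRetries(Closeable stream, int maxAttempts, long backoffMs)
            throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                stream.close();
                return attempt;            // number of attempts it took
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw last;                        // give up after maxAttempts
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an FSDataOutputStream whose close fails twice,
        // then succeeds (simulating the replica-count error clearing).
        final int[] calls = {0};
        Closeable flaky = () -> {
            if (++calls[0] < 3) {
                throw new IOException(
                    "last block does not have enough number of replicas");
            }
        };
        int attempts = closeWithRetries(flaky, 5, 10);
        System.out.println(attempts); // prints 3
    }
}
```

On the lease error: if close ultimately fails, the writing client still holds the file's lease. When a later append comes from a *different* DFSClient, calling `DistributedFileSystem.recoverLease(path)` before reopening is one way to force lease recovery; if the append comes from the *same* cached client (as your "already the current lease holder" message suggests), making sure the previous stream is fully closed before reopening for append is usually the first thing to check. Verify both against your Hadoop version's behavior.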
Labels: Apache Hadoop
10-09-2017
04:11 PM
Yes, actually it's 20s. I believe it comes from the ipc.client.connect.timeout default. I am trying to see if I can set it to 2s. The main problem seems to be that every FileSystem object I create always tries server 1 first, which is down. It doesn't remember that server 1 was down the last time it tried, and keeps retrying it for each new FileSystem object. I am also trying to cache my own FileSystem object so that, if I reuse an object that has already failed over to the second server, I won't incur the same 2s delay of first trying to connect to the failed server.
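If shortening the per-attempt timeout works for you, it is a client-side setting in core-site.xml. A sketch, assuming the Hadoop defaults I'm aware of (20000 ms timeout, 45 retries on timeout); verify the names and defaults against your version's core-default.xml:

```xml
<!-- core-site.xml (client side) -->
<!-- shorten the per-attempt connect timeout (default 20000 ms) -->
<property>
  <name>ipc.client.connect.timeout</name>
  <value>2000</value>
</property>
<!-- fewer connect retries on timeout before giving up on a NameNode -->
<property>
  <name>ipc.client.connect.max.retries.on.timeouts</name>
  <value>3</value>
</property>
```

Lowering both bounds how long the client spends on the dead NameNode before failing over; the retry count matters as much as the timeout, since the total delay is roughly their product.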
10-06-2017
03:59 PM
dfs.client.retry.policy.enabled is set to false
10-06-2017
03:42 PM
I have a cluster with two NameNodes configured for HA. For failover testing, we purposely turned off NameNode 1. However, when trying to check an HDFS file size from server 2, the HDFS client call still attempts to connect to NameNode 1 first. This causes a 20s delay while the connection times out before the client tries NameNode 2. I've tried setting the dfs.ha.namenodes.xxx property to change the search order, but without success: it always tries NameNode 1 first and only goes to NameNode 2 after 20s. This is causing unacceptable delays in our system, which needs faster response times than waiting 20s to connect to the correct NameNode. Does anyone know how I can rectify this problem? Thanks, David
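For reference, this is the client-side HA configuration in play. A sketch, where "mycluster", "nn1", and "nn2" are placeholders for your nameservice and NameNode IDs; as I understand it, the default ConfiguredFailoverProxyProvider tries the NameNodes in the order they are listed:

```xml
<!-- hdfs-site.xml (client side); "mycluster", "nn1", "nn2" are placeholders -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <!-- the failover proxy provider tries these in listed order -->
  <value>nn2,nn1</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Note that reordering alone cannot eliminate the delay when the first-listed NameNode happens to be the dead one; it usually needs to be combined with a shorter ipc.client.connect.timeout and a lower retry count so the failed connection attempt gives up quickly.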
Labels: Apache Hadoop