Problem: The HDFS client does not switch to the backup Namenode immediately when the active Namenode goes down. It switches only after 15 minutes, even though the backup Namenode becomes active immediately.
If only the active Namenode process goes down, and not the server itself, the HDFS client switches immediately.
The solution we gave:
Solution: The ipc.client.ping parameter in core-site.xml is true by default.
Once the HDFS client has established a connection to the Namenode and the server goes down, the client keeps trying to read bytes from the socket it established with the server, which leads to a SocketTimeoutException.
The socket timeout is set to 1 minute, so after 1 minute we get a SocketTimeoutException.
When ipc.client.ping is set to true, the HDFS client then sends a ping to the server and, without waiting for a ping reply, tries to read from the socket again. A SocketTimeoutException occurs again, the client pings again, and so on.
Since this client is idle, it is stopped only after 15 minutes. The HDFS client then gets a new connection through HA to the new active Namenode.
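To make the timing concrete, here is a minimal sketch of the loop described above as a plain timing model. It is not Hadoop's client code: the class PingLoopModel, the method secondsStuck, and its parameters are hypothetical names. It assumes the 1 minute socket timeout and 15 minute cutoff mentioned above, and that with the ping disabled the first timeout is passed straight to the caller.

```java
// Simplified timing model of the behaviour described above.
// This is NOT Hadoop's client code; all names here are hypothetical.
class PingLoopModel {

    /**
     * Returns how many seconds the client stays stuck on the dead
     * connection before it gives up and is free to fail over.
     */
    static int secondsStuck(boolean pingEnabled, int socketTimeoutSec, int idleCutoffSec) {
        int elapsed = 0;
        while (true) {
            // The read blocks until the socket timeout fires.
            elapsed += socketTimeoutSec;
            if (!pingEnabled) {
                // No ping: the first timeout is surfaced to the caller.
                return elapsed;
            }
            if (elapsed >= idleCutoffSec) {
                // Ping enabled: the connection is dropped only at the cutoff.
                return elapsed;
            }
            // Ping enabled: send a ping, do not wait for a reply, read again.
        }
    }

    public static void main(String[] args) {
        System.out.println("ping=true : " + secondsStuck(true, 60, 900) + " s");
        System.out.println("ping=false: " + secondsStuck(false, 60, 900) + " s");
    }
}
```

With these numbers the model reports 900 seconds when the ping is enabled and 60 seconds when it is disabled, so under this assumption "immediately" means after the first socket timeout.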
So we set ipc.client.ping to false, so that the HDFS client switches to the new Namenode immediately without pinging.
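For reference, the change would look roughly like the following fragment in the client's core-site.xml. The property name and file are the ones named above; whether other IPC properties also need tuning is not covered here.

```xml
<!-- core-site.xml (client side): disable the IPC client ping -->
<property>
  <name>ipc.client.ping</name>
  <value>false</value>
</property>
```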
Will changing ipc.client.ping have any other impacts?