
Best practice for handling client timeout settings with HDFS API



I'm trying to get a firmer grasp of client connection timeouts when using the HDFS API programmatically. We have an application that writes content to files in HDFS from a set of worker threads, and it intermittently gets a TimeoutException during those writes.


The CDH Admin console shows 'good health' for HDFS, so this appears to be something intermittent.


We currently don't set anything explicitly on the Configuration object when obtaining a connection to HDFS.


Looking at


I'm wondering if what we want to look into is here


specifically ipc.client.connect.timeout and the like.
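
To make the question concrete, here is a minimal sketch of the kind of override I have in mind. The helper class `HdfsTimeoutOverrides` and the numeric values are hypothetical and only for illustration. As far as I can tell, the property names come from core-default.xml and hdfs-default.xml, but whether these are the right ones to change is exactly what I'm unsure about.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): collects client-side timeout
// overrides as plain key/value pairs so they can be applied to a
// Hadoop Configuration in one place.
final class HdfsTimeoutOverrides {

    // Builds the override map. Timeouts are in milliseconds.
    static Map<String, String> build(int connectTimeoutMs,
                                     int connectRetriesOnTimeout,
                                     int socketTimeoutMs) {
        if (connectTimeoutMs <= 0 || socketTimeoutMs <= 0 || connectRetriesOnTimeout < 0) {
            throw new IllegalArgumentException("timeouts must be positive, retries non-negative");
        }
        Map<String, String> overrides = new LinkedHashMap<>();
        // How long the IPC client waits for a connection to be established.
        overrides.put("ipc.client.connect.timeout", Integer.toString(connectTimeoutMs));
        // How many times the IPC client retries after a connect timeout.
        overrides.put("ipc.client.connect.max.retries.on.timeouts",
                Integer.toString(connectRetriesOnTimeout));
        // Socket timeout used by the DFS client.
        overrides.put("dfs.client.socket-timeout", Integer.toString(socketTimeoutMs));
        return overrides;
    }

    public static void main(String[] args) {
        // Illustrative values only, not recommendations.
        Map<String, String> overrides = build(30_000, 45, 120_000);

        // With Hadoop on the classpath, the overrides would be applied like this:
        //   Configuration conf = new Configuration();
        //   overrides.forEach(conf::set);
        //   FileSystem fs = FileSystem.get(conf);
        overrides.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the overrides in one map means every worker thread would obtain its FileSystem from the same adjusted Configuration, rather than each one setting properties separately.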


Any recommendations on which properties we should set, and how far to raise them above the defaults, to avoid these intermittent 'lags' of HDFS?