Created 11-17-2015 11:37 PM
ipc.server.tcpnodelay defaults to true as of Hadoop 2.6. We are on Hadoop 2.4 and would like to set it to true. Which services, if any, need a restart for this change? Can it instead be set at the job level for all jobs, without restarting services?
On a big cluster where a NameNode restart takes more than 60 minutes, we would like to avoid every restart we can.
Created 11-18-2015 06:27 PM
ipc.server.tcpnodelay controls use of Nagle's algorithm on any server component that makes use of Hadoop's common RPC framework. That means that full deployment of a change in this setting would require a restart of any component that uses that common RPC framework. That's a broad set of components, including all HDFS, YARN and MapReduce daemons. It probably also includes other components in the wider ecosystem.
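To make the socket-level effect concrete, here is a minimal sketch in plain Java, not Hadoop code. It assumes the server side applies the flag to each accepted connection through the standard `Socket.setTcpNoDelay` call; the class `NoDelayDemo` and the helper `acceptWithNoDelay` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NoDelayDemo {
    // Hypothetical helper: accepts one loopback connection, applies the
    // TCP_NODELAY flag to the server-side socket, and reports what the
    // socket says the option is set to afterwards.
    static boolean acceptWithNoDelay(boolean tcpNoDelay) throws IOException {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             // The connection is queued in the listen backlog until accept().
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // true disables Nagle's algorithm on this connection only.
            accepted.setTcpNoDelay(tcpNoDelay);
            return accepted.getTcpNoDelay();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY on accepted socket: "
            + acceptWithNoDelay(true));
    }
}
```

The option is a per-socket property that the process hosting the server sets itself, which is consistent with the point above: a daemon that is already running keeps whatever value it read at startup.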
Created 11-17-2015 11:51 PM
An HDFS restart is required after the change for the new configuration to take effect permanently. To change the parameter as part of a job, this should work:
export HADOOP_OPTS="-Dipc.server.tcpnodelay=true"
Created 11-18-2015 12:44 AM
Thanks. Will this change help with all TCP/IP communication, or only with certain communication such as the MapReduce shuffle?
Created 11-18-2015 01:42 PM
@Ravi This value only affects processes that are started with this specific option, or that are started after the value is set. For example, if the NameNode is already running, it is probably using the existing value, and that won't change for the running process.
Created 11-18-2015 01:49 AM
Please see this JIRA: https://issues.apache.org/jira/browse/HADOOP-8069
I believe the property description explains it:
<name>ipc.server.tcpnodelay</name>
<value>true</value>
<description>Turn on/off Nagle's algorithm for the TCP socket connection on the server. Setting to true disables the algorithm and may decrease latency with a cost of more/smaller packets.</description>