HDP Services Port Collision

When restarting our HDP cluster today, one of the YARN NodeManagers was unable to start. After logging in to the node and investigating, it turned out to be a simple port collision: the HMaster process belonging to Ambari Metrics had grabbed port 45454, which blocked the NodeManager from starting.

I'm not exactly sure how to reproduce this, as we have restarted our cluster dozens of times without incident, but it might be straightforward to have the NodeManager retry on a different port if the configured one is in use.
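
To illustrate the retry idea, here is a minimal Python sketch of a server that falls back to the next port when its preferred one is taken. This is not how the NodeManager behaves today, and `bind_with_fallback` and `max_attempts` are hypothetical names used only for this example.

```python
import socket


def bind_with_fallback(host, preferred_port, max_attempts=10):
    """Bind to preferred_port, or to the next free port above it.

    Returns (listening_socket, port). Raises OSError if every candidate
    port in the window is already taken.
    """
    last_error = None
    upper = min(preferred_port + max_attempts, 65536)  # stay within valid ports
    for port in range(preferred_port, upper):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock, port
        except OSError as err:  # typically EADDRINUSE, "Address already in use"
            sock.close()
            last_error = err
    raise OSError(
        f"no free port in {preferred_port}..{upper - 1}"
    ) from last_error


if __name__ == "__main__":
    sock, port = bind_with_fallback("0.0.0.0", 45454)
    print("listening on port", port)
    sock.close()
```

A fallback like this only helps if everything that talks to the service learns the port at runtime rather than from static configuration, which may be why a fixed port is used in the first place.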

Here are the steps I used to fix the problem:

  1. I determined what the issue was with the NodeManager. Here is the stack trace:
    Problem binding to [0.0.0.0:45454] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
    org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:45454] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
      at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
      at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
      at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:414)
      at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
      at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:302)
      at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
      
  2. I then determined the process that was using the port: org.apache.hadoop.hbase.master.HMaster
  3. Within Ambari, I then stopped the Ambari Metrics service.
  4. I then successfully started the NodeManager.
  5. I restarted the Ambari Metrics service.
  6. Happy cluster!
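
For step 2, tools such as `netstat -tlnp` or `lsof -i :45454` will name the process holding a port. As a rough sketch of the same lookup done by hand on Linux, the Python below reads `/proc/net/tcp` to find listening sockets on the port and then scans `/proc/<pid>/fd` for the matching socket inode. The function names are hypothetical, it assumes a Linux `/proc` filesystem, and it needs root to see processes owned by other users.

```python
import os

LISTEN_STATE = "0A"  # value of the "st" column for a listening TCP socket


def listening_inodes(proc_net_tcp_text, port):
    """Return socket inodes listening on `port`, parsed from /proc/net/tcp text."""
    inodes = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        # local_address looks like "00000000:B18E" (hex address, hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == LISTEN_STATE:
            inodes.add(fields[9])
    return inodes


def pids_holding(inodes):
    """Map pid -> command line for processes that have one of the inodes open."""
    if not inodes:
        return {}
    targets = {f"socket:[{inode}]" for inode in inodes}
    owners = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = f"/proc/{pid}/fd"
        try:
            links = [os.readlink(os.path.join(fd_dir, fd)) for fd in os.listdir(fd_dir)]
            if targets.intersection(links):
                with open(f"/proc/{pid}/cmdline", "rb") as f:
                    raw = f.read()
                owners[int(pid)] = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        except OSError:  # process exited, or we lack permission to inspect it
            continue
    return owners


def who_listens_on(port):
    """Return {pid: command line} for processes listening on a TCP port."""
    inodes = set()
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(table) as f:
                inodes |= listening_inodes(f.read(), port)
        except FileNotFoundError:
            pass  # table not present on this system
    return pids_holding(inodes)


if __name__ == "__main__":
    for pid, cmd in who_listens_on(45454).items():
        print(pid, cmd)
```

In our case the command line for the reported pid would have contained org.apache.hadoop.hbase.master.HMaster.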

Here are the versions of the offending software we are using:

Stack HDP-2.5
Name: HDP-2.5.3.0
Version: 2.5.3.0-37
YARN: 2.7.3
Ambari Metrics: 0.1.0
