java.net.BindException: Cannot assign requested address
Labels: HDFS
Created 07-24-2018 03:37 AM
Hello,
We are seeing the error below for some job failures:
+++++++
INFO - java.net.BindException: Problem binding to [hostname/IP:0] java.net.BindException: Cannot assign requested address;
+++++++
As per the Apache wiki:
++++++++++
If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem. Hostname confusion and network setup are the likely causes.
++++++++++
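To see the port-0 behaviour the wiki describes, here is a minimal Java sketch (my own illustration, not taken from the failing job): binding with port 0 makes the OS hand out any free ephemeral port, which is why port-in-use and privileged-port problems are ruled out.

import java.net.InetSocketAddress;
import java.net.Socket;

public class AnyFreePort {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket()) {
            // Port 0 asks the OS for any currently free ephemeral port,
            // much like the client-side bind shown in the error above.
            s.bind(new InetSocketAddress(0));
            System.out.println("OS assigned port " + s.getLocalPort());
        }
    }
}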
The workflow job scheduler's hostname is the one mentioned in the error above, and this happens during an HDFS command execution step. Any idea why this is happening?
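One quick check for the hostname-confusion case the wiki mentions: does the scheduler host's name resolve to an address the machine actually owns? A minimal sketch (the host name below is a placeholder for the one in the error):

import java.net.InetAddress;
import java.net.NetworkInterface;

public class HostnameCheck {
    public static void main(String[] args) throws Exception {
        String host = "scheduler_hostname"; // placeholder: use the host from the error
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());
        // If this prints false on the machine reporting the error, any bind
        // to that address will fail with "Cannot assign requested address".
        System.out.println("Owned by a local interface: "
                + (NetworkInterface.getByInetAddress(addr) != null));
    }
}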
Created 07-24-2018 03:40 AM
INFO - 2018-07-24 05:16:24,511 INFO [main] retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(148)) - Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over namenode_host/IP after 1 fail over attempts. Trying to fail over immediately.
INFO - java.net.BindException: Problem binding to [scheduler_hostname/IP:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
INFO - at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
INFO - at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
INFO - at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
INFO - at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
INFO - at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:720)
INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1476)
INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1409)
INFO - at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
INFO - at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
INFO - at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
INFO - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
INFO - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
INFO - at java.lang.reflect.Method.invoke(Method.java:606)
INFO - at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
INFO - at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
INFO - at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
INFO - at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
INFO - at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
INFO - at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
INFO - at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
INFO - at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
INFO - at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1417)
INFO - Caused by: java.net.BindException: Cannot assign requested address
INFO - at sun.nio.ch.Net.connect0(Native Method)
INFO - at sun.nio.ch.Net.connect(Net.java:465)
INFO - at sun.nio.ch.Net.connect(Net.java:457)
INFO - at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
INFO - at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
INFO - at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
INFO - at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
INFO - at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
INFO - at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
INFO - at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
INFO - at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1448)
INFO - ... 17 more
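For reference, the "Cannot assign requested address" at the bottom of the trace is what the JVM raises when a socket is asked to use a local address the machine does not own. A minimal repro sketch (192.0.2.1 is a reserved TEST-NET-1 address that no local interface should carry):

import java.net.InetSocketAddress;
import java.net.Socket;

public class BindFailureRepro {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket()) {
            // No local NIC owns 192.0.2.1 (TEST-NET-1, reserved for
            // documentation), so this bind fails with
            // java.net.BindException: Cannot assign requested address
            s.bind(new InetSocketAddress("192.0.2.1", 0));
        }
    }
}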
Created 07-29-2018 07:54 PM
A common cause is misuse of client software: either excessive connections are created because the code does not use a shared connection pool, or connections leak because they are never closed. It can also stem from lower-level problems with socket closure, such as the FIN stage of TCP not being processed correctly, which causes the OS to hold the port open for an extended period while it waits for the final close to complete.
Are you perhaps executing a lot of concurrent programs on your cluster, or using a multi-threaded app that builds a new network client (for HDFS, etc.) in each thread?
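If it is the per-thread-client pattern, note that Hadoop's FileSystem.get() already caches and shares one instance per (URI, configuration, user), so threads can reuse a single client rather than each constructing their own. A rough sketch of the intended usage:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SharedFsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileSystem.get() returns a cached instance shared by all callers
        // with the same URI/conf/user, so threads reuse one client instead
        // of each opening a fresh connection.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/tmp")));
        // Avoid calling fs.close() from individual threads: it closes the
        // cached instance for every other user of it as well.
    }
}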
When you experience this, you could run an lsof check on the host of the failing task to find which PID(s) occupy most of the client-side ephemeral ports, and whether there is a pattern to their destinations. That can help pinpoint where the problem lies and which of the above categories it falls into.
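One way to script that check (a sketch assuming lsof is installed; -nP keeps hosts and ports numeric, -iTCP restricts the output to TCP sockets):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class LsofTally {
    public static void main(String[] args) throws Exception {
        // Equivalent of running `lsof -nP -iTCP` and tallying by process.
        Process p = new ProcessBuilder("lsof", "-nP", "-iTCP").start();
        Map<String, Integer> countByPid = new HashMap<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            r.readLine(); // skip the header row
            for (String line; (line = r.readLine()) != null; ) {
                String[] f = line.trim().split("\\s+");
                if (f.length > 1) {
                    // f[0]=COMMAND, f[1]=PID: count sockets per process
                    countByPid.merge(f[0] + "/" + f[1], 1, Integer::sum);
                }
            }
        }
        // Print the ten busiest COMMAND/PID pairs.
        countByPid.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}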
