We have an Ambari cluster with two Spark Thrift servers.
The first Thrift server, on the master-node1 machine, always fails with "Address already in use".
We get the following error in the Thrift server log (under /var/log/spark2):
19/03/08 08:42:59 ERROR ThriftCLIService: Error starting HiveServer2: could not start ThriftBinaryCLIService
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10016.
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:109)
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:91)
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:87)
    at org.apache.hive.service.auth.HiveAuthFactory.getServerSocket(HiveAuthFactory.java:241)
    at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:66)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use (Bind failed)
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
    at java.net.ServerSocket.bind(ServerSocket.java:375)
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:106)
    ... 5 more
The default port for the Spark Thrift server is 10016,
so we ran netstat to find out which process is using the port:
netstat -tulpn | grep 10016
This returns nothing, which means no application is using port 10016.
So we do not understand how the log can say "Address already in use" when no application is using the port.
For comparison, this is what we get on the good node (master-node2):
# netstat -tulpn | grep 10016
tcp6 0 0 :::10016 :::* LISTEN 26092/java
# ps -ef | grep 26092
hive 26092 1 6 07:14 ? 00:01:34 /usr/jdk64/jdk1.8.0_112/bin/java -Dhdp.version=18.104.22.168-91 ........
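One thing worth ruling out on the failing node (a possibility only, not something confirmed in this thread): `netstat -tulpn` lists listening sockets only, because of the `-l` flag, so a socket in any other state that uses 10016 as its local port would not show up in the check above. The sketch below filters lines in the same column layout as the netstat output shown for master-node2 and reports every TCP socket on a given local port, whatever its state. The function name `sockets_on_port` and the second and third sample lines are invented for illustration.

```python
def sockets_on_port(netstat_lines, port):
    """Return (state, owner, local_address) for each TCP socket whose local port matches."""
    matches = []
    for line in netstat_lines:
        fields = line.split()
        # Expected layout: proto recv-q send-q local foreign state [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local = fields[3]
        # The port is whatever follows the last ':' (works for IPv4 and ':::10016').
        local_port = local.rsplit(":", 1)[-1]
        if local_port == str(port):
            owner = fields[6] if len(fields) > 6 else "-"
            matches.append((fields[5], owner, local))
    return matches


# First line is taken from the thread; the other two are hypothetical examples.
SAMPLE = [
    "tcp6 0 0 :::10016 :::* LISTEN 26092/java",
    "tcp 0 0 10.0.0.5:10016 10.0.0.9:8020 ESTABLISHED 4242/java",
    "tcp 0 0 10.0.0.5:22 10.0.0.7:51016 ESTABLISHED 900/sshd",
]

if __name__ == "__main__":
    for state, owner, local in sockets_on_port(SAMPLE, 10016):
        print(state, owner, local)
```

To use it on a real node, feed it the lines of `netstat -tanp` (all TCP sockets, numeric, with the owning program) instead of `SAMPLE`.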
On the problematic node where you are getting the BindException, can you please try running the following command to see whether it is able to bind the port?
The following command starts a Netcat process that tries to bind port 10016:
# nc -l 10016
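If Netcat is not installed on the node, roughly the same bind test can be sketched in Python. This is a minimal illustration, not part of Ambari or Spark; the helper name `can_bind` is made up here, and it only tries an IPv4 bind, whereas `nc -l` may bind the IPv6 wildcard.

```python
import errno
import socket


def can_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to host:port; return (True, None) or (False, reason)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True, None
    except OSError as exc:
        # On Linux, EADDRINUSE is the OS error that Java reports as BindException.
        reason = errno.errorcode.get(exc.errno, str(exc))
        return False, reason
    finally:
        sock.close()


if __name__ == "__main__":
    ok, reason = can_bind(10016)
    if ok:
        print("port 10016 can be bound right now")
    else:
        print("port 10016 cannot be bound:", reason)
```

Running it while the Thrift server is stopped shows whether something else still holds the port at that moment.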
Also, please check the "/etc/hosts" file on the problematic host: does it have multiple hostnames defined for the same host by any chance?
If yes, can you change it? Or change the Spark Thrift server setting to use a specific address instead of "0.0.0.0" and see whether that works.
@Jay, for now we have started the Thrift server and it is up, but it will go down soon; this happens after about 1 hour or a little more.
For now we get:
nc -l 10016
Ncat: bind to :::10016: Address already in use. QUITTING.
But after some time the Thrift server shows as down in Ambari, and then the nc command no longer gives the above result.
Anyway, you said: "Or change the spark thrift server setting to use specific address instead of "0.0.0.0" and then see if it works".
Do you mean changing the default port from 10016 to another one, 10055 for example?
0.0.0.0 isn't a real address, or am I missing something?
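On the 0.0.0.0 question: it is the "unspecified" or wildcard address, meaning "listen on every interface of this host", so switching to a specific address is a different change from switching the port. A minimal Python sketch of the distinction follows; the helper names `is_wildcard` and `bound_address` are invented for illustration, and the sketch says nothing about how the Thrift server itself is configured.

```python
import ipaddress
import socket


def is_wildcard(address):
    """True for the unspecified address (0.0.0.0 or ::), i.e. 'all interfaces'."""
    return ipaddress.ip_address(address).is_unspecified


def bound_address(host):
    """Bind a throwaway TCP socket to host on an OS-chosen port and report the bound IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 asks the OS for any free port
        return sock.getsockname()[0]


if __name__ == "__main__":
    for host in ("0.0.0.0", "127.0.0.1"):
        print(host, "wildcard" if is_wildcard(host) else "specific", bound_address(host))
```

For the Thrift server, the Hive properties `hive.server2.thrift.bind.host` (address) and `hive.server2.thrift.port` (port) are separate settings; where exactly they are exposed in the Ambari Spark2 configuration should be checked on your cluster.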