Call from edge node to xxxx.com:8032 failed on connection exception: java.net.ConnectException: Connection refused

New Contributor

Hi All,

 

While running a MapReduce job, I am getting the following exceptions. Kindly help.

 

DEBUG   on 22 Aug 2019 ,06:36:54 com.xxx.xxx.xxx.logger.XMLRPCLogger.log(XMLRPCLogger.java:76) => java.lang.Thread.run(Thread.java:745) =>  :  19/08/22 06:36:54 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm265

 

DEBUG   on 22 Aug 2019 ,06:36:54 com.xxx.xxx.xxx.logger.XMLRPCLogger.log(XMLRPCLogger.java:76) => java.lang.Thread.run(Thread.java:745) =>  :  19/08/22 06:36:54 INFO retry.RetryInvocationHandler: Exception while invoking getApplicationReport of class ApplicationClientProtocolPBClientImpl over rm265 after 1185 fail over attempts. Trying to fail over after sleeping for 1903ms.

 

DEBUG   on 22 Aug 2019 ,06:58:43 com.xxx.xxxx.archive.logger.XMLRPCLogger.log(XMLRPCLogger.java:76) => java.lang.Thread.run(Thread.java:745) =>  :  java.net.ConnectException: Call From abcd/<ipabcd> to pqrs:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

  

DEBUG   on 22 Aug 2019 ,07:01:28 xxx.xxx.xxx.archive.logger.XMLRPCLogger.log(XMLRPCLogger.java:76) => java.lang.Thread.run(Thread.java:745) =>  :  java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "abcd"; destination host is: "xyz":8032;

 

 

For reference, here are the details of our setup; the error logs are above.

abcd is our edge node

 

We have high availability cluster:

pqrs – name node 1

xyz – name node 2

4 REPLIES 4

Re: call from edgenode to xxxx.com:8032 failed on connection exception: java.net.ConnectionException: Connection refused

Mentor

@vivek_b2 

Is the cluster running fine? If so, does /etc/hosts on the edge node have entries for your name nodes? Can the edge node resolve the IPs of name node 1 and name node 2?

Your issue looks like a connectivity issue. I would start with the usual culprits: firewall, DNS, host entries, etc.
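A quick way to check name resolution from the edge node is sketched below. It is only an illustration: resolve_report is a made-up helper name, and it assumes getent is available, which looks names up through whatever sources /etc/nsswitch.conf lists (usually /etc/hosts and DNS).

```shell
# resolve_report HOST...: for each name, say whether this machine can resolve it.
# getent consults the sources configured in /etc/nsswitch.conf, so it covers
# both /etc/hosts entries and DNS records.
resolve_report() {
    for host in "$@"; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "$host: resolves"
        else
            echo "$host: does not resolve"
        fi
    done
}
```

With the placeholder names from the question, this would be run on the edge node as: resolve_report pqrs xyz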

HTH

Re: call from edgenode to xxxx.com:8032 failed on connection exception: java.net.ConnectionException: Connection refused

New Contributor

@Shelton 

Dear Shelton,

The cluster is running fine. /etc/hosts on the edge node does not have entries for the name nodes.

I will check whether the IPs are resolving. Is the connectivity issue also the reason I am getting the GSS exception?

Re: call from edgenode to xxxx.com:8032 failed on connection exception: java.net.ConnectionException: Connection refused

Super Mentor

@vivek_b2 

We can see this error in your log:

Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "abcd"; destination host is: "xyz":8032;


This can happen for a few reasons, such as an incorrect FQDN/IP mapping or a failure to obtain a valid Kerberos ticket. So it would be good to check a few things. Can you please share the following outputs:

1. Is the ambari-agent running on the edge node? Please check and share the output of the following commands:

# hostname -f
# hostname

If your cluster nodes resolve each other through "/etc/hosts" entries (rather than through a DNS server), then we should usually see the same /etc/hosts mapping across all cluster nodes. Please validate that.

# cat /etc/hosts
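Instead of reading the file by eye, the check could be scripted roughly like this. has_host_entry is a hypothetical helper, not part of any Hadoop or Ambari tooling, and it matches names case-sensitively.

```shell
# has_host_entry FILE NAME: succeed if NAME appears as a host name or alias
# on a non-comment line of FILE (a file laid out like /etc/hosts).
has_host_entry() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

For example, on each node: has_host_entry /etc/hosts pqrs && echo "pqrs is listed"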

Also, from the edge node, are you able to reach the ResourceManager port (8032)?

# telnet xyz 8032
(OR)
# nc -v xyz 8032
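Since the cluster is HA, it may help to probe the port on both hosts. The sketch below assumes an nc build that supports -z (connect without sending data) and -w (timeout in seconds); report_port is a name made up for this example. Keep in mind that "Connection refused" usually means the host was reached but nothing was listening on that port, or a firewall actively rejected the connection.

```shell
# report_port HOST PORT: say whether a TCP connection to HOST:PORT can be opened.
# Assumes nc supports -z (no data sent) and -w (timeout in seconds).
report_port() {
    if nc -z -w 3 "$1" "$2" >/dev/null 2>&1; then
        echo "$1:$2 is accepting connections"
    else
        echo "$1:$2 is not reachable from this node"
    fi
}
```

With the placeholder hosts from the question: report_port pqrs 8032; report_port xyz 8032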

2. From the KDC and the Ambari Server host, are you able to resolve your edge node correctly using its FQDN? Assuming your edge node FQDN is "abcd", can you reach it from the Ambari Server/KDC as follows?

# ping abcd

 

3. Do you see keytabs inside the "/etc/security/keytabs" directory? Are you able to get a valid Kerberos ticket using a keytab?

Example:

# klist -ket /etc/security/keytabs/nm.service.keytab
Keytab name: FILE:/etc/security/keytabs/nm.service.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
2 08/11/2019 01:58:29 nm/ker1latest4.example.com@EXAMPLE.COM (des-cbc-md5) 
2 08/11/2019 01:58:29 nm/ker1latest4.example.com@EXAMPLE.COM (aes256-cts-hmac-sha1-96) 
2 08/11/2019 01:58:29 nm/ker1latest4.example.com@EXAMPLE.COM (des3-cbc-sha1) 
2 08/11/2019 01:58:29 nm/ker1latest4.example.com@EXAMPLE.COM (arcfour-hmac) 
2 08/11/2019 01:58:29 nm/ker1latest4.example.com@EXAMPLE.COM (aes128-cts-hmac-sha1-96)


# kinit -kt /etc/security/keytabs/nm.service.keytab nm/ker1latest4.example.com@EXAMPLE.COM
# klist


In other words, are you able to run kinit and then confirm with "klist" that you have a valid ticket? In your case the principal name may be different, depending on how your edge node is set up.
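If you want to run the ticket check from a script before submitting the job, something along these lines might work. It assumes MIT Kerberos, where "klist -s" prints nothing and reports through its exit status whether a readable, unexpired credentials cache exists. ticket_status and the KLIST override are names invented for this sketch.

```shell
# ticket_status: print "valid" if the current user holds an unexpired ticket,
# otherwise "missing". Relies on the exit status of "klist -s" (MIT Kerberos).
# KLIST is a hypothetical override in case klist lives in a non-standard path.
ticket_status() {
    if "${KLIST:-klist}" -s >/dev/null 2>&1; then
        echo valid
    else
        echo missing
    fi
}
```

A wrapper could then refuse to launch the job, or run kinit first, whenever ticket_status prints "missing".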

Re: call from edgenode to xxxx.com:8032 failed on connection exception: java.net.ConnectionException: Connection refused

Mentor

@vivek_b2 

The "failed on connection exception: java.net.ConnectException: Connection refused" error is a network issue, and the GSS exception is a Kerberos one.
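To see which of the two failures dominates a long client log, the lines can be sorted into buckets. This is only a rough sketch: classify_error is a made-up helper, and the patterns are just the message fragments quoted in this thread.

```shell
# classify_error LINE: print "kerberos", "network", or "other" depending on
# which of the message fragments quoted in this thread the line contains.
classify_error() {
    case "$1" in
        *"GSS initiate failed"*|*"Failed to find any Kerberos tgt"*) echo kerberos ;;
        *"Connection refused"*) echo network ;;
        *) echo other ;;
    esac
}
```

Feeding each line of the job log through classify_error and counting the results with "sort | uniq -c" gives a quick picture of how often each kind of failure occurred.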

Those are two different things. If you are running the job from the edge node, which user is executing the job? Assuming it is a user Dev1 on the edge node, can you validate that this user has a valid Kerberos ticket?

Dev1@localhost $ klist

 

Please share the output of the above command.
As you are executing the code from the edge node, can you verify that the krb5.conf file on the edge node is identical to the one on the KDC server? This file should be exactly the same on both.
Was the Kerberos client installed on the edge node? See the command below:

yum install krb5-workstation
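For the krb5.conf comparison mentioned above, one option is to copy the KDC's file to the edge node and compare the two byte for byte. same_config is a hypothetical helper and /tmp/krb5.conf.kdc is just an example location for the copied file.

```shell
# same_config FILE_A FILE_B: print "identical" if the two files match
# byte for byte, otherwise "different".
same_config() {
    if cmp -s "$1" "$2"; then
        echo identical
    else
        echo different
    fi
}
```

For example: same_config /etc/krb5.conf /tmp/krb5.conf.kdc. If it prints "different", running diff on the same two files shows which lines differ.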

The user running the job should kinit with their keytab to obtain a valid ticket. I recently wrote up a procedure in answer to a similar Kerberos issue on an edge node.

Please check my procedure for creating a user keytab on the edge node so that the user can execute jobs in a kerberized cluster:
https://community.cloudera.com/t5/Support-Questions/HDFS-is-not-accessible-from-an-user-after-kerber...

Hope that helps.