HDP Cluster IP Configuration

Expert Contributor
Hi all,

Currently we have a multi-node HA cluster with an active NameNode and a standby NameNode. We also have an Ab Initio configuration for data ingestion and reads from HDFS on a separate node where the Ambari server is configured, called the AWS edge node. The NameNodes and DataNodes are configured as Ambari agents.

The scenario is: when there is a failover from the active NameNode to the standby, the Ab Initio software is unable to run hadoop commands from the edge node (where the Ambari server is running). I believe it is not recognizing the failover of the NameNode; the log shows that it keeps trying to connect to the previously active NameNode and fails. My first doubt: the Ambari server node is not part of the Hadoop cluster, yet I was able to run hadoop commands from there; only when a failover occurs am I unable to run hadoop commands, which is why the Ab Initio job also fails. The command below fails because it refers to the previously active NameNode.

    [ambari@eim-edge-node-1]:/home/ambari $ hadoop fs -ls /
    ls: Call From EN1/xxx to xxxxx:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Is there a way to configure an IP on the Hadoop cluster that switches between the primary and secondary NameNodes and acts as a service IP for the cluster? If so, how do we do it?

Appreciate your help / comments.

Error when accessing HDFS after failover (hostnames redacted as xxxxx):

"2016-07-28 12:15:48,347 INFO  ipc.Client (Client.java:handleConnectionFailure(869)) - Retrying connect to server: <xxxxx>:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)2016-07-28 12:15:48,348 WARN  ha.HealthMonitor (HealthMonitor.java:doHealthChecks(211)) - Transport-level exception trying to monitor health of NameNode at <xxxxx>::8020: java.net.ConnectException: Connection refused Call From NN2/<xxxxx>: to <xxxxx>::8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused"

Re: HDP Cluster IP Configuration

@Muthukumar S

As far as I know, NameNode failover/HA generally works on the nameservice concept, not on IP addresses.

    [ambari@eim-edge-node-1]:/home/ambari $ hadoop fs -ls /
    ls: Call From EN1/xxx to xxxxx:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

The NameNode should fail over manually or automatically, depending on how you configured it. From the log above, it seems the primary is down. You should check the reason the other NameNode is not becoming active. Please check the ZKFC logs for more details.
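If HA is configured, you can also query the HA state directly with hdfs haadmin. A minimal sketch, assuming the logical NameNode IDs are nn1 and nn2 (the IDs come from dfs.ha.namenodes.<nameservice> and may differ in your cluster):

    hdfs haadmin -getServiceState nn1    # expect "active" or "standby"
    hdfs haadmin -getServiceState nn2

One ID should report active and the other standby; if neither is active, that confirms the failover itself is stuck.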

I don't think you have the option to configure both NameNodes behind a single failover IP. IP-to-hostname mapping (and vice versa) is usually handled internally by Hadoop.
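For reference, HA-aware clients find the active NameNode through the nameservice defined in hdfs-site.xml and core-site.xml rather than through a floating IP. A minimal sketch of the relevant client-side properties (the nameservice name "mycluster" and the hosts "nn1-host"/"nn2-host" are placeholders, not your actual values):

    <!-- hdfs-site.xml: define the nameservice and its two NameNodes -->
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>nn1-host:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>nn2-host:8020</value>
    </property>
    <!-- proxy provider that tries each NameNode and sticks to the active one -->
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- core-site.xml: clients address the logical nameservice, not a host -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>

Because fs.defaultFS points at the logical nameservice rather than a physical host, the client retries against the other NameNode after a failover, and no IP switching is required.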

Re: HDP Cluster IP Configuration

Expert Contributor

@Sagar Shimpi

Thank you for your comment.

OK, the thing is, the failover itself is happening fine within the cluster. The setup is:

Node 1 - Ambari server

NN1, NN2, DN1, DN2, etc. - Ambari agents

NN1 - Active, NN2 - Standby

In this scenario everything is perfect, and I am able to run hadoop commands from Node 1 and from any node in the cluster.

When the failover happens, i.e. NN1 becomes standby and NN2 becomes active, hadoop commands from Node 1 (i.e. where the Ambari server is running) throw an error like the one above, still trying to access NN1 (now standby). But hadoop commands from the NameNodes and DataNodes work.
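As a quick check of what the edge node itself believes, you could compare its view of the filesystem URI and NameNode list with that of a working cluster node. A minimal sketch using standard hdfs getconf options, run on the edge node:

    hdfs getconf -confKey fs.defaultFS   # should print hdfs://<nameservice>, not a single host
    hdfs getconf -namenodes              # should list both NameNode hosts

If the edge node prints a single NameNode host here while NN1/NN2 print the nameservice, its client configuration is the likely culprit.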

Does anything need to be done so that the Ambari server node is also aware of the failover and hadoop commands work there as well?

Appreciate your views / suggestions.

Re: HDP Cluster IP Configuration

@Muthukumar S

1. Can you please make sure that Node 1 is managed by Ambari and that the HDFS client is installed on that node?

2. Please compare the configs [core-site.xml and hdfs-site.xml] from a working node and the Ambari node to check for any differences.
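One way to compare is a direct diff over SSH. A minimal sketch, assuming passwordless SSH, a bash shell, and the standard HDP client config directory /etc/hadoop/conf (nn1-host is a placeholder for a working node):

    # run on the Ambari/edge node; compares its client configs with NN1's
    diff /etc/hadoop/conf/core-site.xml <(ssh nn1-host cat /etc/hadoop/conf/core-site.xml)
    diff /etc/hadoop/conf/hdfs-site.xml <(ssh nn1-host cat /etc/hadoop/conf/hdfs-site.xml)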

Re: HDP Cluster IP Configuration

Expert Contributor

@Sagar Shimpi

Node 1 does not have any disks for HDFS and is already used only as the Ambari server. Do you mean I need to install an Ambari agent on this node, and the HDFS client as well, even though it is not going to provide space for HDFS? Is it advisable to run both the Ambari server and an Ambari agent on the same node?

From the setup we have, it looks like it works only for a standalone cluster with a single NameNode, which can be managed from a separate node acting as the Ambari server. With HA, does the node need to be part of the cluster and have both the Ambari server and agent installed to serve the purpose?

Correct me if I'm wrong. This is production, so I need to make sure these are the right steps.

Thank you.

Re: HDP Cluster IP Configuration

You got me wrong. I mean that, to access the HDFS filesystem from any node, you need to have the config files in place [e.g. core-site.xml, hdfs-site.xml].

If you install the HDFS client on the Ambari server node, the config files will be copied to that node and you will be able to access the HDFS filesystem from it. [Note: installing the HDFS client does not mean you are storing HDFS data on this node.]

I suspect that in your case the Ambari node is acting as an HDFS client, but the config files [core-site.xml, hdfs-site.xml] on this node are out of date compared to NN1, NN2, or the DNs.

Hence I asked you to cross-check whether there are any differences.
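A stale client config usually shows up in fs.defaultFS. A minimal sketch of the check (the values shown are placeholders for illustration):

    grep -A1 'fs.defaultFS' /etc/hadoop/conf/core-site.xml
    # HA-aware client:      <value>hdfs://mycluster</value>      (logical nameservice)
    # stale/non-HA client:  <value>hdfs://nn1-host:8020</value>  (pinned to one host)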

Let me know if that makes it clear.


Re: HDP Cluster IP Configuration

Expert Contributor

@Sagar Shimpi

It has been quite a long time since my last reply in this thread; sorry for that.

In the first place, my hadoop commands do work from the edge node (where the Ambari server is running) when the cluster is fine.

When a failover happens from, say, NN1 to NN2, the commands start to fail, trying to connect to the previously active NameNode.

As per your reply, the hadoop commands should not have worked in the normal scenario either. I am trying to find where the problem is and what changes I should make so that hadoop commands keep working from the edge node even when there is a failover.

Please let me know a solution. Thanks in advance.