Ambari Server lists too many hosts (caused by hostname / fqdn)

Expert Contributor

I just restarted the nodes of my cluster (HDP 2.6) for the first time. After I ran the commands

ambari-server start
ambari-agent start

the Ambari UI no longer received a heartbeat from any host!

When I call

curl -i -H "X-Requested-By: ambari" -u admin:mypassword -X GET http://localhost:8080/api/v1/hosts

I get the list of my hosts, but it contains two entries for each node (one with the plain hostname and one with the FQDN):

HTTP/1.1 200 OK
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Cache-Control: no-store
Pragma: no-cache
Set-Cookie: AMBARISESSIONID=7pm102h29dbu1670zcr416aq9;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
User: admin
Content-Type: text/plain
Vary: Accept-Encoding, User-Agent
Content-Length: 1017
Server: Jetty(8.1.19.v20160209)


{
  "href" : "http://localhost:8080/api/v1/hosts",
  "items" : [
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-1",
      "Hosts" : {
        "cluster_name" : "TestCluster",
        "host_name" : "hdp-1"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-1.novalocal",
      "Hosts" : {
        "host_name" : "hdp-1.novalocal"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-2",
      "Hosts" : {
        "cluster_name" : "TestCluster",
        "host_name" : "hdp-2"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-2.novalocal",
      "Hosts" : {
        "host_name" : "hdp-2.novalocal"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-3",
      "Hosts" : {
        "cluster_name" : "TestCluster",
        "host_name" : "hdp-3"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/hosts/hdp-3.novalocal",
      "Hosts" : {
        "host_name" : "hdp-3.novalocal"
      }
    }
  ]
}
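
The pairs in the response above can be spotted mechanically by comparing each registered name with the part before its first dot. The following is only a minimal shell sketch, not an Ambari tool: the function name `find_fqdn_duplicates` is hypothetical, and it assumes the host names have already been extracted from the response, one per line.

```shell
# Hypothetical helper (not part of Ambari): reads host names on stdin, one per
# line, and prints every dotted name whose short form (the text before the
# first dot) is also present in the list.
find_fqdn_duplicates() {
  awk '
    { names[$0] = 1; order[NR] = $0 }
    END {
      for (i = 1; i <= NR; i++) {
        name = order[i]
        dot = index(name, ".")
        if (dot > 1 && (substr(name, 1, dot - 1) in names))
          print name
      }
    }
  '
}
```

Assuming the pretty-printed JSON layout shown above, the names could be fed in with something like `curl -s ... | grep '"host_name"' | cut -d'"' -f4 | find_fqdn_duplicates`; a proper JSON parser would be more robust than this grep/cut approach.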

This seems to confuse the ambari-server and the ambari-agents, so that no heartbeat is received anymore. How can I solve this issue? My cluster is no longer usable, as the services miss the heartbeat. Thank you!

Update: I just noticed that I had set the hostname to hdp-x before the ambari-server install, e.g.:

sudo hostname hdp-1

When I restart a node, it has its "old" hostname again:

hdp-1.novalocal

I ran "sudo hostname hdp-1" again, but it didn't help. Is that because the ambari-server and the ambari-agents start automatically after boot? Stopping and restarting the services after this "hostname hdp-1" command didn't help either!

1 ACCEPTED SOLUTION

Master Mentor

@Daniel Müller

This happens if you have changed the hostname of your cluster nodes after the Ambari cluster installation. Suppose the hostname of a host was initially "hdp-3.novalocal": when the agent first started on that host, it was registered in the Ambari DB under the name "hdp-3.novalocal". If you change the agent hostname to "hdp-3" a few days later, a new host is registered with the Ambari cluster (the machine is the same, but its hostname differs from the earlier one).

First, stop the Ambari Server:

# ambari-server stop

Then take a dump of the Ambari DB:

# pg_dump -U ambari ambari > /tmp/ambari_bkp.sql

Next, clean the unwanted hosts out of the DB. First get the "host_id" of each host that you want to remove. Connect to the Ambari DB:

# psql -U ambari ambari
Password: bigdata

Queries to find the host_id:

select host_id from hosts where host_name='hdp-1.novalocal';
select host_id from hosts where host_name='hdp-2.novalocal';
select host_id from hosts where host_name='hdp-3.novalocal';

The queries above return the "host_id" values. Since these hosts need to be deleted, delete them as follows. Suppose the host_ids are 111, 222 and 333, respectively:

delete from execution_command where task_id in (select task_id from host_role_command where host_id in (111,222,333));
delete from host_version where host_id in (111,222,333);
delete from host_role_command where host_id in (111,222,333);
delete from serviceconfighosts where host_id in (111,222,333);
delete from hoststate where host_id in (111,222,333);

delete from hosts where host_name in ('hdp-1.novalocal');
delete from hosts where host_name in ('hdp-2.novalocal');
delete from hosts where host_name in ('hdp-3.novalocal');

delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('hdp-1.novalocal'));
delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('hdp-2.novalocal'));
delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('hdp-3.novalocal'));
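
Because the statements above touch several tables, it can be safer to generate them as one script, review it, and run it in a single transaction so that a failure part-way leaves the DB unchanged. This is only a sketch under assumptions: the function name `print_cleanup_sql` is hypothetical, the statements are copied from the list above in the same order, and the BEGIN/COMMIT wrapper is an addition that is not part of the original procedure.

```shell
# Hypothetical helper: prints the cleanup statements so they can be reviewed
# before being passed to psql. Nothing is executed against the DB here.
#   $1 = comma-separated host_id values, e.g. "111,222,333"
#   $2 = comma-separated, single-quoted host names,
#        e.g. "'hdp-1.novalocal','hdp-2.novalocal'"
print_cleanup_sql() {
  ids=$1
  names=$2
  # Refuse anything but digits and commas in the id list.
  case $ids in
    ''|*[!0-9,]*) echo "invalid host_id list: $ids" >&2; return 1 ;;
  esac
  cat <<EOF
BEGIN;
delete from execution_command where task_id in (select task_id from host_role_command where host_id in ($ids));
delete from host_version where host_id in ($ids);
delete from host_role_command where host_id in ($ids);
delete from serviceconfighosts where host_id in ($ids);
delete from hoststate where host_id in ($ids);
delete from hosts where host_name in ($names);
delete from alert_current where history_id in (select alert_id from alert_history where host_name in ($names));
COMMIT;
EOF
}
```

For example, `print_cleanup_sql "111,222,333" "'hdp-1.novalocal','hdp-2.novalocal','hdp-3.novalocal'" > cleanup.sql` writes the script to a file (the file name is arbitrary), which could then be run with `psql -U ambari ambari -v ON_ERROR_STOP=1 -f cleanup.sql` while the Ambari Server is stopped.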

Then restart the Ambari Server:

# ambari-server start

NOTE: I wrote an article about the changing-hostname issue some time back. Please refer to it; it explains why this happens in cloud environments and how to fix it.

https://community.hortonworks.com/articles/42872/why-ambari-host-might-have-different-public-host-n....

5 REPLIES 5

Expert Contributor

Thank you @Jay SenSharma for the very fast answer. I just found out that the wrong hostnames come from the ambari-agents! I did a

curl ... DELETE .../hosts/hdp-1.novalocal

and everything looked good afterwards (with the agents stopped). When I restarted the agents, the FQDNs showed up in the GET request again! Can I still use your solution for that, or is it a different problem?

Another question: is it possible to set the hostname that the agents use? Or which information do they rely on (the file /etc/hostname, hostname, or hostname -f)? As I'm not a Linux expert, I'm thankful for any help.

Master Mentor

@Daniel Müller

Yes, deleting the unwanted hosts through the Ambari API is also a correct option. The DB queries simply clean all information about the old hostnames out of the DB completely. Both options are valid.
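
For the API route, the requests can be prepared per host before anything is sent. The sketch below only prints the curl commands; it assumes the same `/api/v1/hosts/<name>` path and `X-Requested-By` header as the GET request in the question, and the names `AMBARI_URL`, `AMBARI_USER` and `print_delete_requests` are made up for this example. With `-u` given a user name only, curl prompts for the password.

```shell
# Assumed defaults for this sketch; these are not Ambari settings.
AMBARI_URL=${AMBARI_URL:-http://localhost:8080}
AMBARI_USER=${AMBARI_USER:-admin}

# Hypothetical helper: reads host names on stdin (one per line) and prints,
# without running it, one DELETE request per host.
print_delete_requests() {
  while IFS= read -r host; do
    [ -n "$host" ] || continue   # skip empty lines
    printf 'curl -i -H "X-Requested-By: ambari" -u %s -X DELETE %s/api/v1/hosts/%s\n' \
      "$AMBARI_USER" "$AMBARI_URL" "$host"
  done
}
```

Printing first makes it easy to check that only the stale FQDN entries are targeted. As noted in the question, the entries come back if the agents are restarted while they still report the FQDN.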

Regarding changing the agent hostname permanently:

** Permanently fix the public hostname: (Recommended)

1. Create a file named "/var/lib/ambari-agent/public_hostname.sh" and add the following lines to it:

#!/bin/sh
  echo `hostname -f`

2. Make sure that the file "/var/lib/ambari-agent/public_hostname.sh" has proper execute permission. Example:

  chmod 755 "/var/lib/ambari-agent/public_hostname.sh"

3. On every ambari-agent host edit the file "/etc/ambari-agent/conf/ambari-agent.ini" and in the [agent] section add the following line:

  ## Added following to customize the public hostname
  public_hostname_script=/var/lib/ambari-agent/public_hostname.sh

NOTE: Users can also use the property "hostname_script" to customize the internal hostname.

4. Make sure that the changes are pushed to all the hosts in the Ambari cluster.

5. Now restart the agents:

  ambari-agent restart
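
Since step 3 has to be repeated on every agent host, the ini edit can be scripted. The following is a rough sketch rather than an official procedure: the function name `set_public_hostname_script` is hypothetical, and it assumes the ini file already contains an `[agent]` section header at the start of a line. If the key is already present, the file is left untouched.

```shell
# Hypothetical helper: adds "public_hostname_script=<path>" directly below the
# [agent] header of the given ini file, unless the key is already set.
#   $1 = path to ambari-agent.ini
#   $2 = script path (defaults to the path used in the steps above)
set_public_hostname_script() {
  ini=$1
  script=${2:-/var/lib/ambari-agent/public_hostname.sh}
  if grep -q '^[[:space:]]*public_hostname_script[[:space:]]*=' "$ini"; then
    return 0   # already configured
  fi
  tmp=$(mktemp) || return 1
  # Copy every line; after the [agent] header, emit the new setting.
  if awk -v line="public_hostname_script=$script" '
       { print }
       /^\[agent\]/ { print line }
     ' "$ini" > "$tmp"; then
    cat "$tmp" > "$ini"   # overwrite in place, keeping the file permissions
  fi
  rm -f "$tmp"
}
```

On each agent host this would be called as `set_public_hostname_script /etc/ambari-agent/conf/ambari-agent.ini`, followed by the agent restart above. Taking a copy of the ini file first is a sensible precaution.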

Master Mentor

@Daniel Müller

For more information, please refer to: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-reference/content/how_to_customiz...

hostname_script=/var/lib/ambari-agent/hostname.sh
public_hostname_script=/var/lib/ambari-agent/public_hostname.sh

Expert Contributor

Removing the "hdp-1.novalocal" from the hosts list and using the hostname script for setting the public / private hostname did it for me! Thank you so much, I think you saved my whole week!