Created 11-24-2017 12:42 PM
hi,
By mistake we created a new worker node (worker11, with DataNode & NameNode) while the cluster already had a worker11.
Only when we started the new worker11 did we notice that another node named worker11 already existed.
So the problem is that we erased the OS of the old worker11 (without deleting it from the cluster), and the new worker11 stayed in the cluster.
After a couple of hours we noticed that heartbeats were being lost on the new worker11, and after some time worker11 crashed, so we had to boot the machine again, and so on.
It is clear that all the problems on worker11 are because it is a duplicate machine (while the old worker11 machine was removed from the VM center (OS)).
So what is the workaround for the worker11 node, so that the Ambari cluster accepts this machine without problems?
Created 11-24-2017 12:50 PM
Usually "ambari-agent" starts sending the heartbeat messages to the ambari-server when it is started.
So if you do not want your old "worker11" to send registration requests to the Ambari Server, please stop the ambari-agent on that host, or remove the ambari-server hostname entry from the "/etc/ambari-agent/conf/ambari-agent.ini" file of that particular agent.
# ambari-agent stop
# mv /etc/ambari-agent/conf/ambari-agent.ini /etc/ambari-agent/conf/ambari-agent.ini.unwanted
# ambari-agent stop    (just to make sure that the stop has already been performed)
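The same steps can be wrapped in a small script so they are safe to re-run on the old host. This is only a sketch: the function names `disable_agent_config` and `retire_agent` are made up for illustration, and the default path assumes a standard agent install under /etc/ambari-agent/conf.

```shell
#!/usr/bin/env bash

# Move an agent config file aside so the agent can no longer read its
# server settings. The ".unwanted" suffix mirrors the commands above.
disable_agent_config() {
    local conf="$1"
    if [ -f "$conf" ]; then
        mv "$conf" "$conf.unwanted"
    else
        echo "no config at $conf, nothing to rename" >&2
    fi
}

# Stop the agent if the command is installed, then move its config aside.
# The default path is an assumption about a standard install.
retire_agent() {
    local conf="${1:-/etc/ambari-agent/conf/ambari-agent.ini}"
    if command -v ambari-agent >/dev/null 2>&1; then
        ambari-agent stop || true
    fi
    disable_agent_config "$conf"
}
```

Running `retire_agent` as root on the unwanted host would perform both steps; calling it a second time just reports that there is nothing left to rename.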
If the above does not work, then as an alternative we might need to delete the old worker11 host entry from the database directly.
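Before touching the database, it may be worth trying the Ambari REST API, which is a different route from the direct database edit mentioned above. The sketch below assumes the server listens on plain HTTP on the default port 8080; the function names and the server, cluster and user values are placeholders, and Ambari may refuse the request while components are still registered on the host.

```shell
#!/usr/bin/env bash

# Build the REST URL of a host resource inside a cluster.
# Assumes plain HTTP on the default Ambari port 8080.
ambari_host_url() {
    local server="$1" cluster="$2" host="$3"
    printf 'http://%s:8080/api/v1/clusters/%s/hosts/%s\n' "$server" "$cluster" "$host"
}

# Ask the Ambari server to remove the host from the cluster.
# curl prompts for the password of the given Ambari user.
delete_stale_host() {
    local user="$1" server="$2" cluster="$3" host="$4"
    curl -u "$user" -H 'X-Requested-By: ambari' -X DELETE \
        "$(ambari_host_url "$server" "$cluster" "$host")"
}
```

For example, `delete_stale_host admin ambari.example.com mycluster worker11` would send the DELETE request for worker11; all four values here are hypothetical.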
Created 11-26-2017 01:14 PM
@Jay, first, thanks a lot for the great support. We actually solved it by reconfiguring the worker with its previous IP and then restarting the worker host. After the server came up, the DataNode showed as alive on all workers and the worker is part of the cluster.