Created on 11-18-2016 11:13 AM - edited 09-16-2022 03:48 AM
Hi Team,
Can anyone help me with this issue? I am getting the error shown in the logs below in the "Cluster Install wizard" on the Ambari dashboard.
I am trying to build a two-node cluster: one node is the Ambari server host and the other is an agent. I am able to register the server node,
and I can see a log message on the dashboard like
Registering with the server... Registering with the server...
But the second node (the agent node) stays at the "Installing" message for about 300 seconds, and then the dashboard shows the failure message below (please refer to the logs below).
The two machines can connect to each other, passwordless SSH works in both directions, the FQDNs are set, the ulimits are set, and the firewalls are disabled.
Can anyone help with this?
==========================
Creating target directory...
==========================
Command start time 2016-11-18 15:14:34
chmod: cannot access ‘/var/lib/ambari-agent/data’: No such file or directory
Connection to impetus-n362u.impetus.co.in closed.
SSH command execution finished
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:35

==========================
Copying common functions script...
==========================
Command start time 2016-11-18 15:14:35
scp /usr/lib/python2.6/site-packages/ambari_commons
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:35

==========================
Copying OS type check script...
==========================
Command start time 2016-11-18 15:14:35
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:35

==========================
Running OS type check...
==========================
Command start time 2016-11-18 15:14:35
Cluster primary/cluster OS family is ubuntu14 and local/current OS family is ubuntu14
Connection to impetus-n362u.impetus.co.in closed.
SSH command execution finished
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:36

==========================
Checking 'sudo' package on remote host...
==========================
Command start time 2016-11-18 15:14:36
sudo install
Connection to impetus-n362u.impetus.co.in closed.
SSH command execution finished
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:36

==========================
Copying repo file to 'tmp' folder...
==========================
Command start time 2016-11-18 15:14:36
scp /etc/apt/sources.list.d/ambari.list
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:36

==========================
Moving file to repo dir...
==========================
Command start time 2016-11-18 15:14:36
Connection to impetus-n362u.impetus.co.in closed.
SSH command execution finished
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:37

==========================
Changing permissions for ambari.repo...
==========================
Command start time 2016-11-18 15:14:37
Connection to impetus-n362u.impetus.co.in closed.
SSH command execution finished
host=impetus-n362u.impetus.co.in, exitcode=0
Command end time 2016-11-18 15:14:37

==========================
Update apt cache of repository...
==========================
Command start time 2016-11-18 15:14:37

Automatic Agent registration timed out (timeout = 300 seconds). Check your network connectivity and retry registration, or use manual agent registration.
Created 11-18-2016 09:47 PM
There seems to be a problem with the apt cache update step.
Try running apt-cache policy ambari-server on the host where it is failing and see whether the command completes.
You may need to check the proxy settings if you are using public repos.
Once you resolve this, you should be able to register the Ambari agent.
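For the proxy check suggested above, a minimal Python sketch follows. It lists proxy-related environment variables and looks for one-line Acquire::<scheme>::Proxy entries in apt configuration text. The helper names proxy_env and apt_proxy_lines are made up for illustration, and the sketch assumes the one-line apt.conf form only (nested Acquire { ... } blocks are not handled).

```python
import os
import re

# Proxy-related variables commonly honoured by command-line tools.
PROXY_VARS = ("http_proxy", "https_proxy", "ftp_proxy", "no_proxy")

# Matches one-line entries such as: Acquire::http::Proxy "http://host:3128/";
APT_PROXY_RE = re.compile(
    r'^\s*Acquire::(\w+)::Proxy\s+"([^"]*)"\s*;', re.IGNORECASE
)


def proxy_env(environ=None):
    """Return the proxy variables that are set, in lower or upper case."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.upper()):
            value = environ.get(key)
            if value:
                found[key] = value
    return found


def apt_proxy_lines(conf_text):
    """Return (scheme, proxy_url) pairs found in apt configuration text."""
    pairs = []
    for line in conf_text.splitlines():
        match = APT_PROXY_RE.match(line)
        if match:
            pairs.append((match.group(1).lower(), match.group(2)))
    return pairs
```

Calling proxy_env() with no argument inspects the current shell environment; apt_proxy_lines() can be fed the contents of /etc/apt/apt.conf or of a file under /etc/apt/apt.conf.d/ read on the agent host.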
Created 11-19-2016 09:09 AM
Thanks for your response. I am not using public repos; I have set up local repos under /var/www/html on the Ambari server host (Ubuntu). Could there be a problem with networking or the proxy? Both machines can ping each other and can also log in to each other without a password.
Created 11-19-2016 07:51 PM
It's very evident that there is a problem with your local repository.
The last failed task is "Update apt cache of repository...". Can you try running "apt-cache policy ambari-server" on the server and see whether the command completes?
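To see whether a command completes within the 300-second window, it can be run under a time limit. The sketch below uses Python's subprocess.run with its timeout argument; the function name run_with_limit is hypothetical, and which command to time is left to the caller.

```python
import subprocess
import time


def run_with_limit(command, limit_seconds):
    """Run a command under a time limit.

    Returns (completed, exit_code, elapsed_seconds). If the limit is hit,
    the child process is killed and exit_code is None.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=limit_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, None, time.monotonic() - start
    return True, result.returncode, time.monotonic() - start
```

For example, run_with_limit(["apt-get", "update"], 300) on the agent host (as root) would show whether an apt update finishes inside the limit. That assumes apt-get update is a reasonable stand-in for the step that stalled; the exact command Ambari runs may differ.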
Created 11-20-2016 09:18 AM
Thanks @rgangappa for your answer.
As I mentioned in my question above, I was able to register one of the two nodes (my ambari-server host). In the logs of that successful registration I can see that the apt cache step ran fine and completed. On the second node, however, the apt cache step takes very long and times out after 300 seconds. Please give me some pointers.
Could this be a networking problem?
I have disabled the proxy and the firewall too.
Created 11-22-2016 05:14 AM
Thanks, everyone, for your active support. It was a network proxy issue on my side. I disabled the proxy, and now registration of the hosts and installation of the services have completed.
There is one remaining issue. I have installed about 12-13 services (MapReduce, Hive, Spark, Pig, and so on), and the services keep toggling from started to stopped on their own. When I start them manually they come up again. Is there a specific reason for this?
The logs only show that the connection to host:port failed.
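A "connection to host:port failed" message can be narrowed down by testing the TCP endpoint directly from the node that reports it. A minimal Python sketch, with hypothetical helper names, could look like this:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_endpoints(endpoints, timeout=5.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

The host and port to pass in are the ones shown in the failing service's log; none are assumed here. A False result when run from one node but True from another points at name resolution, routing, or a proxy or firewall between the two.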
Created 11-20-2016 12:40 PM
You could try increasing the bootstrap timeout in "/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py". The default value is 300, so try a somewhat higher value (such as 600). If that does not solve the issue either, there is a serious network problem causing the slowness, and you should address that.
Example:
HOST_BOOTSTRAP_TIMEOUT = 600
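Before editing, the value currently in effect can be read back from the file. The sketch below is only an illustration: read_bootstrap_timeout is a made-up helper, and it assumes the constant is assigned as a plain integer literal at the start of a line, as in the example above.

```python
import re

# Assumes a top-level assignment such as: HOST_BOOTSTRAP_TIMEOUT = 300
TIMEOUT_RE = re.compile(r"^HOST_BOOTSTRAP_TIMEOUT\s*=\s*(\d+)", re.MULTILINE)


def read_bootstrap_timeout(source_text):
    """Return the timeout as an int, or None if no such assignment is found."""
    match = TIMEOUT_RE.search(source_text)
    return int(match.group(1)) if match else None
```

Pass it the text of bootstrap.py, for example the result of open(path).read() on the Ambari server host. A None result means the assignment is not in the assumed form and the file should be inspected by hand.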