
In Ambari all my services are down

Explorer

All my services are down. Ambari shows the following alerts:

HDFS - DataNode Process - CRIT for 35 minutes
Connection failed: [Errno 111] Connection refused to gaian-lap386.com:50010

HDFS - NFS Gateway Process - CRIT for 35 minutes
Connection failed: [Errno 111] Connection refused to gaian-lap386.com:2049

HDFS - NameNode Web UI - CRIT for 35 minutes
Connection failed to http://gaian-lap386.com:50070 (urlopen error timed out)

HDFS - Secondary NameNode Process - CRIT for 35 minutes
Connection failed to http://gaian-lap386.com:50090 (urlopen error timed out)

HDFS - DataNode Web UI - CRIT for 34 minutes
Connection failed to http://gaian-lap386.com:50075 (urlopen error timed out)


Re: In Ambari all my services are down

Super Mentor

@Manoj690
All of these are "Connection failed to xxxx" messages, which are an effect of the issue, not the actual issue.

You will need to check the logs of the individual components to find out whether they are throwing any specific error or exception.

 

For example:

Check whether the NameNode process is running:

# ps -ef | grep -i NameNode

Check whether the NameNode is listening on port 50070:

# netstat -tnlpa | grep 50070

 

If it is not, please check the NameNode log and, if possible, attach the NameNode and DataNode logs here.
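As an illustration only (the log directory below is an assumption based on the usual HDP default of /var/log/hadoop/hdfs, and the file name includes your hostname; adjust both to match your install):

# ls -lrt /var/log/hadoop/hdfs/
# grep -iE 'error|exception' /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log | tail -n 50

The first command lists the HDFS log files by modification time; the second pulls the most recent errors or exceptions out of the NameNode log.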

 


Then repeat the same troubleshooting steps for the DataNode and the other processes such as the NFS Gateway, for example as sketched below.
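A rough sketch for the DataNode, with the same caveat that the log location is the typical HDP default and may differ on your cluster (50010 is the DataNode port from your alert):

# ps -ef | grep -i DataNode
# netstat -tnlpa | grep 50010
# grep -iE 'error|exception' /var/log/hadoop/hdfs/hadoop-hdfs-datanode-*.log | tail -n 50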

Re: In Ambari all my services are down

Explorer

This is the error:

 

2019-08-21 16:13:10,828 - The NameNode is still in Safemode. Please be careful with commands that need Safemode OFF.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/namenode.py", line 408, in <module>
    NameNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/namenode.py", line 138, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/hdfs_namenode.py", line 264, in namenode
    create_hdfs_directories(name_service)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/hdfs_namenode.py", line 336, in create_hdfs_directories
    nameservices=name_services
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 677, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 674, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 373, in action_delayed
    self.action_delayed_for_nameservice(None, action_name, main_resource)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 395, in action_delayed_for_nameservice
    self._assert_valid()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 334, in _assert_valid
    self.target_status = self._get_file_status(target)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 497, in _get_file_status
    list_status = self.util.run_command(target, 'GETFILESTATUS', method='GET', ignore_status_codes=['404'], assertable_result=False)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 214, in run_command
    return self._run_command(*args, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 282, in _run_command
    _, out, err = get_user_call_output(cmd, user=self.run_user, logoutput=self.logoutput, quiet=False)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/get_user_call_output.py", line 62, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
resource_management.core.exceptions.ExecutionFailed: Execution of 'curl -sS -L -w '%{http_code}' -X GET -d '' -H 'Content-Length: 0' 'http://gaian-lap386.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs' 1>/tmp/tmpPLc7iw 2>/tmp/tmpCp00V8' returned 7. curl: (7) Failed to connect to gaian-lap386.com port 50070: Connection refused 000
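The step that fails here is Ambari's WebHDFS call against the NameNode on port 50070. If you want to reproduce it by hand, the request below is taken directly from the traceback (nothing new is assumed); as long as the NameNode is down it will return the same "Connection refused":

# curl -sS -L 'http://gaian-lap386.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs'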

 

After I ran the safemode leave command below, this is the result I got:

 

hdfs dfsadmin -safemode leave
safemode: Call From gaian-lap386.com/192.168.24.32 to gaian-lap386.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
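
The "Connection refused" to gaian-lap386.com:8020 means the NameNode RPC endpoint itself is not reachable, so hdfs dfsadmin cannot talk to it at all; safemode is not the root cause here. A minimal way to confirm this, assuming 8020 is the fs.defaultFS port from your core-site.xml (it is the port shown in the error):

# ps -ef | grep -i NameNode
# netstat -tnlpa | grep 8020
# hdfs dfsadmin -safemode get

If the first two commands show nothing, the NameNode process is not running, and the real failure should be visible in the NameNode log as suggested above.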