
Ambari Services Error in Cluster

Contributor

Hi All,

When I go to the cluster's services, most of the services have failed. When I try to restart each service, it takes a long time to start. Any help would be appreciated.

services-error-cluster.png

Best Regards

David Yee

1 ACCEPTED SOLUTION

Expert Contributor

Hi @David Yee,

Below are two solutions I can suggest to resolve your problem:

1. You can use the private IP address instead of the public IP address [i.e. 52.77.231.10] in the /etc/hosts file. No further changes should be required [assuming you used hostnames while registering the hosts with Ambari]. Changing the hosts file should be enough to get the services started; if not, restart the cluster for the changes to take effect. Please find the attached screenshot, and see the example entry below it.

1579-ec2-private-ip-address.png
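
As a rough illustration only, the /etc/hosts entry could look something like this, assuming the private IP is 172.30.1.137 and using the internal hostname that appears in the NameNode error later in this thread (substitute your own values):

# map the hostname to the private IP rather than the public one (52.77.231.10)
172.30.1.137   ip-172-30-1-137.ap-southeast-1.compute.internal   ip-172-30-1-137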

2. You can use an Elastic IP for your EC2 instances, so that the IP will not change even if the instance is restarted. See the link below for assigning an Elastic IP to EC2 instances:

http://docs.aws.amazon.com/AmazonVPC/latest/GettingStartedGuide/getting-started-assign-eip.html

[Note: Elastic IPs are not free. Make sure you check the AWS pricing model for them.]
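
If you prefer the AWS CLI over the console, a minimal sketch could look like the following; the instance ID and allocation ID are placeholders, so replace them with your own:

# allocate a new Elastic IP in the VPC
aws ec2 allocate-address --domain vpc
# associate it with the instance, using the allocation ID returned above
aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-0123456789abcdef0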


16 REPLIES

Master Mentor

@David Yee HDFS is in safe mode, and the rest of the services will start once it leaves safe mode. This is by design. Please post the logs of any service that fails.
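
As a quick check, you can ask the NameNode whether it is still in safe mode with something like this (run on the NameNode host; assumes you can sudo to the hdfs user):

# report whether the NameNode is still in safe mode
sudo -u hdfs hdfs dfsadmin -safemode get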

Contributor

Hi Artem,

When I tried to start all the components, it failed on the "NameNode Start" step with the following error:

resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-ip-172-30-1-137.ap-southeast-1.compute.internal.out

Best Regards

David

Master Mentor
@David Yee

Please post the output of the NameNode log in /var/log/hadoop/hdfs/.
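
The .out file mentioned in the error usually only contains the ulimit banner; the actual failure is normally in the matching .log file. Something like this should show the relevant part (filename derived from the .out path in the error above):

tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-ip-172-30-1-137.ap-southeast-1.compute.internal.log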

Contributor

Hi Artem,

I hope I am sending you the right file.

hadoop-hdfs-namenode-ip-172-30-1-137ap-southeast-1.zip

Best Regards

David Yee

Contributor

Hi Artem,

Can you elaborate more on this?

Best Regards

David

Master Mentor
@David Yee

Log in to the NameNode host and run hadoop dfsadmin -safemode leave (see the sketch after the steps below),

then

1) Bring up the core services, i.e. HDFS, YARN and MapReduce

2) Bring up other services like Hive and so on
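
A minimal sketch of the first step, assuming it is run as the hdfs service user on the NameNode host:

# force the NameNode out of safe mode
sudo -u hdfs hadoop dfsadmin -safemode leave
# confirm it now reports "Safe mode is OFF"
sudo -u hdfs hdfs dfsadmin -safemode get

Once safe mode is off, start HDFS, YARN and MapReduce from Ambari before the remaining services.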

Contributor

I am getting the following error message:

[root@ip-172-30-1-137 sbin]# hdfs dfsadmin -safemode leave
safemode: Call From ip-172-30-1-137.ap-southeast-1.compute.internal/52.77.231.10 to ip-172-30-1-137.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Master Mentor

@David Yee Start the namenode process and see what happens.
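
For example, something along these lines on the NameNode host; the paths are the ones Ambari used in the error above, and this is just a manual start to surface the real error, not a replacement for starting it from Ambari:

# is anything listening on the NameNode RPC port yet?
netstat -tlnp | grep 8020
# start the NameNode by hand with the same command Ambari uses
sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode
# then watch the NameNode log for the actual failure
tail -f /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log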

Contributor

[root@ip-172-30-1-137 sbin]# hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it.

safemode: Call From ip-172-30-1-137.ap-southeast-1.compute.internal/52.77.231.10 to ip-172-30-1-137.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Your command returned this error; that's the reason I am using hdfs instead.