All Hadoop services (NameNode, ResourceManager, ZooKeeper Failover Controller) stopped on one head node in my Hadoop cluster

New Contributor

My production Hadoop cluster has 2 head nodes. Yesterday a failover happened from head node 1 to head node 0.

Head node 0 is now in active mode and head node 1 is in standby mode. After the failover, all the services were up and running on head node 0 (active), but on head node 1 all the services stayed in a stopped state for a long time (approximately 1 hour).

I then started all the services manually through Ambari. Now all the services are up and running on both head nodes.

Before starting the services, I checked the following:

1) I used the ping command to see whether head node 1 was able to communicate with the other nodes in the cluster. Result: success.

2) I checked the Ambari agent status to see whether the agent was running on head node 1. Result: success.

3) I checked Hive connectivity through beeline, using the ODBC connection string. Result: I was able to connect.

I am a fresher and new to the Hadoop world, so I don't have much troubleshooting knowledge yet; I am learning day by day.

I would like to know the reason behind this issue.

I would appreciate it if anybody could explain what happened.

Thanks in advance!

I started the following services manually:

1) Standby NameNode

2) Standby ResourceManager

3) ZooKeeper Failover Controller

4) HiveServer2

5) Hive Metastore

6) Oozie

7) WebHCat Server

8) Ambari Metrics Monitor


Re: All Hadoop services (NameNode, ResourceManager, ZooKeeper Failover Controller) stopped on one head node in my Hadoop cluster

Rising Star

@ss00552277 - Did you check the logs for those services, and did you find any error messages? For example, for Hive, check the "/var/log/hive/" log directory.

I would also suggest checking the host itself. Since all your services were down on one host, there may have been an issue on the OS side as well. Check /var/log/messages and dmesg on the host.
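
If it helps, here is a rough sketch of that log check in Python, assuming Python 3 is available on the head node. The function names (`suspicious_lines`, `scan`) and the keyword list are my own illustration, not anything shipped with Hadoop or Ambari, and the keywords are only a guess at what a shutdown or OS-level kill might leave behind. Adjust them to whatever your logs actually contain.

```python
import re
import sys
from pathlib import Path

# Hypothetical keyword list: strings that often show up around a service
# shutdown or an OS-level kill. This is an assumption, not a complete list.
PATTERN = re.compile(
    r"ERROR|FATAL|Out of memory|oom-killer|SIGTERM|shutting down",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines that match PATTERN."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


def scan(path):
    """Scan one log file; return an empty list if the file does not exist."""
    log_file = Path(path)
    if not log_file.is_file():
        return []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with log_file.open(errors="replace") as handle:
        return suspicious_lines(handle)


if __name__ == "__main__":
    # Log paths are passed on the command line.
    for log_path in sys.argv[1:]:
        for number, text in scan(log_path):
            print(f"{log_path}:{number}: {text}")
```

Saved as, say, `scan_logs.py` (a made-up name), it could be run against /var/log/messages and the files under /var/log/hive/. Reading /var/log/messages usually needs root. Compare the timestamps of any hits with the time of the failover to see which event came first.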
