My production Hadoop cluster has two head nodes. Yesterday a failover happened from head node 1 to head node 0.
Head node 0 is now active and head node 1 is in standby. After the failover, all services were up and running on head node 0 (active), but on head node 1 all services stayed in the stopped state for a long time (approximately 1 hour).
I then started all the services manually through Ambari. Now all the services are up and running on both head nodes.
Before starting the services I checked the following:
1) I used the ping command to check whether head node 1 could communicate with the other nodes in the cluster. Result: Success.
2) I checked the Ambari agent status to see whether the agent was running on head node 1. Result: Success.
3) I checked Hive connectivity through beeline, using the ODBC connection string. Result: I was able to connect.
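For anyone who wants to repeat the three checks above, they could be scripted roughly as follows. This is only a sketch: the function names `run_check` and `check_head_node` are made up for illustration, the host name and URL are placeholders you pass in, and it assumes each tool signals failure through a non-zero exit code. beeline itself takes a JDBC URL, so the second argument should be whatever connection URL works for your cluster.

```shell
# Run a command quietly and report the outcome (hypothetical helper).
run_check() {
  label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "$label: Success"
  else
    echo "$label: Failed"
    return 1
  fi
}

# Repeat the three manual checks for one head node.
# $1 = host name of the node to ping (placeholder)
# $2 = JDBC URL for HiveServer2 (placeholder)
check_head_node() {
  host="$1"
  jdbc_url="$2"
  # 1) Network reachability
  run_check "ping $host" ping -c 3 "$host"
  # 2) Ambari agent status (run this on the head node itself)
  run_check "ambari-agent status" ambari-agent status
  # 3) Hive connectivity through beeline
  run_check "beeline connect" beeline -u "$jdbc_url" -e "show databases;"
}
```

It would be called on head node 1 as `check_head_node <host> <jdbc-url>`, with both values filled in for your environment.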
I am a fresher and new to the Hadoop world, so I don't have much knowledge of troubleshooting yet; I am learning day by day.
I would like to know the reason behind this issue.
I would appreciate it if anybody could explain why this happened.
Thanks in advance!
I started the following services manually:
1) Standby NameNode
2) Standby ResourceManager
3) ZooKeeper Failover Controller
8) Ambari Metrics Monitor
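To confirm which NameNode and ResourceManager are active or standby after starting them, something like the sketch below might help. It wraps `hdfs haadmin -getServiceState` and `yarn rmadmin -getServiceState`. The IDs `nn1`, `nn2`, `rm1` and `rm2` are assumptions; the real ones are whatever is configured under `dfs.ha.namenodes.<nameservice>` and `yarn.resourcemanager.ha.rm-ids` on the cluster. The helper names are also made up for illustration.

```shell
# Print "<label>: <state>", where <state> is the last line the given
# command prints, or "unknown" if it prints nothing (hypothetical helper).
ha_state() {
  label="$1"; shift
  state="$("$@" 2>/dev/null | tail -n 1)"
  echo "$label: ${state:-unknown}"
}

# Query both NameNodes and both ResourceManagers.
# nn1/nn2 and rm1/rm2 are assumed IDs; replace them with the configured ones.
show_ha_states() {
  ha_state "NameNode nn1" hdfs haadmin -getServiceState nn1
  ha_state "NameNode nn2" hdfs haadmin -getServiceState nn2
  ha_state "ResourceManager rm1" yarn rmadmin -getServiceState rm1
  ha_state "ResourceManager rm2" yarn rmadmin -getServiceState rm2
}
```

Running `show_ha_states` as a user with HDFS and YARN admin rights should print one line per service.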
One of my team members deleted the timeline server log file on the production Hadoop cluster. The file was in the following path: /var/log/Hadoop-yarn/yarn/. He ran the command rm yarn-yarn-timelineserver-hn0-myprha.log on head node 0 of the cluster. We want to recover the log file to the same path. Can anybody help with this? Thanks in advance!
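One possible approach, sketched below, only works under specific assumptions: the node runs Linux, the timeline server process has not been restarted since the `rm`, and it still holds the deleted log file open. In that case the kernel keeps the data reachable through `/proc/<pid>/fd/`. The function names are made up for illustration, and the script needs to run as root or as the user that owns the process.

```shell
# Print every /proc/<pid>/fd/<n> entry that still points at a deleted
# file whose path ends with the given file name.
find_deleted_fds() {
  name="$1"
  for fd in /proc/[0-9]*/fd/*; do
    target="$(readlink "$fd" 2>/dev/null)" || continue
    case "$target" in
      *"/$name (deleted)") echo "$fd" ;;
    esac
  done
}

# Copy the contents of the first matching open handle to a destination path.
recover_deleted_file() {
  name="$1"
  dest="$2"
  src="$(find_deleted_fds "$name" | head -n 1)"
  if [ -z "$src" ]; then
    echo "no open handle found for $name" >&2
    return 1
  fi
  cp "$src" "$dest"
}
```

For the file in question this would be `recover_deleted_file yarn-yarn-timelineserver-hn0-myprha.log <destination>`. If no process holds the file open any more, the function reports that and this route is not available.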