Created 08-05-2018 04:10 PM
Created 08-05-2018 04:11 PM
Need the help of Geoffrey Shelton Okot
Created 08-05-2018 06:47 PM
Please post more details. At a minimum, click on 'start operation' and paste the logs it shows there for any operation. Does starting an individual service work?
Created 08-06-2018 11:53 AM
@Tusar Mohanty if you are having major issues such as those seen in your screenshot, you will need to troubleshoot the actual errors. Here are a few ideas to try:
It is important to note that when you click into modals for 1 & 2, there are always deeper clicks, which open additional modals. The deeper you go the more you will dial into specific host and component errors and the messaging necessary to troubleshoot further.
If this answer helps, please choose ACCEPT.
Created 08-06-2018 01:54 PM
Clearly you have some major issue which has caused all services to fail to start, but the screenshot is not enough to start any troubleshooting here.
A few questions:
1. I understand that this cluster is built on AWS instances. Can you make sure all the firewall/network settings are appropriate for AWS? This is the first step.
2. Can you try to start the ZooKeeper Server and HDFS services manually via the command line from their respective nodes? While doing this, capture their logs.
3. Provide the ambari-server and ambari-agent logs.
4. Are the nodes in this cluster able to communicate with each other?
5. What are the Ambari and HDP versions?
The answers will help us get a better idea of what is happening on this cluster.
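For questions 1 and 4, a quick way to check whether the nodes can reach each other on the ports the services use is a small TCP probe. This is only a minimal sketch: the function names are made up for illustration, and the ports you pass in (for example 8020 for NameNode RPC, 8440 for Ambari agent registration, 2181 for ZooKeeper) are common defaults that may differ on your cluster.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


def check_cluster(hosts, ports, timeout=3.0):
    """Probe every host/port pair and return {(host, port): reachable}."""
    return {
        (host, port): port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Run it from each node in turn, for example `check_cluster(["node1", "node2", "node3"], [8020, 8440, 2181])` with your own hostnames. A `False` for a port only means nothing accepted the connection; it does not tell you whether the service is down or a firewall is in the way.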
Created 08-07-2018 02:54 PM
@Ravi....
Hi Ravi,
HDP 2.5 with Ambari 2.4.2.0.
I have installed three servers under VMware, named server1.tusar.com, server2.. and server3.
The firewall and SELinux are disabled.
[root@server1 ~]# hdfs datanode &
[1] 3639
[root@server1 ~]# Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file /var/log/hadoop/root/gc.log-201808070737 due to No such file or directory
18/08/07 07:37:31 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: user = hdfs
STARTUP_MSG: host = server1.tusar.com/192.168.107.137
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.5.0.0-1245
18/08/07 07:37:44 INFO ipc.Server: IPC Server Responder: starting
18/08/07 07:37:44 INFO ipc.Server: IPC Server listener on 8010: starting
18/08/07 07:37:49 INFO ipc.Client: Retrying connect to server: server2.tusar.com/192.168.107.138:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/08/07 07:38:35 WARN ipc.Client: Failed to connect to server: server2.tusar.com/192.168.107.138:8020: retries get failed due to exceeded maximum allowed retries number:
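The DataNode output above shows the IPC client on server1 repeatedly failing to reach server2.tusar.com on port 8020, which is commonly the NameNode RPC port. When the log is long, it can help to pull out which endpoints are being retried. The following is a rough sketch, assuming the message wording seen in this log; `unreachable_endpoints` is a hypothetical helper, not part of Hadoop.

```python
import re
from collections import Counter

# Matches the two ipc.Client messages that appear in the DataNode output above.
RETRY_RE = re.compile(
    r"(?:Retrying connect to server|Failed to connect to server): "
    r"(?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def unreachable_endpoints(log_text):
    """Count retry/failure messages per (host, ip, port) found in log_text."""
    counts = Counter()
    for match in RETRY_RE.finditer(log_text):
        key = (match.group("host"), match.group("ip"), int(match.group("port")))
        counts[key] += 1
    return counts
```

If every entry points at the same host and port, as it does here, the next thing to check is whether the NameNode process is actually running and listening on that host.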
[root@server1 ~]# hdfs zkfc &
[1] 3763
[root@server1 ~]# 18/08/07 07:40:52 INFO tools.DFSZKFailoverController: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DFSZKFailoverController
STARTUP_MSG: user = root
STARTUP_MSG: host = server1.tusar.com/192.168.107.137
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.5.0.0-1245
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r cb6e514b14fb60e9995e5ad9543315cd404b4e59; compiled by 'jenkins' on 2016-08-26T00:55Z
STARTUP_MSG: java = 1.8.0_77
************************************************************/
18/08/07 07:40:52 INFO tools.DFSZKFailoverController: registered UNIX signal handlers for [TERM, HUP, INT]
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:121)
at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:179)
18/08/07 07:40:52 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at server1.tusar.com/192.168.107.137
************************************************************/
[1]+ Exit 1 hdfs zkfc
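The zkfc exit above says "HA is not enabled for this namenode", which suggests the failover controller is simply not applicable when NameNode HA is not configured. One way to confirm this is to look at hdfs-site.xml for `dfs.nameservices` and the matching `dfs.ha.namenodes.<nameservice>` entry. The sketch below makes some assumptions: the helper names are invented, and treating "two or more NameNode IDs" as the HA condition is a simplification of what Hadoop checks internally.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(xml_text):
    """Parse Hadoop-style <configuration> XML text into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def ha_nameservices(props):
    """Return {nameservice: [namenode ids]} for nameservices with 2+ NameNodes."""
    result = {}
    for ns in props.get("dfs.nameservices", "").split(","):
        ns = ns.strip()
        if not ns:
            continue
        ids = [i.strip() for i in props.get(f"dfs.ha.namenodes.{ns}", "").split(",")]
        ids = [i for i in ids if i]
        if len(ids) >= 2:
            result[ns] = ids
    return result
```

Read the file contents yourself and pass the text in; on HDP the file is often at /etc/hadoop/conf/hdfs-site.xml, but that path is an assumption. An empty result would be consistent with the exception above, in which case the zkfc failure is expected and not the cause of the other services failing.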
Created 08-07-2018 02:57 PM
[root@server1 ~]# tail /var/log/ambari-server/ambari-server.log 2018-08-07 07:47:58,914 - Retrieved 'capacity-scheduler' received as dictionary : 'False'. configs : [('', ''), ('yarn.scheduler.capacity.root.accessible-node-labels', '*'), ('yarn.scheduler.capacity.root.Prodqueue.capacity', '50'), ('yarn.scheduler.capacity.maximum-am-resource-percent', '0.2'), ('yarn.scheduler.capacity.root.default.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.capacity', '20'), ('yarn.scheduler.capacity.root.Devqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.Prodqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.state', 'RUNNING'), ('yarn.scheduler.capacity.root.testqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Devqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.testqueue.capacity', '20'), ('yarn.scheduler.capacity.root.testqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.node-locality-delay', '40'), ('yarn.scheduler.capacity.root.Prodqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.queues', 'Devqueue,Prodqueue,default,testqueue'), ('yarn.scheduler.capacity.maximum-applications', '10000'), ('yarn.scheduler.capacity.root.default.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.Devqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.minimum-user-limit-percent', '100'), 
('yarn.scheduler.capacity.root.testqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Prodqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.default.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.capacity', '10'), ('yarn.scheduler.capacity.root.Devqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.queue-mappings-override.enable', 'false'), ('yarn.scheduler.capacity.resource-calculator', 'org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator')] 2018-08-07 07:47:58,915 - 'capacity-scheduler' configs is passed-in as a single '\n' separated string. count(services['configurations']['capacity-scheduler']['properties']['capacity-scheduler']) = 39 2018-08-07 07:47:58,915 - Retrieved 'capacity-scheduler' received as dictionary : 'False'. 
configs : [('', ''), ('yarn.scheduler.capacity.root.accessible-node-labels', '*'), ('yarn.scheduler.capacity.root.Prodqueue.capacity', '50'), ('yarn.scheduler.capacity.maximum-am-resource-percent', '0.2'), ('yarn.scheduler.capacity.root.default.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.capacity', '20'), ('yarn.scheduler.capacity.root.Devqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.Prodqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.state', 'RUNNING'), ('yarn.scheduler.capacity.root.testqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Devqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.testqueue.capacity', '20'), ('yarn.scheduler.capacity.root.testqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.node-locality-delay', '40'), ('yarn.scheduler.capacity.root.Prodqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.queues', 'Devqueue,Prodqueue,default,testqueue'), ('yarn.scheduler.capacity.maximum-applications', '10000'), ('yarn.scheduler.capacity.root.default.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.Devqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.testqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Prodqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_submit_applications', '*'), 
('yarn.scheduler.capacity.root.Prodqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.default.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.capacity', '10'), ('yarn.scheduler.capacity.root.Devqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.queue-mappings-override.enable', 'false'), ('yarn.scheduler.capacity.resource-calculator', 'org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator')] 2018-08-07 07:47:58,916 - 'capacity-scheduler' configs is passed-in as a single '\n' separated string. count(services['configurations']['capacity-scheduler']['properties']['capacity-scheduler']) = 39 2018-08-07 07:47:58,916 - Retrieved 'capacity-scheduler' received as dictionary : 'False'. configs : [('', ''), ('yarn.scheduler.capacity.root.accessible-node-labels', '*'), ('yarn.scheduler.capacity.root.Prodqueue.capacity', '50'), ('yarn.scheduler.capacity.maximum-am-resource-percent', '0.2'), ('yarn.scheduler.capacity.root.default.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.capacity', '20'), ('yarn.scheduler.capacity.root.Devqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.Prodqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.state', 'RUNNING'), ('yarn.scheduler.capacity.root.testqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Devqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.testqueue.capacity', '20'), ('yarn.scheduler.capacity.root.testqueue.state', 'RUNNING'), 
('yarn.scheduler.capacity.node-locality-delay', '40'), ('yarn.scheduler.capacity.root.Prodqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.queues', 'Devqueue,Prodqueue,default,testqueue'), ('yarn.scheduler.capacity.maximum-applications', '10000'), ('yarn.scheduler.capacity.root.default.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.Devqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.testqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Prodqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.default.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.capacity', '10'), ('yarn.scheduler.capacity.root.Devqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.queue-mappings-override.enable', 'false'), ('yarn.scheduler.capacity.resource-calculator', 'org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator')] 2018-08-07 07:47:58,917 - 'capacity-scheduler' configs is passed-in as a single '\n' separated string. count(services['configurations']['capacity-scheduler']['properties']['capacity-scheduler']) = 39 2018-08-07 07:47:58,917 - Retrieved 'capacity-scheduler' received as dictionary : 'False'. 
configs : [('', ''), ('yarn.scheduler.capacity.root.accessible-node-labels', '*'), ('yarn.scheduler.capacity.root.Prodqueue.capacity', '50'), ('yarn.scheduler.capacity.maximum-am-resource-percent', '0.2'), ('yarn.scheduler.capacity.root.default.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.capacity', '20'), ('yarn.scheduler.capacity.root.Devqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.Prodqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.state', 'RUNNING'), ('yarn.scheduler.capacity.root.testqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Devqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.testqueue.capacity', '20'), ('yarn.scheduler.capacity.root.testqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.node-locality-delay', '40'), ('yarn.scheduler.capacity.root.Prodqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.queues', 'Devqueue,Prodqueue,default,testqueue'), ('yarn.scheduler.capacity.maximum-applications', '10000'), ('yarn.scheduler.capacity.root.default.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.Devqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.testqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Prodqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_submit_applications', '*'), 
('yarn.scheduler.capacity.root.Prodqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.default.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.capacity', '10'), ('yarn.scheduler.capacity.root.Devqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.queue-mappings-override.enable', 'false'), ('yarn.scheduler.capacity.resource-calculator', 'org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator')] 2018-08-07 07:47:58,919 - 'capacity-scheduler' configs is passed-in as a single '\n' separated string. count(services['configurations']['capacity-scheduler']['properties']['capacity-scheduler']) = 39 2018-08-07 07:47:58,920 - Retrieved 'capacity-scheduler' received as dictionary : 'False'. configs : [('', ''), ('yarn.scheduler.capacity.root.accessible-node-labels', '*'), ('yarn.scheduler.capacity.root.Prodqueue.capacity', '50'), ('yarn.scheduler.capacity.maximum-am-resource-percent', '0.2'), ('yarn.scheduler.capacity.root.default.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.capacity', '20'), ('yarn.scheduler.capacity.root.Devqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.Prodqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.state', 'RUNNING'), ('yarn.scheduler.capacity.root.testqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Prodqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Devqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.testqueue.capacity', '20'), ('yarn.scheduler.capacity.root.testqueue.state', 'RUNNING'), 
('yarn.scheduler.capacity.node-locality-delay', '40'), ('yarn.scheduler.capacity.root.Prodqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.queues', 'Devqueue,Prodqueue,default,testqueue'), ('yarn.scheduler.capacity.maximum-applications', '10000'), ('yarn.scheduler.capacity.root.default.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.testqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.acl_administer_queue', '*'), ('yarn.scheduler.capacity.root.Devqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.root.testqueue.ordering-policy', 'fifo'), ('yarn.scheduler.capacity.root.Prodqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.testqueue.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.Prodqueue.maximum-capacity', '100'), ('yarn.scheduler.capacity.root.Devqueue.state', 'RUNNING'), ('yarn.scheduler.capacity.root.Devqueue.user-limit-factor', '1'), ('yarn.scheduler.capacity.root.default.acl_submit_applications', '*'), ('yarn.scheduler.capacity.root.default.capacity', '10'), ('yarn.scheduler.capacity.root.Devqueue.minimum-user-limit-percent', '100'), ('yarn.scheduler.capacity.queue-mappings-override.enable', 'false'), ('yarn.scheduler.capacity.resource-calculator', 'org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator')] 07 Aug 2018 07:47:58,947 INFO [ambari-client-thread-494] StackAdvisorRunner:71 - advisor script stderr: [root@server1 ~]#
Created 08-07-2018 02:58 PM
@Ravi
[root@server1 ~]# tail /var/log/ambari-agent/ambari-agent.log
WARNING 2018-08-07 10:43:20,596 NetUtil.py:116 - Server at https://server1.tusar.com:8440 is not reachable, sleeping for 10 seconds...
INFO 2018-08-07 10:43:30,597 NetUtil.py:62 - Connecting to https://server1.tusar.com:8440/ca
WARNING 2018-08-07 10:43:30,597 NetUtil.py:93 - Failed to connect to https://server1.tusar.com:8440/ca due to [Errno 111] Connection refused
WARNING 2018-08-07 10:43:30,597 NetUtil.py:116 - Server at https://server1.tusar.com:8440 is not reachable, sleeping for 10 seconds...
INFO 2018-08-07 10:43:40,598 NetUtil.py:62 - Connecting to https://server1.tusar.com:8440/ca
WARNING 2018-08-07 10:43:40,599 NetUtil.py:93 - Failed to connect to https://server1.tusar.com:8440/ca due to [Errno 111] Connection refused
WARNING 2018-08-07 10:43:40,599 NetUtil.py:116 - Server at https://server1.tusar.com:8440 is not reachable, sleeping for 10 seconds...
INFO 2018-08-07 10:43:50,600 NetUtil.py:62 - Connecting to https://server1.tusar.com:8440/ca
WARNING 2018-08-07 10:43:50,600 NetUtil.py:93 - Failed to connect to https://server1.tusar.com:8440/ca due to [Errno 111] Connection refused
WARNING 2018-08-07 10:43:50,600 NetUtil.py:116 - Server at https://server1.tusar.com:8440 is not reachable, sleeping for 10 seconds...
[root@server1 ~]#
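In the agent log above, `[Errno 111] Connection refused` on Linux means the host answered but nothing was accepting connections on port 8440 at that moment, so it is worth checking whether ambari-server was running at 10:43. To see how often and against which URL this happens across a longer log, something like the following could be used. It is a sketch that assumes the line layout shown in this paste; `refused_connections` is a hypothetical helper.

```python
import re
from collections import Counter

# Layout assumed from the paste: LEVEL YYYY-MM-DD HH:MM:SS,mmm Source.py:NN - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+) (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"(?P<src>\S+) - (?P<msg>.*)$"
)
REFUSED_RE = re.compile(r"Failed to connect to (?P<url>\S+) due to \[Errno 111\]")


def refused_connections(log_text):
    """Count 'Connection refused' failures per URL in ambari-agent log text."""
    counts = Counter()
    for line in log_text.splitlines():
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # Skip shell prompts and anything not in the expected layout.
        refused = REFUSED_RE.search(parsed.group("msg"))
        if refused:
            counts[refused.group("url")] += 1
    return counts
```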
Created 08-07-2018 02:59 PM
@Ravi
[root@server1 ~]# ssh server2
Last login: Sun Aug 5 08:51:04 2018 from server1.tusar.com
[root@server2 ~]# exit
logout
Connection to server2 closed.
[root@server1 ~]# ssh server3
Last login: Sun Aug 5 08:51:16 2018 from server1.tusar.com
[root@server3 ~]#
Created 08-07-2018 03:58 PM