Created on 01-09-2019 08:40 AM - edited 09-16-2022 07:03 AM
I cleaned up all the AMS data from my cluster, including the znodes in ZooKeeper (ams-hbase-secure), except /ambari-metrics-cluster. Every time I rmr /ambari-metrics-cluster, it gets created again, and I have no idea what is creating it.
Then I reinstalled AMS, and the Metrics Collector failed to start.
I got this error:
2019-01-09 15:04:20,257 INFO org.apache.helix.manager.zk.ZKHelixAdmin: Cluster ambari-metrics-cluster already exists
2019-01-09 15:04:40,287 WARN org.apache.helix.manager.zk.ZKHelixAdmin: Root directory exists.Cleaning the root directory:/ambari-metrics-cluster
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
    at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:786)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:497)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
    at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
    ... 13 more
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:118)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:96)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
    at org.apache.helix.manager.zk.ZkClient.delete(ZkClient.java:347)
    at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
    ... 7 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
    at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
    at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
    ... 13 more
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping phoenix metrics system...
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system stopped.
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system shutdown complete.
2019-01-09 15:04:40,306 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl: Stopping ApplicationHistory
2019-01-09 15:04:40,306 FATAL org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: Error starting ApplicationHistoryServer
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:118)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:96)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
    at org.apache.helix.manager.zk.ZkClient.delete(ZkClient.java:347)
    at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
    at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:786)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:497)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.availability.MetricCollectorHAController.initializeHAController(MetricCollectorHAController.java:156)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
    ... 7 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
    at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
    ... 13 more
2019-01-09 15:04:40,307 INFO org.apache.hadoop.util.ExitUtil: Exiting with status -1
2019-01-09 15:04:40,314 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at test-da-shanghai-03/192.168.2.187
************************************************************/
2019-01-09 15:04:40,330 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size
Created 01-09-2019 08:40 AM
Problem solved.
I ran "ps -ef | grep ambari-metri" and found many leftover processes from the previous install, all owned by the "ams" user. They were what kept recreating the znode.
After killing these processes, I ran rmr on the /ambari-metrics-cluster znode, and this time it was not created again, haha.
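The check described above can be sketched as a small shell helper. This is only an illustration: the function name `ams_pids` is made up here, and it assumes the standard `ps -ef` column layout (owner in column 1, PID in column 2).

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PIDs of
# ambari-metrics processes owned by the given user (first argument).
# The bracket in the pattern keeps awk's own command line from matching,
# the same trick as `grep [a]mbari-metri`.
ams_pids() {
    awk -v owner="$1" '$1 == owner && /ambari-metr[i]/ { print $2 }'
}

# Typical use, run as root:
#   ps -ef | ams_pids ams                    # review the list first
#   ps -ef | ams_pids ams | xargs -r kill    # -r (GNU xargs): skip kill if the list is empty
```

Only remove the znode once the listing comes back empty; otherwise a surviving process can recreate it.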
Created 01-10-2019 01:23 PM
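Please stop the collector, clean up the /ambari-metrics-cluster znode as well, and then start the collector again. Alternatively, you can disable distributed collector mode by setting timeline.metrics.service.distributed.collector.mode.disabled = true in custom ams-site, so the collector skips the HA controller initialization that fails in the log above.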
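The order matters: the collector has to be down before the znode is removed. A minimal dry-run sketch of the sequence follows; it only prints the commands. The `zkCli.sh` path, the `zk-host:2181` address, the collector config directory, and the function name `print_cleanup_plan` are assumptions about a typical HDP layout, not details from this thread. In an Ambari-managed cluster the stop and start would normally be done from the Ambari UI instead.

```shell
#!/bin/sh
# Assumed locations and addresses; adjust to your install.
ZKCLI="${ZKCLI:-/usr/hdp/current/zookeeper-client/bin/zkCli.sh}"
ZK_SERVER="${ZK_SERVER:-zk-host:2181}"    # placeholder ZooKeeper address
AMS_CONF="${AMS_CONF:-/etc/ambari-metrics-collector/conf}"

# Print the three steps in order; nothing is executed.
print_cleanup_plan() {
    echo "su - ams -c 'ambari-metrics-collector --config $AMS_CONF stop'"
    echo "$ZKCLI -server $ZK_SERVER rmr /ambari-metrics-cluster"
    echo "su - ams -c 'ambari-metrics-collector --config $AMS_CONF start'"
}

print_cleanup_plan
```

On ZooKeeper 3.5 and later, `deleteall` replaces the deprecated `rmr` command.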