Metrics System unable to initialize HA controller

Explorer

I cleaned up all AMS data from my cluster, including the ZooKeeper znode (ams-hbase-secure), but not /ambari-metrics-cluster. Whenever I rmr /ambari-metrics-cluster, it gets created again, and I have no idea what is creating it.

Then I reinstalled AMS, and the Metrics Collector failed to start.

I got this error:

2019-01-09 15:04:20,257 INFO org.apache.helix.manager.zk.ZKHelixAdmin: Cluster ambari-metrics-cluster already exists
2019-01-09 15:04:40,287 WARN org.apache.helix.manager.zk.ZKHelixAdmin: Root directory exists.Cleaning the root directory:/ambari-metrics-cluster
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
        at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
        at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
        at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:786)
        at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:497)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
        at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
        at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
        ... 13 more
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:118)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:96)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
        at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
        at org.apache.helix.manager.zk.ZkClient.delete(ZkClient.java:347)
        at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
        ... 7 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
        at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
        at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
        ... 13 more
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping phoenix metrics system...
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system stopped.
2019-01-09 15:04:40,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system shutdown complete.
2019-01-09 15:04:40,306 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl: Stopping ApplicationHistory
2019-01-09 15:04:40,306 FATAL org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: Error starting ApplicationHistoryServer
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:118)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:96)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
        at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
        at org.apache.helix.manager.zk.ZkClient.delete(ZkClient.java:347)
        at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
        at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:786)
        at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:497)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.availability.MetricCollectorHAController.initializeHAController(MetricCollectorHAController.java:156)
        at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
        ... 7 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster/CONTROLLER
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
        at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
        at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
        ... 13 more
2019-01-09 15:04:40,307 INFO org.apache.hadoop.util.ExitUtil: Exiting with status -1
2019-01-09 15:04:40,314 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at test-da-shanghai-03/192.168.2.187
************************************************************/
2019-01-09 15:04:40,330 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size
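
The trace above is long, but the innermost "Caused by" lines carry the actual failure: ZooKeeper refused to delete /ambari-metrics-cluster/CONTROLLER because that znode still had children. Below is a minimal Python sketch for pulling that information out of a collector log. The function name blocking_znodes and the regular expression are illustrative assumptions based on this log excerpt, not part of AMS or Helix.

```python
import re

# Matches the tail of lines such as:
#   Caused by: ...KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /some/znode
# The pattern is an assumption derived from the log excerpt in this thread.
NOT_EMPTY = re.compile(r"NotEmptyException: .*? for (/\S+)")


def blocking_znodes(log_text):
    """Return the distinct znode paths that ZooKeeper reported as not empty."""
    paths = []
    for line in log_text.splitlines():
        if not line.startswith("Caused by:"):
            continue  # only the "Caused by" lines name the failing znode
        match = NOT_EMPTY.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths
```

Feeding it the contents of the collector log should return the znode paths that could not be deleted, which tells you where to look for a process that is still writing to ZooKeeper.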
2 REPLIES

Explorer

Problem solved.
I ran "ps -ef | grep ambari-metri" and found many leftover processes from earlier runs, all owned by the user "ams".

After killing these processes, I ran rmr on the /ambari-metrics-cluster znode, and this time it was not created again, haha.
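
The check described above can be sketched in Python. This is only an illustration of the same "ps -ef | grep ambari-metri" filter: the function names are hypothetical, and it assumes the usual eight-column "ps -ef" layout (UID PID PPID C STIME TTY TIME CMD).

```python
import subprocess


def find_processes(ps_output, owner, pattern):
    """Return (pid, command) pairs from `ps -ef` output that are owned by
    `owner` and whose command line contains `pattern`."""
    matches = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        uid, pid, cmd = fields[0], int(fields[1]), fields[7]
        if uid == owner and pattern in cmd:
            matches.append((pid, cmd))
    return matches


def list_ams_processes():
    """Run `ps -ef` and keep the AMS-related processes owned by user "ams"."""
    output = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    return find_processes(output, owner="ams", pattern="ambari-metri")
```

Calling list_ams_processes() on the collector host returns the candidate PIDs. Review the list before killing anything, since it will also include a collector that is legitimately running.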

Expert Contributor

Please stop the collector, clean up the /ambari-metrics-cluster znode as well, and then start the collector again. Alternatively, you can disable distributed collector mode by setting timeline.metrics.service.distributed.collector.mode.disabled = true in custom ams-site.
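
If you would rather script the znode cleanup than type rmr in the ZooKeeper CLI, here is a minimal sketch. It assumes the third-party kazoo client library is installed, and the helper names and the "zk-host:2181" address are placeholders, not AMS settings. Run it only after the collector and any leftover AMS processes are stopped, otherwise the znode may be recreated.

```python
def remove_znode(client, path):
    """Recursively delete `path` if it exists. Returns True if a delete was issued.

    `client` is expected to behave like a started kazoo.client.KazooClient
    (an assumption: kazoo is a third-party library, not part of AMS).
    """
    if client.exists(path) is None:
        return False  # nothing to clean up
    client.delete(path, recursive=True)
    return True


def cleanup(hosts, path="/ambari-metrics-cluster"):
    """Connect to ZooKeeper at `hosts` (for example "zk-host:2181", a
    placeholder) and remove the Helix cluster znode."""
    # Imported lazily so the helper above can be used without kazoo installed.
    from kazoo.client import KazooClient

    client = KazooClient(hosts=hosts)
    client.start()
    try:
        return remove_znode(client, path)
    finally:
        client.stop()
```

For example, cleanup("zk-host:2181") returns True if the znode existed and a recursive delete was sent, and False if there was nothing to remove.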