Configuring Ambari Metrics Collector

We're experiencing periodic crashing of our Ambari Metrics Collector.

This only started happening after upgrading from HDP 2.2.8 to 2.3.2.

There are some warnings in the logs that look interesting: "TimelineClusterAggregatorSecond:131 - Last Checkpoint is too old, discarding last checkpoint"

Can anyone provide some guidance on how we might better tune our Metrics Collector service to improve this?

10:30:06,487 WARN [pool-7-thread-1] TimelineClusterAggregatorSecond:131 - Last Checkpoint is too old, discarding last checkpoint. lastCheckPointTime = Fri May 27 10:24:00 EDT
10:30:06,487 INFO [pool-7-thread-1] TimelineClusterAggregatorSecond:134 - Saving checkpoint time. Fri May 27 10:28:00 EDT 2016
10:30:06,487 INFO [Thread-4] ZooKeeper:684 - Session: 0x254c9555ca7f744 closed
10:30:06,487 INFO [main-EventThread] ClientCnxn:524 - EventThread shut down
10:30:06,488 INFO [pool-7-thread-1] TimelineClusterAggregatorSecond:106 - Last check point time: 1464359280000, lagBy: 126 seconds.
10:30:06,488 INFO [pool-7-thread-1] TimelineClusterAggregatorSecond:211 - Start aggregation cycle @ Fri May 27 10:30:06 EDT 2016, startTime = Fri May 27 10:28:00 EDT 2016, endTime = Fri May 27 10:30:00 EDT 2016
10:31:56,528 INFO [main] ApplicationHistoryServer:45 - STARTUP_MSG:
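
For anyone reading the excerpt, the "lagBy: 126 seconds" figure can be reproduced from the other values in the same log lines. The snippet below is only a sketch of that arithmetic, not the collector's own code: it assumes the checkpoint value is epoch milliseconds, that EDT is a fixed UTC-4 offset, and that the aggregator woke up at the logged time of 10:30:06. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the log excerpt in the question.
CHECKPOINT_MS = 1464359280000        # "Last check point time"
EDT = timezone(timedelta(hours=-4))  # assumption: EDT as a fixed UTC-4 offset
LOG_TIME = datetime(2016, 5, 27, 10, 30, 6, tzinfo=EDT)  # time of the log entry


def checkpoint_lag_seconds(checkpoint_ms: int, now: datetime) -> int:
    """Whole seconds between a checkpoint (epoch milliseconds) and `now`."""
    checkpoint = datetime.fromtimestamp(checkpoint_ms / 1000, tz=timezone.utc)
    return int((now - checkpoint).total_seconds())


# Convert the checkpoint to local wall-clock time for comparison with the log.
checkpoint_local = datetime.fromtimestamp(CHECKPOINT_MS / 1000, tz=EDT)
print(checkpoint_local.isoformat())                     # 2016-05-27T10:28:00-04:00
print(checkpoint_lag_seconds(CHECKPOINT_MS, LOG_TIME))  # 126
```

The checkpoint decodes to 10:28:00 EDT, which matches the "Saving checkpoint time" entry, and the 126-second difference matches the logged lag.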

Re: Configuring Ambari Metrics Collector

Zack, what version of Ambari is this?

Do you see a SIGTERM 15 every time before the AMS crashes and restarts?

If yes, please take a look at my recommendation here -
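
One way to answer the SIGTERM question without reading the whole log by hand is a small scan like the one below. It is a rough sketch under stated assumptions: it assumes a termination signal shows up in the collector log as a line containing the text "SIGTERM", and that each restart begins with a "STARTUP_MSG" line as in the excerpt above. The function name and the sample lines are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: the signal is logged with the text "SIGTERM", and every
# restart begins with a "STARTUP_MSG" line (as seen in the posted excerpt).
SIGTERM_RE = re.compile(r"SIGTERM")
STARTUP_RE = re.compile(r"STARTUP_MSG")


def restarts_with_sigterm(lines: Iterable[str]) -> List[Tuple[int, bool]]:
    """Return (line_number, sigterm_seen) for each STARTUP_MSG line.

    `sigterm_seen` is True when a SIGTERM line appeared since the previous
    startup (or since the beginning of the input).
    """
    results = []
    sigterm_seen = False
    for number, line in enumerate(lines, start=1):
        if SIGTERM_RE.search(line):
            sigterm_seen = True
        elif STARTUP_RE.search(line):
            results.append((number, sigterm_seen))
            sigterm_seen = False  # start tracking the next run
    return results


# Hypothetical sample lines, made up for illustration only.
sample = [
    "10:30:06,488 INFO [pool-7-thread-1] aggregation cycle started",
    "10:31:50,000 ERROR [hypothetical] RECEIVED SIGNAL 15: SIGTERM",
    "10:31:56,528 INFO [main] ApplicationHistoryServer:45 - STARTUP_MSG:",
    "10:40:00,000 INFO [pool-7-thread-1] aggregation cycle started",
    "10:45:00,000 INFO [main] ApplicationHistoryServer:45 - STARTUP_MSG:",
]
print(restarts_with_sigterm(sample))  # [(3, True), (5, False)]
```

To run it against a real collector log, pass an open file object instead of `sample`; the log location depends on the installation. A restart reported with `False` would suggest the process died without receiving a SIGTERM.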

Re: Configuring Ambari Metrics Collector

@Zack Riesland After the HDP upgrade, I suspect the Hadoop sinks aren't correctly linked. Please try reinstalling or upgrading the AMS packages. The upgrade steps are available in the Ambari Upgrade guide.