Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

AMBARI METRICS restart randomly


Hi,

I built a new standalone HDP 2.3 server. No application is using it yet, but the Ambari Metrics service shuts down and restarts for no apparent reason. It happens two to four times a week.

Here is the output of my ambari-metrics-collector.log:

2016-03-14 01:29:07,166 ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: RECEIVED SIGNAL 15: SIGTERM
2016-03-14 01:29:07,169 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2016-03-14 01:29:07,182 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:6188
2016-03-14 01:29:07,207 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1536be5998a0001
2016-03-14 01:29:07,208 INFO org.apache.zookeeper.ZooKeeper: Session: 0x1536be5998a0001 closed
2016-03-14 01:29:07,208 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-03-14 01:29:07,286 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping phoenix metrics system...
2016-03-14 01:29:07,289 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system stopped.
2016-03-14 01:29:07,289 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system shutdown complete.
2016-03-14 01:29:07,290 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl: Stopping ApplicationHistory
2016-03-14 01:29:07,290 INFO org.apache.hadoop.ipc.Server: Stopping server on 60200
2016-03-14 01:29:07,294 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-03-14 01:29:07,294 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 60200
2016-03-14 01:29:07,295 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: SHUTDOWN_MSG:

Do you know the reason for these restarts?

Thanks,

Gauthier
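The first line of the log excerpt shows the collector receiving SIGTERM, which suggests that something outside the process asked it to stop rather than the JVM crashing on its own. To see how often this happens, a minimal Python sketch like the one below could pull the timestamp of every such event out of the collector log. The function name `find_sigterm_events` is hypothetical, and the pattern assumes the exact line layout shown in the excerpt above.

```python
import re
from datetime import datetime

# Matches the collector's "RECEIVED SIGNAL 15: SIGTERM" entries and captures
# the timestamp (without milliseconds). Assumes the log layout from the post.
SIGTERM_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ERROR \S+: "
    r"RECEIVED SIGNAL 15: SIGTERM"
)


def find_sigterm_events(log_text):
    """Return a list of datetimes, one per SIGTERM entry found in log_text."""
    return [
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        for match in SIGTERM_RE.finditer(log_text)
    ]


def sigterm_events_in_file(path):
    """Read a log file (path supplied by the caller) and list its SIGTERM times."""
    with open(path, errors="replace") as handle:
        return find_sigterm_events(handle.read())
```

Calling `sigterm_events_in_file` with the path to ambari-metrics-collector.log gives a list of shutdown times that can be compared against cron jobs, agent activity, or other scheduled events on the host.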


13 REPLIES

Master Mentor

@GAUTHIER CHRETIEN

There is a patch available; check this JIRA.

avatar

Thanks for your reply.

But I am running ambari-metrics version 2.2:

rpm -qa | grep ambari-metrics

ambari-metrics-monitor-2.2.0.0-1310.x86_64

ambari-metrics-collector-2.2.0.0-1310.x86_64

ambari-metrics-hadoop-sink-2.2.0.0-1310.x86_64

Gauthier

Super Collaborator

It could be a memory issue.

Can you check /var/log/messages for signs of the kernel OOM killer?

Also, please give details of your environment (OS, RAM, number of nodes, etc.).
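The check suggested above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the three patterns are the phrases the Linux kernel commonly writes when the OOM killer runs; exact wording can vary between kernel versions.

```python
import re

# Phrases commonly written to the system log when the kernel OOM killer runs.
# Wording differs between kernel versions, so treat this list as a start.
OOM_PATTERNS = re.compile(
    r"invoked oom-killer|Out of memory: Kill process|Killed process \d+"
)


def find_oom_lines(lines):
    """Return the lines (without trailing newline) that mention the OOM killer."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERNS.search(line)]


def scan_file(path="/var/log/messages"):
    """Scan a syslog file; reading /var/log/messages usually requires root."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

An empty result from `scan_file()` means none of these phrases appear in the file that was scanned; rotated logs such as older /var/log/messages copies would need to be checked separately.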

avatar

I have no memory issue (no OOM killer entries).

I run a standalone CentOS 6 server with 125 GB of RAM.

My AMS parameter values:

metrics_collector_heapsize: 1024MB

hbase_master_heapsize: 2048MB

hbase_regionserver_heapsize: 2048MB

Super Collaborator (accepted solution)

This could be caused by https://issues.apache.org/jira/browse/AMBARI-15492, which has been fixed in the next Ambari release (2.2.2).

As a workaround, please try commenting out these two properties in /etc/ambari-server/conf/ambari.properties:

#recovery.enabled_components=METRICS_COLLECTOR

#recovery.type=AUTO_START

Restart Ambari Server.
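For anyone who prefers to script the edit, the workaround above could be sketched in Python as below. This is a hedged illustration, not an official Ambari tool: `comment_out_properties` is a hypothetical helper, it only handles simple `key=value` lines, and the file should be backed up before being rewritten.

```python
def comment_out_properties(text, keys):
    """Return text with every active `key=value` line for `keys` prefixed by '#'.

    Lines that are already commented out are left untouched, so running the
    function twice gives the same result as running it once.
    """
    result = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip()
        key = stripped.split("=", 1)[0].strip()
        is_active = "=" in stripped and not stripped.startswith("#")
        if is_active and key in keys:
            result.append("#" + line)
        else:
            result.append(line)
    return "".join(result)


# The two properties named in the workaround above.
RECOVERY_KEYS = {"recovery.enabled_components", "recovery.type"}
```

To apply it, read /etc/ambari-server/conf/ambari.properties, pass its contents and `RECOVERY_KEYS` to the function, write the result back, and then restart Ambari Server as described above.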

avatar

It does not work if I comment out those lines.

My Ambari server still stops, and now it does not start.

avatar

OK, I restarted ambari-agent too and it seems good now. No alerts for 2 days. I'll keep in touch ;).

Thanks

Contributor

Hi Gauthier,

Do you have any news about this issue? Is it working properly now?

Thanks

avatar

Hi Vincent,

It works 😉 We have no more alerts from our server.