Ambari Metrics Collector Start Failed on 3 Node AWS Cluster

Expert Contributor

I am on Ambari version 2.2.0.0 and did a fresh install of Hadoop.

Whenever I start my Hadoop services, I get the following warnings and my Metrics Collector shuts down (all other Hadoop services are fine):

1602-bildschirmfoto-2016-01-27-um-193216.png

All the logs from the ambari-metrics-collector directory are in the attached zip.

Any help is appreciated!

ambari-metrics-collector.zip

br,

Rainer

1 ACCEPTED SOLUTION


It looks like your Metrics Collector cannot start because the HBase Master is not coming up. Just to make sure: you're running a non-Kerberized environment with a single NameNode, right? Are you using Metrics in distributed or embedded mode?

Could you please check and post the value of this configuration property: hbase.rootdir

Your HBase Master log files show a "connection refused" error when the HBase Master tries to connect to the NameNode:

2016-01-27 18:36:57,189 FATAL [hdp1n3:61300.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call From hdp1n3/XXXXXXXXX to hdp1n3.aye1vpcdev:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
	at org.apache.hadoop.ipc.Client.call(Client.java:1431)
	...
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	...

Is "hdp1n3.aye1vpcdev:8020" the hostname and port of your NameNode?

Can you access HDFS from the Metrics Collector node?
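
A quick way to test the last question is to try opening a TCP connection from the Metrics Collector node to the address in the stack trace. This is only a minimal sketch: the helper name can_connect is made up for illustration, and a successful connection shows that something is listening on that port, not that it is actually the NameNode.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and hostname resolution failures.
        return False
```

Run on the Metrics Collector node, can_connect("hdp1n3.aye1vpcdev", 8020) returning False would be consistent with the "Connection refused" in the log, which suggests that no NameNode is listening at that address.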


7 REPLIES


I haven't looked at your log files yet, but see the following to solve common issues.

Increase the heap size in ams-env: metrics_collector_heapsize = 1024
Set timeline.metrics.service.default.result.limit = 15840
Restart the Collector

https://cwiki.apache.org/confluence/display/AMBARI/Configurations+-+Tuning

https://community.hortonworks.com/articles/11805/how-to-solve-ambari-metrics-corrupted-data.html

https://community.hortonworks.com/questions/8928/ambari-metrics-1.html

Master Mentor

@Ancil McBarnett This has worked for me in the past.

Expert Contributor

Thanks Ancil, I tried this already and unfortunately it didn't work out for me ...

Expert Contributor

Hi Jonas, thanks, this was exactly the issue. I switched from embedded to distributed mode, and after doing that, the default value for hbase.rootdir was pointing to the wrong node. Now everything is working as expected! Thanks again for your feedback, highly appreciated!
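
For anyone hitting the same thing, the mismatch can be spotted by comparing the host and port in hbase.rootdir with those in fs.defaultFS. The following is a rough sketch, not an Ambari tool: the function name is invented, the hostname hdp1n1.example in the usage note is a placeholder rather than a host from this cluster, and it assumes both values are written with an explicit port.

```python
from urllib.parse import urlparse


def rootdir_matches_namenode(hbase_rootdir: str, fs_default_fs: str) -> bool:
    """Return True if hbase.rootdir is an hdfs:// URI on the same host:port as fs.defaultFS."""
    root = urlparse(hbase_rootdir)
    fs = urlparse(fs_default_fs)
    # A file:// rootdir (local filesystem) never matches an HDFS NameNode address.
    if root.scheme != "hdfs":
        return False
    # Assumption: both URIs carry an explicit port; a missing port parses as None.
    return (root.hostname, root.port) == (fs.hostname, fs.port)
```

With a placeholder NameNode address, rootdir_matches_namenode("hdfs://hdp1n3.aye1vpcdev:8020/amshbase", "hdfs://hdp1n1.example:8020") returns False, because the root directory points at a different host than the one HDFS is configured to use.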


Glad it worked! Happy Hadooping 🙂

Master Mentor

@Jonas Straub Happy Hadooping!! 🙂