We run a cluster with two NameNodes in HA mode and several DataNodes, plus HBase with two HBase Masters and several RegionServers installed on the DataNodes.
On a separate master host I have the Ambari Metrics Collector running in distributed mode.
timeline.metrics.service.operation.mode = distributed
hbase.cluster.distributed = true
hbase.rootdir = hdfs://cluster/apps/ams/metrics
I only have the "HBase Client" and "HDFS Client" components installed on that node.
Ambari gives me a warning that I should install the DataNode component on that node:
"It's recommended to install Datanode component on host.domain.tld to speed up IO operations between HDFS and Metrics Collector in distributed mode"
On https://cwiki.apache.org/confluence/display/AMBARI/AMS+-+distributed+mode I found the following statement:
"Note: Make sure there is a local Datanode hosted with the Collector, it provides AMS HBase the distinct advantage of write and reads sharded across the data volumes available to the DN."
What does that mean? Should I install a DataNode without any data disks configured on the node running the Ambari Metrics Collector, or should I move the Ambari Metrics Collector to an existing DataNode?
Is there a best practice for how to distribute the services across the cluster hosts?