Created 03-15-2016 02:07 PM
Presently, we are running Ambari Server on an EC2 instance and using an RDS instance (MySQL) as the database. The Ambari server's disk is now 80% full. What is the best way to take a backup while keeping Ambari Server up and running? Can anyone please tell me the procedure?
Created 03-15-2016 02:41 PM
Clone the VM if Ambari is running in a VM.
If it's on bare metal, back up the database and follow this: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_reference_guide/content/_back_up_c...
AMS stores its data in an embedded HBase instance. See https://cwiki.apache.org/confluence/display/AMBARI/AMS+-+distributed+mode
You also need to back up AMS.
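For the embedded-mode AMS backup, one approach is a cold copy of the collector's HBase data directory. Here is a minimal sketch; the path below is a common default for hbase.rootdir in embedded mode, but check ams-hbase-site in your deployment, and stop the collector before copying:

```python
# Sketch: cold backup of the embedded AMS HBase data directory.
# Stop the collector first (ambari-metrics-collector stop).
# AMS_DATA_DIR is an assumed default -- verify hbase.rootdir in ams-hbase-site.
import tarfile
from datetime import date

AMS_DATA_DIR = "/var/lib/ambari-metrics-collector/hbase"
ARCHIVE = f"/backup/ams-hbase-{date.today():%Y%m%d}.tar.gz"

def backup_ams(data_dir=AMS_DATA_DIR, out=ARCHIVE):
    """Tar up the AMS data directory and return the archive path."""
    with tarfile.open(out, "w:gz") as tar:
        tar.add(data_dir, arcname="ams-hbase")
    return out
```

Restart the collector once the archive is written; metrics arriving during the stop window are lost, which is usually acceptable for monitoring data.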
Created 03-15-2016 02:21 PM
You can take a dump of the MySQL database to a text file using the mysqldump command. Further information is on the documentation site at https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_upgrading_Ambari/content/_perform_backup...
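A minimal sketch of the mysqldump invocation, wrapped in Python so it can be scripted. The host, user, and database names are placeholders; use the server.jdbc.* values from /etc/ambari-server/conf/ambari.properties on your Ambari host:

```python
import subprocess

def build_mysqldump_cmd(host, user, database, out_file):
    """Build a mysqldump command for the Ambari database.

    --single-transaction gives a consistent dump of InnoDB tables
    without locking, so Ambari can stay up during the backup.
    """
    return ["mysqldump",
            "-h", host,
            "-u", user,
            "-p",                    # prompt for the password
            "--single-transaction",
            "--result-file", out_file,
            database]

# Placeholder values -- replace with your RDS endpoint and credentials.
cmd = build_mysqldump_cmd("ambari-rds.example.com", "ambari", "ambari",
                          "/backup/ambari-db.sql")
# To actually run it: subprocess.run(cmd, check=True)
print(" ".join(cmd))
```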
Created 03-15-2016 02:36 PM
Thank you. I am asking about backing up the Ambari server itself: it holds the last 2 months of metrics data, which has consumed most of the disk space. Please let me know how to back up the metrics data while keeping the Ambari server up and running. Is there any downtime involved in the backup?
Created 03-15-2016 04:35 PM
You can use the Ambari Metrics API or go directly to the Phoenix tables; Ambari Metrics leverages Phoenix for its storage.
https://cwiki.apache.org/confluence/display/AMBARI/Phoenix+Schema
Since the data is in Phoenix, you can use MapReduce, Spark, Pig, or JDBC to grab it, or use the REST API mentioned above:
https://cwiki.apache.org/confluence/display/AMBARI/Ambari+Metrics+API+specification
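A sketch of querying the metrics REST endpoint, assuming the default collector port 6188 and the /ws/v1/timeline/metrics path described in the API specification linked above; the host, metric, and timestamp values are placeholders:

```python
from urllib.parse import urlencode

# AMS collector endpoint -- 6188 is the usual collector port;
# adjust host and port for your deployment.
COLLECTOR = "http://ams-collector.example.com:6188"

def metrics_url(metric_names, hostname, start_ms, end_ms):
    """Build a GET query against the AMS timeline metrics endpoint."""
    params = urlencode({
        "metricNames": ",".join(metric_names),
        "hostname": hostname,
        "startTime": start_ms,   # epoch milliseconds
        "endTime": end_ms,
    })
    return f"{COLLECTOR}/ws/v1/timeline/metrics?{params}"

url = metrics_url(["cpu_user", "mem_free"], "node1.example.com",
                  1458000000000, 1458086400000)
# Fetch with e.g.: urllib.request.urlopen(url).read()
print(url)
```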
Created 03-15-2016 05:53 PM
What is the total space you have allocated for Ambari Metrics? It is expected to need about 100 GB, depending on the workload. How many nodes are in the cluster, and what components run on this system? Do you need all of that data?
You could also review the TTL parameters in AMS to reduce the amount of data retained over time.
If you have hundreds of nodes, you could also look at using HDFS for storage (i.e., move from embedded to distributed mode). This way you could move the content into HDFS.
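The TTL values in ams-site are expressed in seconds, one per aggregation level, so trimming retention is a matter of converting the number of days you want to keep. A small sketch, using host-aggregator property names as documented for Ambari 2.1.x (verify against your version; the day counts here are illustrative, not the shipped defaults):

```python
# Convert a desired retention period into the ams-site TTL values (seconds).
# Property names assume Ambari 2.1.x AMS -- verify for your version.
DAY = 86400

desired = {
    "timeline.metrics.host.aggregator.ttl": 1 * DAY,         # precision data
    "timeline.metrics.host.aggregator.minute.ttl": 7 * DAY,  # minute rollups
    "timeline.metrics.host.aggregator.hourly.ttl": 30 * DAY, # hourly rollups
}

for prop, ttl in desired.items():
    print(f"{prop}={ttl}")
```

After changing the TTLs in Ambari, restart the Metrics Collector; shorter TTLs only take effect as old data expires.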
Created 03-16-2016 02:07 PM
We have a small cluster with 30 nodes, of which 18 are data nodes. We assigned 15 GB for ambari-server. We deploy our packages through Slider in the Hadoop cluster as long-running jobs, and we are not using any other ecosystem components. Thank you. For how many days do we need to store the AMS data? Is there any specific calculation for that?
Created 03-16-2016 10:38 PM
It depends on your cluster planning and how much historical data you need.