Database Sizing and recommendation for Ambari and HDP components
Labels: Apache Ambari, Apache Hive, Apache Oozie
Created 12-10-2015 09:55 PM
What is the approximate load on the DB if we colocate the Ambari, Oozie, Hive, and Ranger (admin and audit) databases on the same DB cluster?
- Cluster size: <100 nodes, 100-500 nodes, 500-1000 nodes, and 1000+ nodes
- Number of users for Hive and Oozie: ~100+
Created 12-11-2015 07:51 PM
We had some performance issues with a low-profile config (4 vCores, 8 GB RAM), especially with Oozie. Right now we recommend at least 4 vCores and 24 GB of RAM. If you're planning an IaaS deployment on Azure using SQL Azure as the metadata repository, start directly with an S2/S3 instance: if you use Oozie, that's the minimum requirement.
Also pay attention to Ranger: a shared DB is fine for Ranger admin and users, but for audits you need to watch the DB size carefully, because it can easily grow very quickly. Use a script to truncate the table, or use a different instance.
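A minimal sketch of such a cleanup script, written as a retention-based delete rather than a full truncate so that recent audits survive. The table name `xa_access_audit` and the column `event_time` are assumptions about the Ranger audit schema, and the in-memory SQLite database is used only so the example runs on its own. Check the names against your own database and swap in your real DB driver before using anything like this.

```python
import sqlite3
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def purge_old_audits(conn, retention_days, now=None):
    """Delete audit rows older than the retention window; return the row count."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=retention_days)).strftime(TIME_FORMAT)
    # Assumed table and column names; verify them against your Ranger schema.
    cur = conn.execute(
        "DELETE FROM xa_access_audit WHERE event_time < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a throwaway table with four audit rows and purge with 30-day retention."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xa_access_audit (id INTEGER PRIMARY KEY, event_time TEXT)"
    )
    now = datetime(2015, 12, 11, 12, 0, 0)
    # Rows aged 1, 10, 45 and 120 days relative to `now`.
    rows = [((now - timedelta(days=d)).strftime(TIME_FORMAT),) for d in (1, 10, 45, 120)]
    conn.executemany("INSERT INTO xa_access_audit (event_time) VALUES (?)", rows)
    deleted = purge_old_audits(conn, retention_days=30, now=now)
    remaining = conn.execute("SELECT COUNT(*) FROM xa_access_audit").fetchone()[0]
    conn.close()
    return deleted, remaining
```

Run from a scheduler (cron, for example), something like this keeps the audit table bounded. On a very large table you would probably want to delete in batches to avoid long locks.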
Created 12-10-2015 09:59 PM
Is the host only going to contain DBs, or will it also contain Ambari Server, HiveServer, etc?
Spreading it out is wise, but if you're really constrained, the Ambari DB is usually no more than 100 MB. There's an article on how to optimize the Ambari DB for large clusters (200+ nodes) http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_reference_guide/content/ch_tuning_...
Created 12-10-2015 10:22 PM
@Alejandro Fernandez There is no guidance on sizing the database server, i.e. the CPUs and RAM required for the database server for small, medium, and large clusters.
Created 12-10-2015 10:00 PM
@Pardeep It will be very difficult to forecast the load. Assumption: if all connections are active, then the load can be low to medium.
Created 12-10-2015 10:07 PM
The base DBs are usually small in size. The Ambari Server and HiveServer databases should just be schemas on the same DB. A golden rule is to always have failover configured, tested, and documented.
Don't forget that none of these DBs is OLTP, so latency shouldn't be an issue. A cheap rack-mountable server with 24 to 36 GB RAM and 8 CPUs should be fine.
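As a rough illustration, the figures quoted in this thread can be encoded as a quick sanity check for a candidate DB host. This is only a sketch: the thresholds come from posts in this thread (at least 4 vCores and 24 GB RAM; 8 CPUs with 24 to 36 GB RAM), not from official sizing guidance, and the names `HostSpec`, `classify`, and the three labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    vcores: int
    ram_gb: int


# Thresholds taken from replies in this thread, not from official documentation.
MINIMUM = HostSpec(vcores=4, ram_gb=24)
COMFORTABLE = HostSpec(vcores=8, ram_gb=24)


def meets(host, target):
    """True if the host has at least the target's vCores and RAM."""
    return host.vcores >= target.vcores and host.ram_gb >= target.ram_gb


def classify(host):
    """Return a hypothetical label for a candidate metadata DB host."""
    if meets(host, COMFORTABLE):
        return "comfortable"
    if meets(host, MINIMUM):
        return "minimum"
    return "undersized"
```

For example, `classify(HostSpec(vcores=4, ram_gb=8))` flags the low-profile config that was reported to struggle with Oozie as undersized.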
Created 12-10-2015 10:20 PM
I am interested in planning the hardware, i.e. CPU and RAM, for the database server(s). The disk requirement won't be high and is not much of a concern.
Created 12-12-2015 12:58 PM
@Andrea D'Orio Re: Ranger DB
The recommended approach is to go with Solr instead of a DB "for all the new deployments".
Created 12-12-2015 08:39 AM
Just configure Nagios and Ganglia. Those two monitoring tools should give you some good metrics.
