Created 12-10-2015 09:55 PM
What’s the approximate load on the DB if we co-locate the Ambari, Oozie, Hive, Ranger admin and audit databases, etc. on the same DB cluster?
- Cluster Size ~ <100 nodes, 100-500 nodes, 500-1000 nodes and 1000+
- Number of Users for Hive and Oozie ~100+
Created 12-11-2015 07:51 PM
We had some performance issues with a low-profile config (4 vCores, 8 GB RAM), especially with Oozie. Right now we recommend at least 4 vCores and 24 GB of RAM. If you're planning an IaaS deployment on Azure with SQL Azure as the metadata repository, start directly with an S2/S3 instance: if you use Oozie, that's the minimum requirement.
Also pay attention to Ranger: a shared DB is fine for Ranger admin and users, but for audits you need to watch the DB size carefully, because it can easily grow fast. Use a script to truncate the table, or use a different instance.
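For the audit cleanup, a retention-based purge can be gentler than a full truncate. Below is a minimal sketch, assuming a Python DB-API connection; the function name `purge_old_audits`, the table `audit_log` and the column `event_time` are hypothetical placeholders, so check the real table and column names in your Ranger audit schema first. It uses the `?` parameter style of `sqlite3`; other drivers (MySQL, PostgreSQL) typically expect `%s` instead.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_audits(conn, table, time_column, keep_days, now=None):
    """Delete rows older than keep_days and return how many were removed.

    table and time_column are hypothetical names supplied by the caller;
    they are checked as plain identifiers because they cannot be bound
    as query parameters.
    """
    if not (table.isidentifier() and time_column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    now = now or datetime.now()
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.cursor()
    # '?' is the sqlite3 placeholder; adjust for your DB driver.
    cur.execute(f"DELETE FROM {table} WHERE {time_column} < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Run something like this from cron with a retention window that matches your audit policy, and test it against a copy of the table before pointing it at production.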
Created 12-10-2015 09:59 PM
Is the host only going to contain DBs, or will it also contain Ambari Server, HiveServer, etc?
Spreading it out is wise, but if you're really constrained, the Ambari DB is usually no more than 100 MB. There's an article on how to optimize the Ambari DB for large clusters (200+ nodes) http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_reference_guide/content/ch_tuning_...
Created 12-10-2015 10:22 PM
@Alejandro Fernandez There is no guidance on sizing the database server, i.e. the CPUs and RAM required for small, medium and large clusters.
Created 12-10-2015 10:00 PM
@Pardeep It will be very difficult to forecast the load. Assumption: if all connections are active, then the load can be low to medium.
Created 12-10-2015 10:07 PM
The base DBs are usually small. The Ambari Server and HiveServer components should just be DB schemas. A golden rule is to always have failover configured, tested and documented.
Don't forget that none of these DBs is OLTP, so latency shouldn't be an issue; a cheap rackable server with 24 to 36 GB RAM and 8 CPUs should be fine.
Created 12-10-2015 10:20 PM
I would be interested in planning the hardware, i.e. CPU and RAM, for the database server(s). The disk requirement won't be high and is not much of a concern.
Created 12-12-2015 12:58 PM
@Andrea D'Orio Re: Ranger DB
The recommended approach is to go with Solr instead of a DB "for all the new deployments".
Created 12-12-2015 08:39 AM
Just configure Nagios and Ganglia; those two monitoring tools should give you some good metrics.