Created 01-25-2016 06:47 PM
This link has some great write-ups on Oozie production recommendations.
It does say not to use the same MySQL database for Oozie that Hive Server uses.
What I want to know is: can I use an existing database in my organization, perhaps on a node outside the cluster, for Oozie, Ranger, Hive, etc., or should all the databases be located in the cluster?
What are the cons of using an existing database on a separate node, network, etc. outside the cluster for these components?
Created 01-25-2016 06:53 PM
Yes, we can use a database on another node outside of the cluster. We need to make sure that the network connection to it is stable and redundant.
I am a big fan of database consolidation, as it makes the databases easier to maintain and back up.
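To put a rough number on "stable", here is a minimal sketch that tries a few TCP connections from a cluster node to the database and reports how many succeeded. The function name `db_reachable`, the host name, and the port in the commented usage line are assumptions for illustration, not anything from a specific product. It only checks that the port accepts connections, not that the database itself is healthy.

```python
import socket
import time


def db_reachable(host, port, attempts=3, timeout=2.0, delay=0.0):
    """Return the fraction of TCP connection attempts that succeeded."""
    ok = 0
    for i in range(attempts):
        try:
            # create_connection raises OSError on refusal, timeout or DNS failure
            with socket.create_connection((host, port), timeout=timeout):
                ok += 1
        except OSError:
            pass
        # optional pause between attempts, skipped after the last one
        if delay and i < attempts - 1:
            time.sleep(delay)
    return ok / attempts


# Hypothetical usage from a cluster node (host and port are placeholders):
# rate = db_reachable("db.example.internal", 5432, attempts=10, delay=1.0)
# print(f"{rate:.0%} of connection attempts succeeded")
```

Running something like this periodically from each node that hosts Oozie, Hive or Ranger would give an early hint of flaky links before the services start failing.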
Created 01-25-2016 06:54 PM
What if the database is in a VM image?
Created 01-26-2016 01:03 PM
Just be wary of potential load issues. We reached the connection limit of our consolidated PostgreSQL database because all services were pointing to the same DB. This essentially stopped Oozie and Hive at random. The biggest culprit seems to have been Ranger: if auditing to the DB is switched on, it puts quite a load on the database.
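A quick way to sanity-check this before consolidating is to add up the largest connection pool each service could open and compare the total with the database's limit. The sketch below does that arithmetic. The pool sizes are made-up placeholders, not measurements from our cluster, and `connection_headroom` is a hypothetical helper name. The defaults of 100 for `max_connections` and 3 reserved superuser connections match stock PostgreSQL, but check your own settings.

```python
def connection_headroom(max_connections, pools, reserved=3):
    """Return connections left over if every service fills its pool.

    A negative result means the services can exhaust the database.
    """
    usable = max_connections - reserved
    return usable - sum(pools.values())


# Hypothetical maximum pool sizes per service (assumptions, not measured values)
pools = {
    "hive_metastore": 40,
    "oozie": 30,
    "ranger_admin": 40,
    "ranger_audit": 20,
}

headroom = connection_headroom(100, pools)
if headroom < 0:
    print(f"over budget by {-headroom} connections")
else:
    print(f"{headroom} connections of headroom")
```

If the result is negative, either raise the limit on the database side, shrink the pools, or move the heaviest writer (in our case Ranger auditing) somewhere else.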
Created 01-26-2016 03:26 PM
A VM works as long as it's tuned and well configured. A VM is actually preferable, as we can increase its resources as we grow.
Ranger auditing will be in Solr, so no concerns on @Benjamin Leonhardi's point (very good info).