What are the best practices for placing databases for Oozie, Ranger, and Hive in production?

This link has some great write-ups on Oozie production recommendations.

It does say not to use the same MySQL database for Oozie that Hive Server uses.

What I want to know is: can I use an existing database I have in my organization, perhaps on another node outside of the cluster, for Oozie, Ranger, Hive, etc., or should all the databases be located in the cluster?

What are the cons of using an existing database on a separate node or network outside the cluster for these components?

1 ACCEPTED SOLUTION

@Ancil McBarnett

Yes, you can use a database on another node outside of the cluster. You need to make sure that the network connection is stable and redundant.

I am a big fan of database consolidation, as it makes the databases easier to maintain and back up.
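
As a rough illustration of the "stable network connection" point, here is a minimal sketch of a reachability check you could run from a cluster node against the external database host. The function name `can_reach` and the timeout value are assumptions made for this example. The check only confirms that a TCP connection opens; it says nothing about credentials, latency under load, or redundancy.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("db.example.internal", 5432)` would test a hypothetical PostgreSQL host on its default port. Running the check on a schedule from several cluster nodes gives a crude picture of how stable the link is.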

4 REPLIES

What if the database is in a VM image?

Just be wary of potential load issues. We reached the connection limit of our consolidated PostgreSQL database because all services were pointing to the same database. This essentially stopped Oozie and Hive at random. The biggest culprit seems to have been Ranger: if auditing to the database is switched on, it puts quite a load on the database.
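
To make the connection-limit problem concrete, here is a minimal sketch of a connection budget check. The service names and pool sizes are hypothetical and not taken from any real cluster. The defaults of 100 total connections and 3 reserved for superusers correspond to PostgreSQL's stock `max_connections` and `superuser_reserved_connections` settings, so adjust them to your own configuration.

```python
def connection_headroom(pool_sizes, max_connections=100, reserved=3):
    """Return the connections left over if every service fills its pool.

    A negative result means the services can collectively ask for more
    connections than the database will accept.
    """
    usable = max_connections - reserved
    return usable - sum(pool_sizes.values())


# Hypothetical maximum pool sizes per service (assumed values for illustration).
pools = {
    "hive_metastore": 40,
    "oozie": 30,
    "ranger_admin": 20,
    "ranger_audit": 25,
}

headroom = connection_headroom(pools)
if headroom < 0:
    print(f"Over budget by {-headroom} connections")
else:
    print(f"{headroom} connections to spare")
```

With these assumed numbers the pools add up to more than the database accepts, which is the situation described above: whichever service asks last gets refused, so failures look random.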

@Ancil McBarnett

A VM works as long as it is tuned and well configured. A VM is actually preferable, since we can increase its resources as we grow.

Ranger auditing will be in Solr, so there are no concerns regarding @Benjamin Leonhardi's point (very good info).