What are Best Practices for placing Databases for Oozie, Ranger, Hive for Production use

This link has some great write-ups on Oozie production recommendations.

It does mention that you should not use the same MySQL database for Oozie that Hive Server uses.

What I want to know is: can I use an existing database that I have in my organization, perhaps on another node outside of the cluster, for Oozie, Ranger, Hive, etc., or should all the databases be located in the cluster?

What are the cons of using an existing database on a separate node, network, etc. outside the cluster for these components?

1 ACCEPTED SOLUTION

Master Mentor
@Ancil McBarnett

Yes, we can use a database on another node outside of the cluster. We need to make sure that the network connection is stable and redundant.

I am a big fan of database consolidation, as it makes the databases easier to maintain and back up.
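
One rough way to check the "stable network connection" requirement from a cluster node is to open a TCP connection to the database port several times and look at failures and latency. The following is only a minimal sketch: the function names `probe_database` and `summarize`, the host name, and the port are illustrative assumptions, and a successful TCP connect does not prove that the database itself is healthy.

```python
import socket
import time


def probe_database(host, port, attempts=5, timeout=2.0):
    """Open a TCP connection `attempts` times and record each latency.

    A failed attempt (refused, timed out, unreachable) is recorded as None.
    """
    latencies = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append(time.monotonic() - start)
        except OSError:
            latencies.append(None)
    return latencies


def summarize(latencies):
    """Condense probe results into counts and the worst observed latency."""
    ok = [x for x in latencies if x is not None]
    return {
        "attempts": len(latencies),
        "failures": len(latencies) - len(ok),
        "max_latency_s": max(ok) if ok else None,
    }


if __name__ == "__main__":
    # Hypothetical host and port; replace with the real database endpoint.
    print(summarize(probe_database("db.example.internal", 5432)))
```

Running this periodically from each node that hosts Oozie, Hive, or Ranger would give a first impression of how reliable the path to an external database is.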

4 REPLIES

What if the database is in a VM image?

Master Guru

Just be wary of potential load issues. We reached the connection limit of our consolidated PostgreSQL database because all services were pointing to the same database. This essentially stopped Oozie and Hive at random. The biggest culprit seems to have been Ranger: if auditing to the database is switched on, it puts quite a load on the database.
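
To illustrate the connection-limit problem described above, here is a small back-of-the-envelope sketch that adds up the worst-case connection pool size of each service and compares the total with the database limit. The pool sizes and the helper `connection_budget` are hypothetical; the limit of 100 and the 3 reserved connections are the PostgreSQL defaults for `max_connections` and `superuser_reserved_connections`, which a real deployment may well have changed.

```python
def connection_budget(pool_sizes, max_connections, reserved=0):
    """Compare worst-case pooled connections with the database limit."""
    demand = sum(pool_sizes.values())
    available = max_connections - reserved
    return {
        "demand": demand,
        "available": available,
        "headroom": available - demand,
        "over_limit": demand > available,
    }


# Hypothetical per-service maximum pool sizes (not taken from any real cluster).
pools = {
    "hive_metastore": 40,
    "oozie": 10,
    "ranger_admin": 40,
    "ranger_audit": 30,
}

report = connection_budget(pools, max_connections=100, reserved=3)
print(report)
```

With these assumed numbers the services could request 120 connections while only 97 are available, so whichever service asks last is refused. That would match the random Oozie and Hive failures described above.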

Master Mentor

@Ancil McBarnett

A VM works as long as it is tuned and well configured. A VM is even preferable, since we can increase its resources as we grow.

Ranger will audit to Solr, so there are no concerns regarding @Benjamin Leonhardi's point (very good info).