Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Database modeling, software development, and version control

avatar
Rising Star

Hi guys,

I am looking for a best practice using Hive especially for database modeling, software development and if possible a version control. At the moment I struggle at a certain point where the logical world meets the code.

I have found tools which assist modelling databases in Hive, e.g. Embaracadero (Hortonworks Partner?). So I could model my databases there and create DDL scripts, I guess. To get version control I can add those scripts to git or something else. What happens if many users want to work on the logical model? How do you handle such problems? Jumping back and forth between database model versions is only possible with git versioning not only by the tool, or is it?

All other scripts regarding the hive databases and tables (ingestion and so on) live in a git repository. So they are under perfect version control but if something changes many adaptions have to be made (at least one in the config file and maybe in Insert statements etc.). What I am missing in the code world is a nice view of databases and tablss even a entity-relation-diagramm but that's not of main interest.

What is in your opinion a good way to tackle these problems? I mean someone like Facebook does not want to manage all tables and databases via a Hive View or solely based on code, or do they? How to keep the oversight in the big data world?

Any help is really appreciated! Thanks in advance!

Kind regards,

Ken

1 ACCEPTED SOLUTION

avatar
Master Guru
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login
2 REPLIES 2

avatar
Master Guru
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login

avatar
Rising Star

Hi Sunile,

Thanks for your answer. We think we will store our initial model and then all alter scripts. But all alter scripts will be included in the initial model in cas a complete re-deployment is wanted. To view a logical model a tool will be used which can reverse engineer the ddl. We try to establish a workflow in that fashion and hope that works.