Solved
Impala data model
Labels: Apache Impala
Explorer
Created on ‎03-09-2016 10:06 AM - edited ‎09-16-2022 03:08 AM
What's the best approach to structuring data files for Impala tables: a single fully denormalised flat file, or a star schema model? The use case is integration with BI tools like MicroStrategy via Impala.
1 ACCEPTED SOLUTION
Explorer
Created ‎03-10-2016 03:24 PM
I agree. It's just that some references claim flat-file structures are more efficient for Hadoop than star schema structures in terms of I/O performance. But as you said, it's very important to model the data in a way that works with BI tools.
2 REPLIES
Guru
Created ‎03-09-2016 03:57 PM
Unless you have a very strong reason to try a different approach, I'd recommend the star schema model. The benefit is mostly that this model is so prevalent that I'd expect integration with third-party tools to be smoother than with other "fancier" approaches.
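To illustrate, here is a minimal star schema sketch in Impala SQL. All table and column names are hypothetical, not from the original thread: a narrow dimension table joined to a fact table at query time, which BI tools like MicroStrategy can map directly.

```sql
-- Hypothetical star schema: one fact table plus one dimension table,
-- both stored as Parquet for efficient columnar scans in Impala.
CREATE TABLE dim_customer (
  customer_id BIGINT,
  customer_name STRING,
  region STRING
) STORED AS PARQUET;

CREATE TABLE fact_sales (
  sale_date TIMESTAMP,
  customer_id BIGINT,
  amount DECIMAL(12,2)
) STORED AS PARQUET;

-- Typical BI-style aggregate: Impala joins the small dimension
-- to the large fact table at query time.
SELECT d.region, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_customer d ON f.customer_id = d.customer_id
GROUP BY d.region;
```

With the fully denormalised alternative, `region` and `customer_name` would be repeated on every fact row, avoiding the join but inflating storage and making dimension updates harder.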
