
Impala data model

Explorer
What's the best approach to structuring data files for Impala tables: a single, fully denormalised flat table, or a star schema model? The use case is integration with BI tools such as Microstrategy on top of Impala.
2 REPLIES


Unless you have a very strong reason to try a different approach, I'd recommend the star schema model. The benefit is mostly that this model is so prevalent that I'd expect integration with third-party tools to be smoother than with other, "fancier" approaches.
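
For example, a minimal star schema in Impala might look like the sketch below. Table and column names are purely illustrative (they are not from the question), and Parquet is assumed as the storage format:

    -- Illustrative dimension table (names and columns are hypothetical).
    CREATE TABLE dim_customer (
      customer_id   BIGINT,
      customer_name STRING,
      region        STRING
    )
    STORED AS PARQUET;

    -- Illustrative fact table, partitioned by year so queries can prune partitions.
    CREATE TABLE fact_sales (
      sale_id     BIGINT,
      customer_id BIGINT,
      sale_date   TIMESTAMP,
      amount      DECIMAL(12,2)
    )
    PARTITIONED BY (sale_year INT)
    STORED AS PARQUET;

    -- The kind of join/aggregate query a BI tool like Microstrategy would typically generate.
    SELECT d.region, SUM(f.amount) AS total_amount
    FROM fact_sales f
    JOIN dim_customer d ON f.customer_id = d.customer_id
    WHERE f.sale_year = 2016
    GROUP BY d.region;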

 

Explorer (Accepted Solution)
I agree - it's just that some references claim flat, denormalised file structures are more efficient for Hadoop than a star schema in terms of I/O performance. But as you said, it's most important to model the data in a way that works with the BI tools.
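
For comparison, the fully denormalised alternative mentioned above would be a single wide table that folds the dimension attributes into every fact row (again, the names are purely illustrative and Parquet is assumed):

    -- Hypothetical denormalised ("flat") table: no join needed at query time,
    -- at the cost of wider rows and repeated dimension values.
    CREATE TABLE sales_denorm (
      sale_id       BIGINT,
      sale_date     TIMESTAMP,
      amount        DECIMAL(12,2),
      customer_id   BIGINT,
      customer_name STRING,
      region        STRING
    )
    STORED AS PARQUET;

    -- The same aggregate without a join; scans more bytes per row but avoids the join cost.
    SELECT region, SUM(amount) AS total_amount
    FROM sales_denorm
    GROUP BY region;

With a columnar format like Parquet only the referenced columns are read, so the extra width costs less than it would with plain text files, which is one reason the star schema is usually a safe default for BI integration.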