
Impala data model

Explorer
What's the best approach to structuring data files for Impala tables: a single, fully denormalised flat table, or a star schema model? The use case is integration with BI tools such as Microstrategy on top of Impala.
2 REPLIES


Unless you have a very strong reason to try a different approach, I'd recommend the star schema model. The benefit is mostly that this model is so prevalent that I'd expect integration with third-party tools to be smoother than with other, "fancier" approaches.
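
For example, a minimal star schema in Impala might look like the sketch below. Table and column names are purely illustrative (they are not from the question), and Parquet is assumed as the storage format:

    -- Illustrative dimension table (names and columns are hypothetical).
    CREATE TABLE dim_customer (
      customer_id   BIGINT,
      customer_name STRING,
      region        STRING
    )
    STORED AS PARQUET;

    -- Illustrative fact table, partitioned by year so queries can prune partitions.
    CREATE TABLE fact_sales (
      sale_id     BIGINT,
      customer_id BIGINT,
      sale_date   TIMESTAMP,
      amount      DECIMAL(12,2)
    )
    PARTITIONED BY (sale_year INT)
    STORED AS PARQUET;

    -- The kind of join/aggregate query a BI tool like Microstrategy would typically generate.
    SELECT d.region, SUM(f.amount) AS total_amount
    FROM fact_sales f
    JOIN dim_customer d ON f.customer_id = d.customer_id
    WHERE f.sale_year = 2016
    GROUP BY d.region;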

 

Explorer (Accepted Solution)
I agree - it's just that some references claim flat, denormalised file structures are more efficient for Hadoop than a star schema in terms of I/O performance. But as you said, it's most important to model the data in a way that works with the BI tools.
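
For comparison, the fully denormalised alternative mentioned above would be a single wide table that folds the dimension attributes into every fact row (again, the names are purely illustrative and Parquet is assumed):

    -- Hypothetical denormalised ("flat") table: no join needed at query time,
    -- at the cost of wider rows and repeated dimension values.
    CREATE TABLE sales_denorm (
      sale_id       BIGINT,
      sale_date     TIMESTAMP,
      amount        DECIMAL(12,2),
      customer_id   BIGINT,
      customer_name STRING,
      region        STRING
    )
    STORED AS PARQUET;

    -- The same aggregate without a join; scans more bytes per row but avoids the join cost.
    SELECT region, SUM(amount) AS total_amount
    FROM sales_denorm
    GROUP BY region;

With a columnar format like Parquet only the referenced columns are read, so the extra width costs less than it would with plain text files, which is one reason the star schema is usually a safe default for BI integration.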