We are working on a project where Impala is the backend database and MSTR is the reporting solution. Both are in the same VPC on AWS (and in the same subnet). We use private IPs in the MSTR DSN when connecting to the DB.
While executing MSTR ad hoc reports on Impala 2.2, which runs on CDH 5.3, we noticed that the reports run long compared to HANA. When we dug deeper, we found that while the “Query Execution Time” is the same on both databases, there is a significant difference in the “Data Fetching and Processing Time” (please note that the underlying data is almost the same in both databases).
Data Fetching & Processing time (in seconds), BD (Impala) vs. HANA:
We checked whether this could be due to network latency by transferring files (containing the same record count as the MSTR report) from Impala to the MSTR server. The transfer completed in under 10 seconds, so we ruled this out.
We want to check whether the protocol the drivers use to connect MSTR to the underlying DB differs between the two databases, and whether that is what is affecting the “Data Fetching & Processing” time.
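One way to isolate the driver from MSTR itself is to run the same query through each driver outside the reporting tool and time the execute and fetch phases separately. The following is only a minimal sketch: `time_query` and `demo` are hypothetical helper names, and it uses the standard-library `sqlite3` module as a stand-in so that it runs anywhere. Any Python DB-API cursor should work the same way, for example one opened over an ODBC DSN, which is an assumption and not something tested here. Also note that where the boundary between "execute" and "fetch" falls depends on when a given driver returns from `execute`, so the numbers are best compared between drivers rather than read as absolutes.

```python
import sqlite3
import time


def time_query(cursor, sql, batch_size=10000):
    """Time statement execution and row fetching separately.

    Works with any DB-API 2.0 cursor (execute / fetchmany).
    """
    start = time.perf_counter()
    cursor.execute(sql)
    executed = time.perf_counter()

    # Drain the result set in batches so memory stays bounded.
    rows = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        rows += len(batch)
    fetched = time.perf_counter()

    return {
        "execute_seconds": executed - start,
        "fetch_seconds": fetched - executed,
        "rows": rows,
    }


def demo():
    """Stand-in run against an in-memory SQLite table (not Impala or HANA)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER, label TEXT)")
    cur.executemany(
        "INSERT INTO t VALUES (?, ?)",
        ((i, f"row{i}") for i in range(50000)),
    )
    result = time_query(cur, "SELECT id, label FROM t", batch_size=5000)
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Running the same report SQL through each candidate driver with a helper like this, and varying `batch_size`, would show whether the fetch phase is slow at the driver level or only inside MSTR.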
We are currently using the “MicroStrategy ODBC Driver for Impala Wire Protocol” driver. Earlier we used the “Cloudera ODBC Driver for Impala” driver, but we noticed that it has some limitations and moved to the current one. So far we have only been able to find these two certified drivers on the MSTR site (http://www.microstrategy.com/us/services-support/support/drivers).
Could you please let us know your thoughts?
We faced similar issues with MSTR 9 as well. After PoCs with both MSTR 9 and MSTR 10, we found that the issues are addressed in MicroStrategy 10. The Big Data Engine that comes with MSTR 10 handles much larger data volumes, and the cube creation issues are also resolved. MSTR 9 is not really "big data" enabled.