10-31-2017 04:07 AM - last edited on 10-31-2017 04:57 AM by cjervis
I'm loading data from Impala into Tableau using the ODBC driver.
This flow takes a long time (the result is around 80M records), and I'm trying to understand why.
I tried to investigate in Cloudera Manager under Impala -> Queries -> Query details.
There are a lot of metrics there, and I can't easily figure out where the bottleneck is. I have a few questions, and it's hard to determine which metric to use for each one. For example:
How much time did Impala take to execute the query itself?
How much time did it take to transfer the data from Impala to Tableau?
How much time did Tableau take to read the data?
Can you assist?
Your help would be appreciated.
11-02-2017 02:08 AM
The query takes around 6 hours (from start to finish in Tableau).
It took 3 hours a few months ago, and the data size is almost the same. We have also added resources to our cluster since then, so we did not expect the runtime to double.
11-08-2017 12:42 AM
Hi, it now seems that the bottleneck is not Impala. I created a table from the query result and it took only a few minutes. I am currently concentrating on checking the network and Tableau configuration.
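For anyone who wants to separate query time from transfer time on the client side without Tableau in the loop, here is a rough sketch of how it could be measured. It only relies on the standard Python DB-API calls `execute` and `fetchmany`. The helper name `time_query` and the batch size are my own choices, and the demo uses an in-memory SQLite table as a stand-in so that it runs anywhere; for a real test you would pass a cursor opened through your Impala ODBC connection instead (that part is an assumption and is not shown). Note that with Impala the fetch phase can still include time spent waiting for the server to produce rows, so treat the split as an approximation.

```python
import sqlite3
import time


def time_query(cursor, sql, batch_size=10_000):
    """Return (execute_seconds, fetch_seconds, row_count) for one query.

    `cursor` is any DB-API 2.0 cursor. Rows are fetched in batches and
    discarded, so only the time to pull them to the client is measured.
    """
    start = time.perf_counter()
    cursor.execute(sql)
    executed = time.perf_counter()

    row_count = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        row_count += len(rows)
    fetched = time.perf_counter()

    return executed - start, fetched - executed, row_count


if __name__ == "__main__":
    # Stand-in database (assumption): replace with a cursor from your
    # own Impala ODBC connection to measure the real flow.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER)")
    cur.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(100_000)))

    exec_s, fetch_s, n = time_query(cur, "SELECT id FROM t")
    print(f"execute: {exec_s:.3f}s  fetch: {fetch_s:.3f}s  rows: {n}")
```

If the fetch phase dominates when run from the Tableau machine but the same query written to a table finishes in minutes, that points at the network or the driver settings rather than at Impala.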