I'm loading data from Impala into Tableau using the ODBC driver.
This load (around 80M records) takes a long time, and I'm trying to understand why.
I tried to investigate in Cloudera Manager under Impala -> Queries -> Query Details.
There are a lot of metrics there, and I can't easily figure out where the bottleneck is. I have a few questions, and it's hard to determine which metric to use for each one. For example:
How much time did it take Impala to execute the query itself?
How much time did it take to transfer the data from Impala to Tableau?
How much time did it take Tableau to read the data?
Can you assist?
Your help would be appreciated.
How long does your processing step take? I.e., what do you mean by "takes a lot of time"?
Can you share a query profile?
The query takes around 6 hours (from start to finish in Tableau).
It took 3 hours a few months ago, and the data size is almost the same. We have also added resources to our cluster since then, so we did not expect the runtime to double.
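To put those numbers in perspective, here is a rough back-of-the-envelope calculation of the end-to-end row rate. It assumes the row count is exactly 80M and that the 6-hour and 3-hour figures cover the whole load; both are approximations taken from the posts above.

```python
# Approximate figures from the thread; the exact row count is an assumption.
ROWS = 80_000_000


def rows_per_second(rows, hours):
    """Average end-to-end row rate for a load that took `hours` hours."""
    return rows / (hours * 3600)


current = rows_per_second(ROWS, 6)   # today's ~6 hour load
previous = rows_per_second(ROWS, 3)  # the ~3 hour load a few months ago

print(f"now: {current:,.0f} rows/s, before: {previous:,.0f} rows/s")
```

A rate of a few thousand rows per second for the whole pipeline is low enough that the time is probably being spent somewhere other than scanning the data.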
Hi, it now seems that the bottleneck is not Impala. I created a table from the query result, and that took only a few minutes. I'm currently concentrating on checking the network and Tableau configuration.
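One way to check the network/driver leg without Tableau in the picture is to time a plain fetch over the same ODBC connection. The sketch below is only an illustration: `time_fetch` and `FakeCursor` are hypothetical names, and the function only assumes a Python DB-API style cursor with `execute()` and `fetchmany()`. Note that the `execute_s` figure is not necessarily Impala's full execution time, since rows may still be produced while the client is fetching.

```python
import time


def time_fetch(cursor, sql, batch_size=10_000):
    """Time execute() and the full fetch separately on a DB-API style cursor."""
    t0 = time.perf_counter()
    cursor.execute(sql)
    t1 = time.perf_counter()

    rows = 0
    first_batch_at = None
    while True:
        batch = cursor.fetchmany(batch_size)
        if first_batch_at is None:
            # Time at which the first fetchmany() call returned.
            first_batch_at = time.perf_counter()
        if not batch:
            break
        rows += len(batch)
    t2 = time.perf_counter()

    return {
        "execute_s": t1 - t0,                  # time spent in execute()
        "first_batch_s": first_batch_at - t1,  # wait for the first batch
        "fetch_s": t2 - t1,                    # total time pulling all rows
        "rows": rows,
    }


class FakeCursor:
    """Stand-in cursor so the sketch runs without a database (hypothetical)."""

    def __init__(self, total_rows):
        self._remaining = total_rows

    def execute(self, sql):
        return self

    def fetchmany(self, size):
        n = min(size, self._remaining)
        self._remaining -= n
        return [(i,) for i in range(n)]


stats = time_fetch(FakeCursor(25_000), "SELECT 1", batch_size=10_000)
print(stats)

# With a real connection (the DSN name and query are assumptions):
#   import pyodbc
#   conn = pyodbc.connect("DSN=impala_dsn", autocommit=True)
#   print(time_fetch(conn.cursor(), "SELECT ... FROM ..."))
```

If the fetch is fast from a machine close to the cluster but slow from the Tableau host, that points at the network or driver settings rather than at Impala or Tableau itself.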