New Contributor
Posts: 3
Registered: ‎10-31-2017

Question about query performance in Impala

Hi,

 

I'm loading data from Impala into Tableau through the ODBC driver.

The load takes a long time (around 80M records), and I'm trying to understand why.

I tried to investigate in Cloudera Manager under Impala -> Queries -> Query Details.

There are a lot of metrics there, and I can't easily figure out where the bottleneck is. I have a few questions, and it's hard to tell which metric answers each one. For example:

How much time did Impala take to execute the query itself?

How much time did it take to transfer the data from Impala to Tableau?

How much time did Tableau take to read the data?

 

Can you assist?

Your help would be appreciated.

Cloudera Employee
Posts: 73
Registered: ‎12-07-2015

Re: Question about query performance in Impala

Hi yehudaks,

 

How long does your processing step take? I.e., what do you mean by "takes a lot of time"?

 

Can you share a query profile?

 

Cheers, Lars

New Contributor
Posts: 3
Registered: ‎10-31-2017

Re: Question about query performance in Impala

The query takes around 6 hours (from start to finish in Tableau).

It took 3 hours a few months ago, and the data size is almost the same. We also added resources to our cluster, so we didn't expect the runtime to double.

Cloudera Employee
Posts: 73
Registered: ‎12-07-2015

Re: Question about query performance in Impala

Can you share a query profile? That could give insights into where Impala is spending the time.

New Contributor
Posts: 3
Registered: ‎10-31-2017

Re: Question about query performance in Impala

Hi, it now seems that the bottleneck is not Impala. I created a table from the query result, and that took only a few minutes. I'm currently concentrating on checking the network and Tableau configuration.
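
For anyone who wants to do the same kind of isolation from a client script instead of Tableau, here is a minimal sketch. It times `execute()` separately from draining the result set, so a slow fetch phase points at the network or the client rather than at query execution. The function name `time_query_phases` and the `batch_size` parameter are made up for illustration. The demo uses an in-memory `sqlite3` database only as a stand-in so the snippet runs anywhere; against Impala you would pass a connection from a DB-API driver (for example one opened with `pyodbc` over the same ODBC DSN), which is an assumption and not something tested in this thread. How the time splits between the two phases also depends on the driver, since some drivers return from `execute()` before all rows have been produced.

```python
import sqlite3
import time


def time_query_phases(conn, sql, batch_size=10_000):
    """Time execute() and result fetching separately on a DB-API connection.

    Hypothetical helper: returns a dict with seconds spent in each phase
    and the number of rows fetched.
    """
    cur = conn.cursor()

    t0 = time.perf_counter()
    cur.execute(sql)                      # phase 1: submit / run the query
    t1 = time.perf_counter()

    rows = 0
    while True:                           # phase 2: pull rows to the client
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        rows += len(batch)
    t2 = time.perf_counter()

    cur.close()
    return {"execute_s": t1 - t0, "fetch_s": t2 - t1, "rows": rows}


if __name__ == "__main__":
    # Stand-in database so the sketch is runnable without a cluster.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
    print(time_query_phases(conn, "SELECT id FROM t"))
```

If the fetch phase dominates from a machine on the same network as Tableau, that supports looking at network and client settings rather than at the query itself.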

 

Thanks
