New Contributor
Posts: 4
Registered: ‎06-09-2016

Impala-kudu taking a long time before starting the query

Hello all,

We deployed Kudu in our cluster and are seeing problems when running queries in impala-shell. I was wondering whether this is related to our configuration or to the Impala build we deployed.

Here is the behavior we noticed:

* Sometimes the query hangs for a long time without running, and we cannot see it in Cloudera Manager.

* Sometimes the query hangs for a long time and then runs:

Query: select count(*) from staging_dev.stg_weblogs_usage_kudu_rt
+--------------+
| count(*)     |
+--------------+
| 50213548     |
+--------------+
Fetched 1 row(s) in 955.72s

 

Query Timeline:

  • Query submitted: 0ns (0ns)
  • Planning finished: 15.9m (15.9m)
  • Submit for admission: 15.9m (380ms)
  • Completed admission: 15.9m (0ns)
  • Ready to start 9 fragment instances: 15.9m (0ns)
  • All 9 fragment instances started: 15.9m (112ms)
  • Rows available: 15.9m (304ms)
  • First row fetched: 15.9m (100ms)
  • Unregister query: 15.9m (8ms)
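The timeline above shows nearly all of the elapsed time inside a single step. To spot that quickly in a pasted timeline, here is a rough Python sketch, not an Impala tool. The names `parse_timeline` and `slowest_step` are made up for illustration, and it assumes each line has the shape `event: elapsed (delta)` with a single-unit delta such as `380ms` or `15.9m`; lines with compound durations are skipped.

```python
import re

# Unit multipliers to seconds. The unit set is an assumption based on the
# values that appear in the timeline output (ns, ms, m) plus common ones.
_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}

# Matches e.g. "  • Planning finished: 15.9m (15.9m)".
# The value in parentheses is the time spent in that step.
_LINE = re.compile(
    r"^\W*(?P<event>.+?):\s+\S+\s+\((?P<delta>[\d.]+)(?P<unit>[a-z]+)\)\s*$"
)


def parse_timeline(text):
    """Return a list of (event, seconds spent in that step)."""
    steps = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m and m.group("unit") in _UNITS:
            seconds = float(m.group("delta")) * _UNITS[m.group("unit")]
            steps.append((m.group("event"), seconds))
    return steps


def slowest_step(text):
    """Return the (event, seconds) pair with the largest per-step time."""
    return max(parse_timeline(text), key=lambda step: step[1])
```

Passing the timeline text to `slowest_step` returns the step that consumed the most time, which points at where to look in the logs.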

 

* Sometimes it takes a few seconds, but Cloudera Manager reports a shorter duration than impala-shell does:

Query: select count(*) from staging_dev.stg_weblogs_usage_kudu_rt
+--------------+
| count(*)     |
+--------------+
| 50221574     |
+--------------+
Fetched 1 row(s) in 1.74s

 

Query Timeline:

  • Query submitted: 0ns (0ns)
  • Planning finished: 84ms (84ms)
  • Submit for admission: 84ms (0ns)
  • Completed admission: 84ms (0ns)
  • Ready to start 9 fragment instances: 88ms (4ms)
  • All 9 fragment instances started: 88ms (0ns)
  • Rows available: 288ms (200ms)
  • First row fetched: 308ms (20ms)
  • Unregister query: 312ms (4ms)

 

Our configuration:

* CM: 5.7.1

* CDH: 5.7.1

* Kerberos deployed

* Sentry deployed

* Kudu: 1.1.0

* Impala-kudu: 2.7.0

 

Thank you!

Cloudera Employee
Posts: 40
Registered: ‎09-28-2015

Re: Impala-kudu taking a long time before starting the query

Can you share the specific output of 'select version()' from Impala?

Also, how many tablets/buckets does the table have?
New Contributor
Posts: 4
Registered: ‎06-09-2016

Re: Impala-kudu taking a long time before starting the query

Hi Todd,

Thanks for your answer.

Here is the Impala version we use:

 

Query: select version()
+-----------------------------------------------------------------------------------------------+
| version()                                                                                     |
+-----------------------------------------------------------------------------------------------+
| impalad version 2.7.0-IMPALA_KUDU-cdh5 DEBUG (build 10d4ebec3c23961218e972e74e9d342ffc417af1) |
| Built on Mon Nov 21 23:11:10 PST 2016                                                         |
+-----------------------------------------------------------------------------------------------+

 

The table has 9 buckets/tablets, and we also have 9 tablet servers.

Thank you!

New Contributor
Posts: 4
Registered: ‎06-09-2016

Re: Impala-kudu taking a long time before starting the query

Hello,

I analyzed the logs in more detail and found that the delay is related to fetching the Active Directory groups.

I was using the LDAP implementation (org.apache.hadoop.security.LdapGroupsMapping), which is not the optimal one (Cloudera recommends avoiding it).

I switched to the other implementation (org.apache.hadoop.security.ShellBasedUnixGroupsMapping), and now it works much better.

Also, I did not have this problem when using LdapGroupsMapping with the live Impala.
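For anyone who wants to check whether group resolution is slow on a given host, here is a minimal Python sketch that times a single lookup. The helper `time_group_lookup` is hypothetical and not part of Hadoop or Impala. It assumes that running `id -Gn <user>` is a reasonable stand-in for the OS-level lookup a shell-based mapping performs. It does not exercise the LDAP mapping itself.

```python
import subprocess
import time


def time_group_lookup(user, lookup=None):
    """Return (groups, elapsed_seconds) for one group lookup of `user`.

    `lookup` is any callable taking a user name and returning a list of
    group names. If omitted, the OS is asked via `id -Gn <user>`, which
    is an assumption made for this sketch.
    """
    if lookup is None:
        def lookup(name):
            result = subprocess.run(
                ["id", "-Gn", name],
                capture_output=True,
                text=True,
                check=True,  # raise if the user is unknown
            )
            return result.stdout.split()

    start = time.perf_counter()
    groups = lookup(user)
    elapsed = time.perf_counter() - start
    return groups, elapsed
```

Running it a few times for the user that submits the query, on the host where the Impala daemon runs, gives a rough idea of whether lookups take milliseconds or minutes.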

 

Thanks.
