New Contributor
Posts: 9
Registered: ‎07-06-2018

Way to capture Impala Pool metrics (Admission Control Stats) using CM API

Hi,

I am in the process of collecting information from Impala. I can successfully fetch information on all running Impala queries using:

 

https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries

 

Questions:

1. I want to fetch the queries for a particular time frame. How can I achieve that?

 

2. A related question: I know how to fetch queries after a particular time (date), but the request hangs or fails once a limit is reached. Is there a workaround? For example, if I select a date in the past that has quite a few results, it fails with the response below:

 

{
  "queries" : [ ],
  "warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-06-31T18:19:46.409Z" ]
}

 

3. I was trying to fetch all the settings listed under Impala admission control, for example the configured request pools and their attributes.

I couldn't find a direct, out-of-the-box way to do this with the CM API. Please share possible solutions.

 

Regards

Posts: 696
Topics: 1
Kudos: 164
Solutions: 88
Registered: ‎04-22-2014

Re: Way to capture Impala Pool metrics (Admission Control Stats) using CM API

@Prav,

 

(1)

 

You can specify a from/to time frame and offsets as described here:

 

https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_im...

 

You can also use the filters to restrict the results, perhaps with query_state = RUNNING.
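To make the from/to idea concrete, here is a minimal Python sketch that only builds the request URL. The helper name build_impala_queries_url is made up for illustration, and the parameter names (from, to, filter, limit, offset) and the ISO 8601 timestamp format are my reading of the API docs linked above, so please verify them against the docs for your CM version.

```python
from urllib.parse import quote, urlencode


def build_impala_queries_url(base_url, cluster, service, **params):
    """Build an impalaQueries URL with optional query-string parameters."""
    path = "{}/clusters/{}/services/{}/impalaQueries".format(
        base_url.rstrip("/"), quote(cluster, safe=""), quote(service, safe="")
    )
    # "from" is a Python keyword, so callers pass from_ and the
    # trailing underscore is stripped here. None values are skipped.
    query = {key.rstrip("_"): value for key, value in params.items() if value is not None}
    return path + ("?" + urlencode(query) if query else "")


# Host, cluster and service names are the placeholders used in this thread.
url = build_impala_queries_url(
    "https://hostname:7183/api/v17",
    "cluster_name",
    "impala",
    from_="2018-07-09T00:00:00.000Z",
    to="2018-07-09T12:00:00.000Z",
    filter="query_state = RUNNING",
    limit=100,
    offset=0,
)
print(url)
```

The resulting URL can then be fetched with whatever HTTP client and credentials you already use for the CM API.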

 

(2)

 

Currently, any "next steps" after you get the warning need to be done manually (there is no server feature to help). For instance, if you got the warning in your example, you could parse it and pass the date/time into another query as the "from" value:

from=2018-06-31T18:19:46.409Z
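As a rough sketch of the parsing step, something like the following Python could pull the timestamp out of the warning. The function name last_end_time is hypothetical, and the regular expression assumes the warning text always has the wording shown in your example.

```python
import re

# Matches the timestamp that follows the fixed warning wording.
_LAST_END_TIME = re.compile(r"Last end time considered is (\S+)")


def last_end_time(response):
    """Return the timestamp string from the first matching warning, or None."""
    for warning in response.get("warnings", []):
        match = _LAST_END_TIME.search(warning)
        if match:
            return match.group(1)
    return None


# The response body from the original post, as a Python dict.
response = {
    "queries": [],
    "warnings": [
        "Impala query scan limit reached. "
        "Last end time considered is 2018-06-31T18:19:46.409Z"
    ],
}
print(last_end_time(response))
```

The timestamp is kept as a plain string rather than parsed into a date object, so it can be passed back to the API exactly as the server reported it.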

 

(3)

 

I think what you are looking for with admission control is stored in the Impala service configuration. You can use this API endpoint:

https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_co...

 

For example:

https://cm_host.example.com:7183/api/v16/clusters/Cluster%201/services/IMPALA-1/config

 

The above assumes:

- The cluster is named "Cluster 1"

- The Impala Service is named "IMPALA-1"

 

This may differ in your cluster.

 

The attribute that stores the pool information is "impala_scheduled_allocations", and possibly "impala_schedule_rules" as well.

 

I am not entirely sure that is what you are looking for, but it seems likely.
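If it helps, here is a small Python sketch for picking those two attributes out of the config response once you have it as a dict. It assumes the response has an "items" list of name/value entries and that the values are JSON text; the function name extract_pool_settings and the sample payload are hypothetical, so check both assumptions against what your cluster actually returns.

```python
import json

POOL_ATTRIBUTES = ("impala_scheduled_allocations", "impala_schedule_rules")


def extract_pool_settings(config):
    """Pick the admission control attributes out of a service config payload."""
    settings = {}
    for item in config.get("items", []):
        name, value = item.get("name"), item.get("value")
        if name in POOL_ATTRIBUTES and value is not None:
            try:
                settings[name] = json.loads(value)  # value assumed to be JSON text
            except ValueError:
                settings[name] = value  # keep the raw string if it is not JSON
    return settings


# Hypothetical payload for illustration only; real values will differ.
sample = {
    "items": [
        {"name": "impala_scheduled_allocations",
         "value": '{"queues": [{"name": "root"}]}'},
        {"name": "some_other_setting", "value": "42"},
    ]
}
print(extract_pool_settings(sample))
```

If an attribute is missing from the result, it may simply be at its default value, in which case the response may not list it.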

 

New Contributor
Posts: 9
Registered: ‎07-06-2018

Re: Way to capture Impala Pool metrics (Admission Control Stats) using CM API

@bgooley

Thanks for the response.

 

For 2:

Is there a particular time format that it takes as input? For example, when I run:

 

https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09

 

I get this warning:

 

"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]

 

Now if I pick the time from the above warning and use it as below:


https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09T17:04:32.776Z

 

I still get the same warning:

 

"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]
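One thing that could be tried here is bounding each request with both "from" and "to", as mentioned in (1) above, and walking the range in smaller windows. The Python sketch below only generates the timestamp pairs, formatted like the timestamp in the warning. The helper names are made up, and nothing in this thread confirms that smaller windows avoid the scan limit, so treat it as an experiment rather than a fix.

```python
from datetime import datetime, timedelta, timezone


def to_iso_millis(moment):
    """Format an aware datetime as e.g. 2018-07-09T17:04:32.776Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + "{:03d}Z".format(utc.microsecond // 1000)


def time_windows(start, end, step):
    """Yield (from, to) timestamp pairs covering [start, end) in step-sized chunks."""
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield to_iso_millis(cursor), to_iso_millis(upper)
        cursor = upper


# Three one-hour windows on the date used in the example above.
start = datetime(2018, 7, 9, tzinfo=timezone.utc)
end = datetime(2018, 7, 9, 3, tzinfo=timezone.utc)
for window in time_windows(start, end, timedelta(hours=1)):
    print(window)
```

Each pair would go into the "from" and "to" query parameters of a separate impalaQueries request.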
