Created 05-31-2018 03:14 PM
Hello,
I was using the CM API and I think I reached the maximum number of requests. What is the maximum number of requests, and how can I increase it?
/api/v7/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T0%3A0%3A0
{ "queries" : [ ], "warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-05-31T16:21:46.409Z" ] }
Regards,
Joaquin
Created 07-06-2018 12:48 PM
Hi Team,
Does anyone have a solution to this question yet? Please share if you do, thanks.
Created 07-06-2018 01:17 PM
This is an internal limit on how searches are done that restricts "scanning" across partitions (based on time).
When you see this message, generally you can take the time in the warning and then use that to formulate a new query specifying that as the start time.
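As a rough illustration of "taking the time in the warning", here is a minimal Python sketch that pulls the timestamp out of a response body like the one posted above. The function name scan_limit_time is made up for this example, and the pattern assumes the warning text always has the exact wording shown in this thread.

```python
import json
import re

# Assumes the warning wording seen in this thread; other CM versions may differ.
_SCAN_LIMIT = re.compile(
    r"Impala query scan limit reached\. Last end time considered is (\S+)"
)


def scan_limit_time(response_body):
    """Return the timestamp from the scan-limit warning, or None if absent."""
    payload = json.loads(response_body)
    for warning in payload.get("warnings", []):
        match = _SCAN_LIMIT.search(warning)
        if match:
            return match.group(1)
    return None
```

The returned string can then be reused as a time boundary in the next request.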
Created 07-06-2018 01:22 PM
Thanks, let me try that approach then.
Would you be kind enough to share some thoughts on:
Regards
Created 07-09-2018 10:37 AM
Is there a particular time format that it takes as input? For example:
When I run :
https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09
I get this warning:
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]
Now if I take the time from the warning above and use it as below:
https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09T17:04:32.776Z
I still get the same warning:
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]
Created 07-09-2018 10:57 AM
I am not an expert in these queries, but ISO 8601 format is what is used (like 2018-07-09T17:04:32.776Z).
If you have a lot of Impala queries in this cluster, you may need to specify a "to" time as well.
I'm not sure why the same warning appeared when you specified the time listed in the warning... that does seem strange.
Do you see queries listed in the CM UI? Maybe try finding one query and making the "from" and "to" in the API query encompass one or two queries that are displayed in CM.
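For what it's worth, here is a small Python sketch of how a request URL with both "from" and "to" could be put together. It assumes both are plain query parameters (not part of "filter") and that timestamps are passed as ISO 8601 strings. The helper name build_queries_url and its arguments are invented for this example; the host, cluster, and service names are placeholders.

```python
from urllib.parse import quote, urlencode


def build_queries_url(base_url, api_version, cluster, service, start, end=None):
    """Build an impalaQueries URL with 'from' and an optional 'to' parameter."""
    # Percent-encode path segments so names such as "Cluster 1" stay valid.
    path = "/api/{}/clusters/{}/services/{}/impalaQueries".format(
        api_version, quote(cluster, safe=""), quote(service, safe="")
    )
    params = {"from": start}
    if end is not None:
        params["to"] = end
    # urlencode escapes the colons in the timestamps as %3A.
    return base_url.rstrip("/") + path + "?" + urlencode(params)


url = build_queries_url(
    "https://hostname:7183", "v17", "Cluster 1", "IMPALA-1",
    "2018-07-09T12:59:32.776Z", "2018-07-09T17:04:32.776Z",
)
```

Letting the library do the escaping avoids hand-encoding the colons and spaces when the URL is passed to curl or another client.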
Created 07-09-2018 11:14 AM
Thanks for the quick response. I tried to include from and to in a filter but was unable to work it out. Do you have an example you can share?
https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?filter=(from=2018-07-09 and to=`date +"%Y-%m-%dT%T"`)
PS: Yes, there are running queries, and a lot of them.
Created 07-09-2018 11:50 AM
I used the same one you used:
/api/v17/clusters/Cluster 1/services/IMPALA-1/impalaQueries?from=2018-08-01T17:04:32.776Z
It works fine on my small test cluster.
Created 07-09-2018 11:51 AM
You may want to add DEBUG to Service Monitor and try the query again. *maybe* some clues may come to light, but I hate to say I doubt it.
Created 07-09-2018 12:59 PM
Your query has "August" in it, which could be a typo. What I meant is that when I specify "2018-07-09T17:04:32.776Z" in the from clause, I still get query results (100, the default), but they still have this warning message at the end of the page:
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]
My assumption was this: if I capture the date in the warning message at the end of each page and use it in the "from" clause, the query should return the next set of results with a warning that carries a later timestamp, which I can use again until I reach the current date/time, at which point I stop the loop.
(Was fetching it via curl)
But that doesn't seem to be happening.
My other question: what query string can I use to have both to and from in the same request? I tried the one below, but it didn't fetch any results:
https://hostname:7183/api/v17/clusters/cluster_name/services/impala/impalaQueries?filter=(from=2018-07-09 and to=`date +"%Y-%m-%dT%T"`)
Created on 07-09-2018 01:33 PM - edited 07-09-2018 01:38 PM
I'm using this one:
/api/v18/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T0%3A0%3A0&filter=(user=userX)
Also, I solved this issue by restarting the Monitoring Service.
Regards,
Created 07-09-2018 01:50 PM
Thanks, but unfortunately the from and to clauses don't help either 😞
https://hostname:7183/api/v18/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-09T17:04:32.776Z
Returns queries and ends with the warning:
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T12:59:32.776Z" ]
Now, as mentioned, I use this timestamp for the next query:
https://hostname:7183/api/v18/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10
Returns queries and ends with warning:
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-07-09T17:04:32.776Z" ]
I'm back to square one, where it keeps showing that as the final timestamp, so I can't really use the timestamp in the warning message to walk through all queries from some point in time until the current date. After just one iteration it hits this timestamp and enters a never-ending loop.
I'm not using any other filter (for example, user) because I need to fetch all the queries and feed them to a dashboard.
Created 07-09-2018 05:16 PM
I am still a tad confused about how this works, but...
- Queries are returned from most recent to least recent
- default result limit is 100
- default offset is 0
It seems that if the number of queries in that partition is greater than the "limit" value + offset, then you will get the warning.
I suggest playing a bit more with the limit and offset values.
For example:
https://hostname:7183/api/v18/clusters/cluster_name/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-09T17:04:32.776Z&limit=1000
Check the number of results returned and the warning. If you do hit a partition, use it as the "to" value for the next query.
Since the queries are listed from most recent, going back in time, the partition date will become the "to" value once you have exhausted all results in the partition.
It is more or less this:
Return queries
If num_queries == limit value
    offset = limit + offset
If num_queries == 0 && warning shows "Impala query scan limit reached"
    set "to" date to the value in the warning
Continue querying for results as shown above until 0 queries are returned.
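The steps above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: fetch_page is a hypothetical callable that sends one request with the given parameters and returns the decoded JSON as a dict, the warning timestamp is used as the next "to" value with the offset reset to 0, and the two stop conditions (a warning timestamp that does not move, and a cap on the number of requests) are safety guards added for this example.

```python
import re

# Pattern for the timestamp carried by the scan-limit warning text.
SCAN_LIMIT = re.compile(r"Last end time considered is (\S+)")


def fetch_all_queries(fetch_page, start, end, limit=1000, max_requests=10000):
    """Collect queries between start and end, paging by offset, newest first."""
    results = []
    offset = 0
    for _ in range(max_requests):  # hard cap so a bad response cannot loop forever
        page = fetch_page({"from": start, "to": end, "limit": limit, "offset": offset})
        queries = page.get("queries", [])
        if queries:
            # More may follow in this window: advance the offset and ask again.
            results.extend(queries)
            offset += len(queries)
            continue
        # Zero queries: look for a scan-limit warning pointing at the next partition.
        match = None
        for warning in page.get("warnings", []):
            match = SCAN_LIMIT.search(warning)
            if match:
                break
        if match is None or match.group(1) == end:
            break  # no warning, or the timestamp is not moving: stop
        end = match.group(1)  # warning time becomes the new "to"
        offset = 0
    return results
```

In practice fetch_page would wrap whatever HTTP client is already in use (curl via a subprocess, or an HTTP library); keeping it as a parameter makes the paging logic easy to try out against canned responses first.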
I ran out of time today to write this out... play a bit with limit and offset and see if it makes sense.
Let us know your progress.
Created 07-10-2018 09:28 AM
Here is what I observed:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10&limit=1000
Returns 1000 queries and ends with warning timestamp 2018-07-10T01:16:17.434Z.
Using this timestamp in the "to" clause of the next query goes back in time:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10T01:16:17.434Z&limit=1000
Returns 1000 queries and ends with warning timestamp 2018-07-09T21:26:17.434Z.
Using the timestamp from the first attempt's warning in the "from" clause alone moves ahead:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-10T01:16:17.434Z&limit=1000
Returns 1000 queries and ends with warning timestamp 2018-07-10T13:56:17.434Z.
Using the timestamp from that warning in the "from" clause again, the warning just stays at the same timestamp:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-10T13:56:17.434Z&limit=1000
Returns 1000 queries and ends with warning timestamp 2018-07-10T13:56:17.434Z.
Using the same timestamp in the "to" clause instead, the response now comes back with an empty warning timestamp. I assume this is the most recent output?
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?to=2018-07-10T13:56:17.434Z&limit=1000
Returns 1000 queries and ends with warning timestamp [].
Does the above match your assumption?
Created 07-10-2018 10:51 AM
More or less, I think we are on the same page. One thing to keep in mind, too, is the offset, so that you can make sure you are seeing all the results in the time period.
For example:
Return the first 1000 queries starting from most recent:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10&limit=1000
Return the next 1000 queries:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10&limit=1000&offset=1000
Return the next 1000:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10&limit=1000&offset=2000
(Keep doing this until you get 0 results)
If you get 0 results AND you also have a warning, that means you have another partition to traverse.
In that case, you would use the date/time in the warning to populate the "to" parameter in the next query (assuming the warnings show the date time 2018-07-10T01:16:17.434Z):
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10T01:16:17.434Z&limit=1000
If the number of queries is equal to the limit, increment the offset to return the next 1000:
https://hostname:7183/api/v17/clusters/cluster/services/impala/impalaQueries?from=2018-07-09T12:59:32.776Z&to=2018-07-10T01:16:17.434Z&limit=1000&offset=1000
and repeat until you get 0 queries returned.
If you get another warning with a date/time, replace the "to" parameter value with it and repeat.
If you get 0 results and 0 warnings, there are no more queries to retrieve.
NOTE: While you are doing all these queries, running queries may complete, so it is a good idea to specify an initial "to" date that is a little bit in the past if you want consistent results.
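One way to pick an initial "to" that is a little in the past is sketched below in Python. The helper name safe_end_time and the five-minute lag are assumptions chosen for illustration, not values taken from CM; the output uses the same ISO 8601 shape as the timestamps in the warnings.

```python
from datetime import datetime, timedelta, timezone


def safe_end_time(now=None, lag=timedelta(minutes=5)):
    """Return an ISO 8601 UTC timestamp 'lag' before now, e.g. for the "to" parameter."""
    if now is None:
        now = datetime.now(timezone.utc)
    end = now.astimezone(timezone.utc) - lag
    # isoformat() renders UTC as "+00:00"; swap in the "Z" suffix used in the warnings.
    return end.isoformat(timespec="milliseconds").replace("+00:00", "Z")
```

Computing this value once and reusing it for every request in the run keeps the window fixed while offsets are being paged through.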