10-07-2016 08:10 AM - edited 10-07-2016 08:13 AM
How can I generate a report of the queries that have been executed on Impala?
In Cloudera Manager, under Impala > Queries > Export, we can see only one day, and on the web page CM shows the following message:
"More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching"
Basically, I need a file, such as a CSV, with the users and the queries executed on Impala over the last month (or more).
Thanks for your support
11-07-2016 11:41 AM
My problem was that I couldn't get more than 100 records for export due to the CM webpage limitation with the same message you showed: "More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching".
Below is the answer I got from Cloudera on how to get more rows, over a longer period, using the API.
The way I handled the output is a bit customized, but it was quick for me because I'm an Oracle admin. I created an external table in an Oracle database on top of the output file and then extracted the SQL text, or any other information I needed, from there.
Here is the answer from Cloudera:
Here are some instructions for extracting Impala queries out of CM:
1. Click the Support dropdown in the top right.
2. Click API Documentation.
3. Find the endpoint "impalaQueries" and click its link.
Use the endpoint address it specifies in the docs, add the filter information to the URL, and set the limit parameter to a value greater than 100 (the default).
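As a rough illustration of the URL described above, here is a minimal Python sketch that assembles an impalaQueries request with a larger limit. The host, port, API version (v11), cluster name, and service name are all placeholders I made up for the example; take the real endpoint address from the API documentation page in your own CM.

```python
from urllib.parse import quote, urlencode


def impala_queries_url(base, cluster, service, start, end,
                       limit=1000, offset=0, query_filter="",
                       api_version="v11"):
    """Build an impalaQueries URL. All argument values are placeholders."""
    path = "{}/api/{}/clusters/{}/services/{}/impalaQueries".format(
        base, api_version, quote(cluster), quote(service))
    params = {"from": start, "to": end, "limit": limit, "offset": offset}
    if query_filter:
        # Same filter syntax as the search box on the CM queries page.
        params["filter"] = query_filter
    return path + "?" + urlencode(params)


# Hypothetical host, cluster, and service names.
url = impala_queries_url("http://cm.example.com:7180", "Cluster 1", "impala",
                         "2016-09-01T00:00:00", "2016-10-01T00:00:00",
                         limit=200)
print(url)
```

The resulting URL can be fetched with any HTTP client using your CM credentials.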
If it takes a very long time to dump this data you can also page through it using the 'offset' parameter, or you could use 'from' and 'to' with a small sliding window to grab for the time period you're interested in.
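The two approaches mentioned above (paging with 'offset', and a small sliding 'from'/'to' window) can be combined. The following Python sketch shows one way to do it; `fetch_page` is a hypothetical callable that you would implement yourself to call the impalaQueries endpoint and return the list of query records for one page.

```python
from datetime import datetime, timedelta


def sliding_windows(start, end, step):
    """Yield (window_start, window_end) pairs covering start..end."""
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        yield cursor, nxt
        cursor = nxt


def fetch_all(fetch_page, start, end, step=timedelta(hours=6), limit=1000):
    """Collect records window by window, paging with offset inside each one.

    fetch_page(frm, to, limit, offset) is a hypothetical function that
    returns one page of query records as a list.
    """
    results = []
    for frm, to in sliding_windows(start, end, step):
        offset = 0
        while True:
            page = fetch_page(frm, to, limit, offset)
            results.extend(page)
            if len(page) < limit:  # a short page means this window is done
                break
            offset += limit
    return results
```

A query that falls exactly on a window boundary might be returned twice, so it may be worth de-duplicating on the query ID afterwards.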
You can use the URL that CM directs you to when doing your export and simply add "&limit=200" to the end of it to get more than 100.
It worked for me using:
## Parsing the output using Oracle external table:
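drop table admin_task.impala;
create table admin_task.impala (
  column_text varchar2(4000)  -- one line of the API output per row; the width is an assumption
)
organization external (
  type oracle_loader
  default directory data_pump_dir
  access parameters (
    records delimited by newline
    fields terminated by '#'
    missing field values are null
  )
  location ('impala_queries.txt')  -- placeholder; the original file name is not shown
);
alter table admin_task.impala reject limit unlimited;
create table admin_task.impala_text (
  sql_text varchar2(4000)  -- column name and width are assumptions
);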
create or replace procedure admin_task.generate_impala_sql as
  cursor c is
    select * from admin_task.impala;
  v_count1 pls_integer;    -- position of the first SELECT in the line
  v_count2 pls_integer;    -- length up to the closing '",' of the JSON value
  v_query  varchar2(4000); -- extracted statement; the width is an assumption
begin
  for r in c loop
    if r.column_text like '%"statement"%' then
      v_count1 := instr(upper(r.column_text),'SELECT', 1, 1);
      v_count2 := instr(r.column_text,'",', 1, 1) - v_count1;
      v_query := substr(r.column_text, v_count1, v_count2) || ';' || chr(13) || chr(10);
      insert into admin_task.impala_text values(v_query);
      dbms_output.put_line(r.column_text || ' / ' || v_count1 || ' / ' || v_count2 || ' / ' || v_query);
    end if;
  end loop;
end;
/
show errors procedure admin_task.generate_impala_sql;
set serveroutput on
set serveroutput on size 5000000
set line 10000
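For anyone without an Oracle database at hand, the same extraction could be done directly on the API output. This is only a sketch, and it assumes the response is a JSON document with a "queries" array whose entries carry "user", "startTime" and "statement" fields; check the actual field names against the API documentation in your CM.

```python
import csv
import io
import json


def queries_to_csv(api_json_text):
    """Flatten an impalaQueries JSON response into CSV text.

    The "queries", "user", "startTime" and "statement" keys are assumed
    field names; adjust them to match your API output.
    """
    payload = json.loads(api_json_text)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["user", "startTime", "statement"])
    for q in payload.get("queries", []):
        writer.writerow([q.get("user", ""),
                         q.get("startTime", ""),
                         q.get("statement", "")])
    return buf.getvalue()


# Made-up sample record, for illustration only.
sample = ('{"queries": [{"user": "alice", '
          '"startTime": "2016-09-01T08:00:00.000Z", '
          '"statement": "select 1, 2"}]}')
print(queries_to_csv(sample))
```

Unlike the instr/substr approach, this keeps every statement type, not only lines containing SELECT, and the csv module takes care of quoting commas inside the SQL text.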