Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How can i generate some report with the queries that has been executed on impala?

avatar
Explorer


Hi ,

How can i generate some report with the queries that has been executed on impala?

 

On Cloudera Manager, Impala---queries--export we can see only one day, and on the web page CM drops the following message:

"More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching"

 

Basically , i need some file like csv, with the users and queries executed on impala from one month ago (or more)

 

Thanks for your support

Esteban

1 ACCEPTED SOLUTION

avatar
Contributor

Hello Esteban,

 

My problem was that I couldn't get more than 100 records for export due to the CM webpage limitation with the same message you showed: "More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching".

 

Below is an answer I got from Cloudera how to get more rows and for a bigger period using API.

The way I managed with it after having the output is a bit customized but pretty quick for me as I'm an Oracle admin. I created an external table in Oracle some database with the output file and then just took out the sql text from there or any other info I needed.

Here is the answer from Cloudera:

<<
Here is a little bit instruction on extracting Impala out of CM:

1. click support dropdown in top right
2. click api documentation
3. find the endpoint "impalaQueries" click this link

Use the endpoint address it specifics in the docs, add the filter information to the URL and set the limit parameter to a value greater than 100 (the default).

If it takes a very long time to dump this data you can also page through it using the 'offset' parameter, or you could use 'from' and 'to' with a small sliding window to grab for the time period you're interested in.

You can use the URL that CM directs you to when doing your export and simple add a "&limit=200" to the end of it to get more than 100
>>

It worked for me using:

...api/v7/clusters/cluster/services/impala/impalaQueries?from=2016-11-07T18%3A50%3A21.758Z&to=2016-11-07T19%3A20%3A21.758Z&limit=200&filter=


## Parsing the output using Oracle external table:
##################################################
drop table admin_task.impala;

create table admin_task.impala
(
column_text varchar2(4000)
)
organization external
(
type oracle_loader
default directory data_pump_dir
access parameters (
records delimited by newline
fields terminated by '#'
missing field values are null
)
location ('impala.txt')
);

alter table admin_task.impala reject limit unlimited;

create table admin_task.impala_text
(text varchar2(4000));

 

create or replace procedure admin_task.generate_impala_sql as

cursor c is
select * from admin_task.impala;

v_count1 number(10);
v_count2 number(10);
v_query varchar2(4000);

begin
for r in c loop
if r.column_text like '%"statement"%' then
v_count1 := instr(upper(r.column_text),'SELECT', 1, 1);
v_count2 := instr(r.column_text,'",', 1, 1) - v_count1;
v_query := substr(r.column_text, v_count1, v_count2) || ';' || chr(13) || chr(10);
insert into admin_task.impala_text values(v_query);
dbms_output.put_line(r.column_text || ' / ' || v_count1 || ' / ' || v_count2 || ' / ' || v_query);
-- dbms_output.put_line(v_query);
end if;
end loop;
commit;
end;
/

show errors procedure admin_task.generate_impala_sql;

set serveroutput on
set serveroutput on size 5000000
set line 10000
exec admin_task.generate_impala_sql;

 

Regards,

nicusan

View solution in original post

1 REPLY 1

avatar
Contributor

Hello Esteban,

 

My problem was that I couldn't get more than 100 records for export due to the CM webpage limitation with the same message you showed: "More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching".

 

Below is an answer I got from Cloudera how to get more rows and for a bigger period using API.

The way I managed with it after having the output is a bit customized but pretty quick for me as I'm an Oracle admin. I created an external table in Oracle some database with the output file and then just took out the sql text from there or any other info I needed.

Here is the answer from Cloudera:

<<
Here is a little bit instruction on extracting Impala out of CM:

1. click support dropdown in top right
2. click api documentation
3. find the endpoint "impalaQueries" click this link

Use the endpoint address it specifics in the docs, add the filter information to the URL and set the limit parameter to a value greater than 100 (the default).

If it takes a very long time to dump this data you can also page through it using the 'offset' parameter, or you could use 'from' and 'to' with a small sliding window to grab for the time period you're interested in.

You can use the URL that CM directs you to when doing your export and simple add a "&limit=200" to the end of it to get more than 100
>>

It worked for me using:

...api/v7/clusters/cluster/services/impala/impalaQueries?from=2016-11-07T18%3A50%3A21.758Z&to=2016-11-07T19%3A20%3A21.758Z&limit=200&filter=


## Parsing the output using Oracle external table:
##################################################
drop table admin_task.impala;

create table admin_task.impala
(
column_text varchar2(4000)
)
organization external
(
type oracle_loader
default directory data_pump_dir
access parameters (
records delimited by newline
fields terminated by '#'
missing field values are null
)
location ('impala.txt')
);

alter table admin_task.impala reject limit unlimited;

create table admin_task.impala_text
(text varchar2(4000));

 

create or replace procedure admin_task.generate_impala_sql as

cursor c is
select * from admin_task.impala;

v_count1 number(10);
v_count2 number(10);
v_query varchar2(4000);

begin
for r in c loop
if r.column_text like '%"statement"%' then
v_count1 := instr(upper(r.column_text),'SELECT', 1, 1);
v_count2 := instr(r.column_text,'",', 1, 1) - v_count1;
v_query := substr(r.column_text, v_count1, v_count2) || ';' || chr(13) || chr(10);
insert into admin_task.impala_text values(v_query);
dbms_output.put_line(r.column_text || ' / ' || v_count1 || ' / ' || v_count2 || ' / ' || v_query);
-- dbms_output.put_line(v_query);
end if;
end loop;
commit;
end;
/

show errors procedure admin_task.generate_impala_sql;

set serveroutput on
set serveroutput on size 5000000
set line 10000
exec admin_task.generate_impala_sql;

 

Regards,

nicusan