Is Impala a processing engine when I use Kudu?

Explorer

I am using Impala with Kudu tables, and I don't know whether Impala is just an interface for viewing those tables in Hue or the shell, or whether it works as the processing engine when I launch a query (SELECT, UPDATE). Is Impala being used when I launch a query? Does Impala+Kudu allow UDFs, or do they only work for Impala without Kudu storage?

Thanks in advance.

1 ACCEPTED SOLUTION

Kudu can evaluate simple filters natively, e.g. using the primary key index of a table, so Impala pushes such filters directly to Kudu.

More complex filters (e.g. those involving UDFs) are evaluated by Impala after it receives the rows from Kudu.

Impala clearly distinguishes the filters evaluated by Kudu and those by Impala in the explain plan.
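
To make the split concrete, here is a minimal Python sketch of the idea. It is a toy model, not Impala or Kudu code: `SimplePredicate`, `storage_scan`, `run_query`, `validate_card` and the sample rows are all hypothetical names invented for illustration. Simple "column op constant" filters are handed to a stand-in storage scan, and anything else (here a UDF-style function) is applied afterwards by the engine. In a real plan you would expect to see the same split on the Kudu scan node, as far as I know under separate labels for Kudu predicates and Impala predicates, though the exact output depends on the version.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union

Row = Dict[str, Any]

# Comparisons the toy storage layer can evaluate itself (an assumption of this sketch).
SIMPLE_OPS = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
              ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class SimplePredicate:
    """A 'column op constant' filter, simple enough to push down to storage."""
    column: str
    op: str
    value: Any

    def matches(self, row: Row) -> bool:
        return SIMPLE_OPS[self.op](row[self.column], self.value)


Filter = Union[SimplePredicate, Callable[[Row], bool]]


def storage_scan(rows: List[Row], pushed: List[SimplePredicate]) -> List[Row]:
    """Stand-in for the storage layer: return only rows passing every pushed filter."""
    return [r for r in rows if all(p.matches(r) for p in pushed)]


def run_query(rows: List[Row], filters: List[Filter]) -> Tuple[List[Row], int]:
    """Split the filters, scan with the pushed ones, apply the rest in the engine.

    Returns the final rows and the number of rows the storage layer handed over.
    """
    pushed = [f for f in filters if isinstance(f, SimplePredicate)]
    residual = [f for f in filters if not isinstance(f, SimplePredicate)]
    scanned = storage_scan(rows, pushed)
    result = [r for r in scanned if all(f(r) for f in residual)]
    return result, len(scanned)


def validate_card(row: Row) -> bool:
    """Hypothetical UDF: Luhn checksum on the 'card' column."""
    total = 0
    for i, ch in enumerate(reversed(row["card"])):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# Made-up sample data; row 2 carries a card number that fails the checksum.
ROWS: List[Row] = [
    {"id": 1, "card": "79927398713"},
    {"id": 2, "card": "79927398710"},
    {"id": 3, "card": "79927398713"},
    {"id": 4, "card": "79927398713"},
]

if __name__ == "__main__":
    result, scanned = run_query(ROWS, [SimplePredicate("id", ">=", 2), validate_card])
    print(scanned, [r["id"] for r in result])
```

With the sample rows, the `id >= 2` filter is applied inside the scan, so only three rows reach the engine, and the UDF-style filter then drops row 2, leaving ids 3 and 4. The point of the split is that the more filters the storage layer can apply, the fewer rows have to be shipped to the engine.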


4 REPLIES

Master Collaborator

Hi @PedroGaVal

In effect, Impala is a query engine: you pass your queries through it to interrogate the data stored in HDFS or Kudu.
And when you use Kudu you don't need UDFs, because Impala with Kudu supports the UPDATE/DELETE statements.

Explorer

OK, thanks @AcharkiMed. I understand that Impala not only shows the Kudu tables (as external tables) but also processes the data. If you create a UDF (e.g. 'validateCard' as an Impala function), I guess you can use it, so Kudu is just storage and does not process anything. And if data is stored in Kudu format, it does not use HDFS. Am I right?

Master Collaborator

You are welcome, @PedroGaVal.
Yes, you are absolutely right, man.
