New Contributor
Posts: 4
Registered: ‎12-02-2017
Accepted Solution

Impala UDF C++: if an error occurs, is CLOSE_FN always called?

I have the following UDF:

CREATE FUNCTION myudf(string)
RETURNS string
LOCATION '/user/cloudera/myudflib.so'
SYMBOL='Process'
PREPARE_FN='PrepareLibrariesAndDataStructures'
CLOSE_FN='CloseLibrariesAndCleanupDataStructures';

As you can see, each Impala thread running my C++ UDF needs to initialize some libraries and data structures with the PrepareLibrariesAndDataStructures function BEFORE the Process function starts being called multiple times.

On the other hand, CloseLibrariesAndCleanupDataStructures must always be called once the corresponding Impala thread has no more Process calls to make, in order to free the data structures and clean up the libraries.

To avoid memory leaks, does Cloudera Impala guarantee that CLOSE_FN will still be called when either the user cancels the query or the Process function fails with SetError()?

In other words, can we trust Cloudera Impala to always call CLOSE_FN whenever the corresponding PREPARE_FN has been called? Or must we put the library and data structure initialization and cleanup directly in the SYMBOL function Process to minimize the risk of memory leaks?

Thank you very much!

Cloudera Employee
Posts: 278
Registered: ‎07-29-2015

Re: Impala UDF C++: if an error occurs, is CLOSE_FN always called?

Yes, Close() will be called if the query fails.

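For illustration, here is a minimal sketch of a prepare/process/close trio whose close step is safe to run after an error, and safe to run more than once. It does not use the real Impala headers: FakeContext is a hypothetical stand-in for Impala's FunctionContext, UdfState and g_live_states are invented for the example, and the function signatures are simplified, so treat it as a pattern rather than as the actual UDF API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Impala's FunctionContext (assumption, not the real class):
// it only carries a state pointer and an error flag.
struct FakeContext {
  void* state = nullptr;
  bool has_error = false;
  std::string error;
};

// Hypothetical per-thread data that the UDF needs between calls.
struct UdfState {
  std::string scratch;
};

// Demonstration only: number of UdfState objects currently allocated.
int g_live_states = 0;

// Prepare step: allocate the state once, before any Process call.
void PrepareLibrariesAndDataStructures(FakeContext* ctx) {
  ctx->state = new UdfState();
  ++g_live_states;
}

// Per-row step: reports an error through the context instead of cleaning up itself.
std::string Process(FakeContext* ctx, const std::string& input) {
  assert(ctx->state != nullptr);
  if (input.empty()) {
    ctx->has_error = true;
    ctx->error = "empty input";
    return "";
  }
  auto* state = static_cast<UdfState*>(ctx->state);
  state->scratch = input;
  return state->scratch;
}

// Close step: frees the state whether or not an error was reported,
// and does nothing if the state is already gone.
void CloseLibrariesAndCleanupDataStructures(FakeContext* ctx) {
  auto* state = static_cast<UdfState*>(ctx->state);
  if (state == nullptr) return;
  delete state;
  ctx->state = nullptr;
  --g_live_states;
}
```

Keeping all cleanup in the close step, and making it tolerate a missing state, means the per-row function never has to decide whether it is the last call.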
Keep in mind that the same UDF can be in use from multiple threads at the same time, so any cleanup logic needs to be thread-safe and not clean up things that other threads might be using.
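One possible way to keep cleanup from pulling resources out from under other threads is a mutex-protected reference count. This is a sketch under assumptions: SharedLibrary, AcquireSharedLibrary, ReleaseSharedLibrary and SharedLibraryIsInitialized are hypothetical names, and the idea is that the PREPARE_FN would call the acquire function and the CLOSE_FN the release function.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a library that is initialized once per process
// and must only be torn down when no thread is still using it.
struct SharedLibrary {
  bool initialized = false;
};

namespace {
std::mutex g_lock;        // guards g_users and g_library
int g_users = 0;          // acquire calls not yet matched by a release
SharedLibrary g_library;
}  // namespace

// Intended to be called from the prepare function.
void AcquireSharedLibrary() {
  std::lock_guard<std::mutex> guard(g_lock);
  if (g_users++ == 0) {
    g_library.initialized = true;  // first user performs the initialization
  }
}

// Intended to be called from the close function.
void ReleaseSharedLibrary() {
  std::lock_guard<std::mutex> guard(g_lock);
  assert(g_users > 0);
  if (--g_users == 0) {
    g_library.initialized = false;  // last user performs the teardown
  }
}

// Helper for inspection: reports whether the library is currently initialized.
bool SharedLibraryIsInitialized() {
  std::lock_guard<std::mutex> guard(g_lock);
  return g_library.initialized;
}
```

With this arrangement a close call from one thread only tears the library down when it is the last remaining user; state that belongs to a single thread can still be freed unconditionally in that thread's own close call.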

New Contributor
Posts: 4
Registered: ‎12-02-2017

Re: Impala UDF C++: if an error occurs, is CLOSE_FN always called?

Thanks again!