Impala C++ UDF: if an error occurs, is CLOSE_FN always called?

I have the following UDF:

CREATE FUNCTION myudf(string)
RETURNS string
LOCATION '/user/cloudera/myudflib.so'
SYMBOL='Process'
PREPARE_FN='PrepareLibrariesAndDataStructures'
CLOSE_FN='CloseLibrariesAndCleanupDataStructures';

As you can see, my C++ UDF needs each Impala thread to initialize some libraries and data structures with the PrepareLibrariesAndDataStructures function before the Process function starts being called multiple times.

On the other hand, CloseLibrariesAndCleanupDataStructures must always be called once the corresponding Impala thread has no more Process calls to make, in order to free the data structures and clean up the libraries.

To avoid memory leaks, does Cloudera Impala guarantee that CLOSE_FN is still called when either the user cancels the query or the Process function fails with SetError()?

In other words, can we trust Cloudera Impala to always call CLOSE_FN whenever the corresponding PREPARE_FN has been called? Or must we put the initialization and cleanup of the data structures and libraries directly in the Process function (the SYMBOL) to minimize the risk of memory leaks?

Thank you very much!

1 ACCEPTED SOLUTION

Yes, Close() will be called if the query fails.

 

Keep in mind that the same UDF can be in use from multiple threads at the same time, so any cleanup logic needs to be thread-safe and not clean up things that other threads might be using.
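
To illustrate the thread-safety point, here is a minimal sketch of one common pattern: a mutex-guarded reference count, so that only the last thread to finish actually tears the shared state down. This is plain standard C++ and not Impala's API. All names in it (SharedLibraries, AcquireLibraries, ReleaseLibraries, SimulateThreads) are hypothetical, and it assumes the libraries are shared process-wide rather than stored per thread.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the libraries and data structures the UDF needs.
struct SharedLibraries {
  bool ready = false;     // true while the shared state is usable
  int init_count = 0;     // how many times setup ran
  int cleanup_count = 0;  // how many times teardown ran
};

namespace {
std::mutex g_mutex;      // guards g_users and g_libs
int g_users = 0;         // prepare calls not yet matched by a close
SharedLibraries g_libs;  // state shared by all threads
}  // namespace

// Intended to be called from the prepare function:
// only the first concurrent user performs the setup.
void AcquireLibraries() {
  std::lock_guard<std::mutex> lock(g_mutex);
  if (g_users++ == 0) {
    g_libs.ready = true;
    ++g_libs.init_count;
  }
}

// Intended to be called from the close function:
// only the last remaining user performs the teardown.
void ReleaseLibraries() {
  std::lock_guard<std::mutex> lock(g_mutex);
  if (--g_users == 0) {
    g_libs.ready = false;
    ++g_libs.cleanup_count;
  }
}

// Simulates n threads that each prepare, use the shared state, and close.
// Returns true if the state was ready every time a thread used it.
bool SimulateThreads(int n) {
  std::vector<char> ok(n, 0);  // one slot per thread, no shared writes
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([i, &ok] {
      AcquireLibraries();
      {
        std::lock_guard<std::mutex> lock(g_mutex);
        ok[i] = g_libs.ready ? 1 : 0;
      }
      ReleaseLibraries();
    });
  }
  for (auto& t : threads) t.join();
  for (char c : ok) {
    if (!c) return false;
  }
  return true;
}
```

In a real UDF, the acquire and release calls would go in the functions registered as PREPARE_FN and CLOSE_FN. If the state does not need to be shared at all, giving each thread its own copy avoids the problem entirely.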

Thanks again!