12-02-2017 11:18 AM - edited 12-02-2017 11:29 AM
I have the following UDF:
CREATE FUNCTION myudf(string)
As you can see, my C++ UDF needs each Impala thread to initialize some libraries and data structures with the PrepareLibrariesAndDataStructures function before the Process function starts being called multiple times.
Likewise, CloseLibrariesAndCleanupDataStructures must always be called once the corresponding Impala thread has no more Process calls to make, in order to free the data structures and clean up the libraries.
To avoid memory leaks, does Cloudera Impala guarantee that the CLOSE_FN will still be called when either the user cancels the query or the Process function fails with setError()?
In other words, can we trust Cloudera Impala to always call the CLOSE_FN whenever the corresponding PREPARE_FN has been called? Or must we put the data structure and library initialization and cleanup directly in the SYMBOL Process function to minimize the risk of memory leaks?
Thank you very much!
12-04-2017 05:02 PM
Yes, Close() will be called if the query fails.
Keep in mind that the same UDF can be in use from multiple threads at the same time, so any cleanup logic needs to be thread-safe and not clean up things that other threads might be using.
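To illustrate the thread-safety point, here is a minimal sketch of one way to guard shared setup and cleanup with a mutex and a user count. It uses only standard C++ and does not touch the Impala UDF API. SharedLibraryState, AcquireShared and ReleaseShared are hypothetical names invented for this example. The assumption is that your PREPARE_FN would call AcquireShared and your CLOSE_FN would call ReleaseShared.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical state shared by every thread that runs the UDF.
struct SharedLibraryState {
    std::mutex mu;            // guards all fields below
    int users = 0;            // acquire calls that have not been released yet
    bool initialized = false; // true while the shared resources are live
    int init_count = 0;       // how many times initialization actually ran
    int cleanup_count = 0;    // how many times cleanup actually ran
};

// Assumed to be called from the prepare function: only the first user initializes.
void AcquireShared(SharedLibraryState& s) {
    std::lock_guard<std::mutex> lock(s.mu);
    if (s.users++ == 0) {
        // Real code would initialize libraries and data structures here.
        s.initialized = true;
        ++s.init_count;
    }
}

// Assumed to be called from the close function: only the last user cleans up.
void ReleaseShared(SharedLibraryState& s) {
    std::lock_guard<std::mutex> lock(s.mu);
    assert(s.users > 0 && "release without a matching acquire");
    if (--s.users == 0) {
        // Real code would free data structures and shut down libraries here.
        s.initialized = false;
        ++s.cleanup_count;
    }
}
```

With two users active at once, the first ReleaseShared call leaves the shared resources alone, and only the second one tears them down. State that belongs to a single thread does not need this and can simply be freed in that thread's own close call.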