Impala UDF C++ thread-safety

New Contributor

The Impala user's guide states "UDFs are parallelized using multiple threads". Does this mean that a UDF's functions must be thread-safe? Or will only one thread be calling the UDF's functions at any given time?

I'm writing a UDF aggregate function that maintains state across function calls, so I need to know how to write it to work with Impala.

Any insight into how UDFs behave in a multi-threaded environment is greatly appreciated.

I'm using version 5.8 of the Impala UDF development kit to avoid the linking problem with std C++11 and noexcept (which is still present in version 5.10).

 

Thanks!

Re: Impala UDF C++ thread-safety

Master Collaborator

You should assume that the functions can be called from multiple threads concurrently.

The UDF interface (see the udf.h header installed by the impala-udf-dev package) provides the GetFunctionState() and SetFunctionState() methods on FunctionContext. If you use those with the THREAD_LOCAL scope, you can save state per thread.
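
As a rough illustration of that pattern, here is a minimal sketch. The small FunctionContext class in it is only a stand-in that mirrors the names from udf.h so the snippet compiles without the dev kit; with the real kit you would include the header instead. CallCounter, CounterPrepare, CountCall and CounterClose are hypothetical names made up for this example, and the plain int64_t return type is a simplification of the kit's value types.

```cpp
#include <cstdint>

// Stand-in for the parts of impala_udf::FunctionContext used below, so the
// sketch builds without the dev kit. With the real kit, remove this namespace
// and include the udf.h header from impala-udf-dev instead.
namespace impala_udf {
class FunctionContext {
 public:
  enum FunctionStateScope { FRAGMENT_LOCAL, THREAD_LOCAL };
  void SetFunctionState(FunctionStateScope scope, void* ptr) { state_[scope] = ptr; }
  void* GetFunctionState(FunctionStateScope scope) const { return state_[scope]; }

 private:
  void* state_[2] = {nullptr, nullptr};
};
}  // namespace impala_udf

using impala_udf::FunctionContext;

// Hypothetical state that must not be shared between threads.
struct CallCounter {
  int64_t calls = 0;
};

// Prepare function: allocate one CallCounter when called for THREAD_LOCAL scope.
void CounterPrepare(FunctionContext* ctx, FunctionContext::FunctionStateScope scope) {
  if (scope == FunctionContext::THREAD_LOCAL) {
    ctx->SetFunctionState(scope, new CallCounter());
  }
}

// The function body: it only touches the state stored on its own context,
// so no locking is needed as long as each thread has its own context.
int64_t CountCall(FunctionContext* ctx) {
  auto* counter = static_cast<CallCounter*>(
      ctx->GetFunctionState(FunctionContext::THREAD_LOCAL));
  return ++counter->calls;
}

// Close function: free the state and clear the pointer.
void CounterClose(FunctionContext* ctx, FunctionContext::FunctionStateScope scope) {
  if (scope == FunctionContext::THREAD_LOCAL) {
    delete static_cast<CallCounter*>(ctx->GetFunctionState(scope));
    ctx->SetFunctionState(scope, nullptr);
  }
}
```

The sketch assumes each thread works with its own FunctionContext, which is what makes THREAD_LOCAL state safe to use without locks. Anything stored with FRAGMENT_LOCAL scope, or in globals, would still need its own synchronization.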
