
Impala UDF: when fetching a NULL record, StringVal's "ptr" behaves like a dangling pointer


When a NULL record is fetched, StringVal's "ptr" pointer behaves like a dangling pointer.


I created a table employee with one column: name STRING.
I inserted a few records. The details are below:

 


[quickstart.cloudera:21000] > desc employee;
Query: describe employee
+------+--------+---------+
| name | type   | comment |
+------+--------+---------+
| name | string |         |
+------+--------+---------+
Fetched 1 row(s) in 0.03s
[quickstart.cloudera:21000] > select * from employee order by name asc;
Query: select * from employee order by name asc
+--------+
| name   |
+--------+
|        |
| Dan    |
| Jack   |
| Jan    |
| Magnus |
| Sam    |
| NULL   |
| NULL   |
| NULL   |
+--------+
Fetched 9 row(s) in 0.69s
[quickstart.cloudera:21000] >


I created a custom UDF, stringnull, that accepts a StringVal as input and returns a StringVal as output.
It checks for NULL using both StringVal's is_null field and its ptr pointer.
If is_null is true and ptr is NULL, the UDF returns NULL; otherwise it returns the input string.


UDF Definition:
------------------------------------------------------------------------------------------------------------------

#define MAX_STRING_SIZE  256

StringVal stringnull(
    FunctionContext* context,
    StringVal& sInput
)
{
    char* pbReturnData = (char *) context->Allocate( MAX_STRING_SIZE );
    memset( pbReturnData, NULL, MAX_STRING_SIZE );

    if( NULL == sInput.ptr && sInput.is_null == 1 )
    {
        StringVal sResult( ( const char* )pbReturnData );
        sResult.is_null = 1;
        context->Free( (char *)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
    else
    {
        StringVal sResult( ( const char* )pbReturnData );
        P_strncpy( ( char*)pbReturnData, MAX_STRING_SIZE, ( const char* )sInput.ptr, sInput.len);
        sResult.is_null = 0;
        context->Free( (char *)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
}

 

DDL for custom UDF:-
CREATE FUNCTION stringnull(STRING) RETURNS STRING LOCATION '/opt/test.so' SYMBOL = 'stringnull';

---------------------------------------------------------------------------------------------------------------------------
Records returned when the select is run with the custom UDF:

 

[quickstart.cloudera:21000] > select stringnull(name) from employee order by name asc;
Query: select stringnull(name) from employee order by name asc
+--------------------------+
| default.stringnull(name) |
+--------------------------+
|                          |
| Dan                      |
| Jack                     |
| Jan                      |
| Magnus                   |
| Sam                      |
| Sam                      | ## Expected NULL, but the UDF returned Sam.
| Sam                      | ## Expected NULL, but the UDF returned Sam.
| Sam                      | ## Expected NULL, but the UDF returned Sam.
+--------------------------+
Fetched 9 row(s) in 0.53s

 

1 ACCEPTED SOLUTION


Hi RPAT,

  The values of .ptr and .len are invalid if .is_null is true. For a null string value, in some cases Impala just sets the is_null field and doesn't overwrite the ptr and len fields, so they can still hold data from an earlier row.

 

You should rewrite the condition as:

 

  if (sInput.is_null) {
    ...
  } else {
    ...
  }

This isn't explicitly documented so we should improve that: https://issues.cloudera.org/browse/IMPALA-4711
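
To make the point concrete, here is a minimal sketch of the corrected check. It does not use the real Impala headers: the StringVal struct below is a hypothetical stand-in with only the three fields discussed here, and stringnull_sketch, make_null and the caller-owned storage buffer are names invented for illustration. The only thing it is meant to show is that is_null is tested first and on its own, and that ptr and len are never read for a NULL input.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for impala_udf::StringVal so the sketch compiles
// without the Impala UDF headers. Only the three fields discussed here exist.
struct StringVal {
  bool is_null;
  int len;
  uint8_t* ptr;
};

// Builds a NULL result. ptr and len are left empty; callers must not read them.
StringVal make_null() { return StringVal{true, 0, nullptr}; }

// Sketch of the corrected check: is_null is tested first and on its own.
// `storage` is a hypothetical caller-owned buffer that outlives the result;
// a real UDF would obtain result memory from Impala instead.
StringVal stringnull_sketch(const StringVal& input, std::string& storage) {
  if (input.is_null) {
    // ptr and len may still describe a previous row: never touch them here.
    return make_null();
  }
  storage.clear();
  if (input.len > 0) {
    storage.assign(reinterpret_cast<const char*>(input.ptr), input.len);
  }
  return StringVal{false, input.len, reinterpret_cast<uint8_t*>(&storage[0])};
}

// Small self-check mimicking the situation from the question.
void self_check() {
  char stale[] = "Sam";
  std::string storage;

  // A NULL row whose ptr/len still point at the previous row's value.
  StringVal null_row{true, 3, reinterpret_cast<uint8_t*>(stale)};
  assert(stringnull_sketch(null_row, storage).is_null);

  // A regular row is copied through unchanged.
  StringVal sam{false, 3, reinterpret_cast<uint8_t*>(stale)};
  StringVal out = stringnull_sketch(sam, storage);
  assert(!out.is_null && out.len == 3 && std::memcmp(out.ptr, "Sam", 3) == 0);
}
```

Separately from the NULL check, the UDF in the question calls context->Free on the buffer and then returns a StringVal that still points at it, so the result refers to freed memory. The sketch sidesteps this with the caller-owned buffer; in a real UDF the result buffer would need to stay valid after the function returns.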


Thanks Tim, it was really helpful.