Archives of Support Questions (Read Only)

Impala UDF: when fetching a NULL record, StringVal's "ptr" behaves like a dangling pointer


When a NULL record is fetched, StringVal's "ptr" behaves like a dangling pointer.


I created a table employee with one column: name STRING.
I inserted a few records. The details are below:

 


[quickstart.cloudera:21000] > desc employee;
Query: describe employee
+------+--------+---------+
| name | type   | comment |
+------+--------+---------+
| name | string |         |
+------+--------+---------+
Fetched 1 row(s) in 0.03s
[quickstart.cloudera:21000] > select * from employee order by name asc;
Query: select * from employee order by name asc
+--------+
| name   |
+--------+
|        |
| Dan    |
| Jack   |
| Jan    |
| Magnus |
| Sam    |
| NULL   |
| NULL   |
| NULL   |
+--------+
Fetched 9 row(s) in 0.69s
[quickstart.cloudera:21000] >


I created a custom UDF, stringnull. It accepts a StringVal as input and returns a StringVal as output.
It checks for NULL using StringVal's is_null property and its ptr pointer.
If is_null is true and ptr is NULL, the UDF returns NULL; otherwise it returns the valid string.


UDF Definition:
------------------------------------------------------------------------------------------------------------------

#define MAX_STRING_SIZE 256

StringVal stringnull(
    FunctionContext* context,
    StringVal& sInput
)
{
    char* pbReturnData = (char*) context->Allocate( MAX_STRING_SIZE );
    memset( pbReturnData, NULL, MAX_STRING_SIZE );

    if( NULL == sInput.ptr && sInput.is_null == 1 )
    {
        StringVal sResult( ( const char* )pbReturnData );
        sResult.is_null = 1;
        context->Free( (char*)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
    else
    {
        StringVal sResult( ( const char* )pbReturnData );
        P_strncpy( ( char* )pbReturnData, MAX_STRING_SIZE, ( const char* )sInput.ptr, sInput.len );
        sResult.is_null = 0;
        context->Free( (char*)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
}

 

DDL for custom UDF:-
CREATE FUNCTION stringnull(STRING) RETURNS STRING LOCATION '/opt/test.so' SYMBOL = 'stringnull';

---------------------------------------------------------------------------------------------------------------------------
Rows returned when the SELECT calls the custom UDF:

 

[quickstart.cloudera:21000] > select stringnull(name) from employee order by name asc;
Query: select stringnull(name) from employee order by name asc
+--------------------------+
| default.stringnull(name) |
+--------------------------+
|                          |
| Dan                      |
| Jack                     |
| Jan                      |
| Magnus                   |
| Sam                      |
| Sam                      | ## Expected NULL, but the UDF returned Sam.
| Sam                      | ## Expected NULL, but the UDF returned Sam.
| Sam                      | ## Expected NULL, but the UDF returned Sam.
+--------------------------+
Fetched 9 row(s) in 0.53s

 

1 ACCEPTED SOLUTION


Hi RPAT,

  The values of .ptr and .len are invalid if .is_null is true. For a NULL string value, Impala in some cases sets only the is_null field and does not overwrite the ptr and len fields, so those fields may hold stale values.

 

You should rewrite the condition as:

 

  if (sInput.is_null) {
    ...
  } else {
    ...
  }

This isn't explicitly documented, so we should improve that: https://issues.cloudera.org/browse/IMPALA-4711
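
To make the rewritten condition concrete, here is a minimal sketch of the pattern. It is not Impala code: the StringVal struct below is a stand-in with only the three fields discussed in this thread, so that the example compiles without the Impala headers. MakeNull, StringNullSketch, and the std::string output buffer are hypothetical names introduced for illustration. In a real UDF the types would come from impala_udf/udf.h, and the result buffer would be obtained from Impala rather than from a std::string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for impala_udf::StringVal (assumption: only these three
// fields matter for the NULL check discussed here).
struct StringVal {
    bool is_null;
    int len;
    uint8_t* ptr;
};

// Hypothetical helper that builds a NULL result. ptr and len are zeroed
// only for tidiness; callers must not read them when is_null is true.
inline StringVal MakeNull() { return StringVal{true, 0, nullptr}; }

// Sketch of the corrected check: test is_null on its own, and read
// ptr/len only on the non-NULL branch. `out` stands in for a result
// buffer that outlives the call.
StringVal StringNullSketch(const StringVal& input, std::string& out) {
    if (input.is_null) {
        // ptr and len may hold stale values here, so they are ignored.
        return MakeNull();
    }
    assert(input.len >= 0);  // len is only meaningful when is_null is false
    out.assign(reinterpret_cast<const char*>(input.ptr), input.len);
    return StringVal{false, static_cast<int>(out.size()),
                     reinterpret_cast<uint8_t*>(&out[0])};
}
```

With this shape, an input whose is_null flag is set but whose ptr and len still describe "Sam" comes back as NULL, because the non-NULL branch is never reached.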


Thanks, Tim. It was really helpful.