Created on 12-23-2016 01:20 AM - edited 09-16-2022 03:52 AM
When fetching a NULL record, StringVal's "ptr" pointer behaves like a dangling pointer
I created a table employee with one column: name STRING.
I inserted a few records. The details are below:
[quickstart.cloudera:21000] > desc employee;
Query: describe employee
+------+--------+---------+
| name | type | comment |
+------+--------+---------+
| name | string | |
+------+--------+---------+
Fetched 1 row(s) in 0.03s
[quickstart.cloudera:21000] > select * from employee order by name asc;
Query: select * from employee order by name asc
+--------+
| name |
+--------+
| |
| Dan |
| Jack |
| Jan |
| Magnus |
| Sam |
| NULL |
| NULL |
| NULL |
+--------+
Fetched 9 row(s) in 0.69s
[quickstart.cloudera:21000] >
I created a custom UDF, stringnull. It accepts a StringVal as input and returns a StringVal as output.
It checks for NULL based on StringVal's is_null property and its ptr pointer.
If is_null is true and ptr is NULL, the UDF returns NULL; otherwise it returns the valid string.
UDF Definition:
------------------------------------------------------------------------------------------------------------------
#define MAX_STRING_SIZE 256
StringVal stringnull(
    FunctionContext* context,
    StringVal& sInput
)
{
    char* pbReturnData =(char *) context->Allocate( MAX_STRING_SIZE );
    memset( pbReturnData, NULL, MAX_STRING_SIZE );
    if( NULL == sInput.ptr && sInput.is_null == 1 )
    {
        StringVal sResult( ( const char* )pbReturnData );
        sResult.is_null = 1;
        context->Free( (char *)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
    else
    {
        StringVal sResult( ( const char* )pbReturnData );
        P_strncpy( ( char*)pbReturnData, MAX_STRING_SIZE, ( const char* )sInput.ptr, sInput.len);
        sResult.is_null = 0;
        context->Free( (char *)pbReturnData );
        sResult.len = sInput.len;
        return sResult;
    }
}
DDL for the custom UDF:
CREATE FUNCTION stringnull(STRING) RETURNS STRING LOCATION '/opt/test.so' SYMBOL = 'stringnull';
---------------------------------------------------------------------------------------------------------------------------
Records returned when the SELECT is run with the custom UDF:
[quickstart.cloudera:21000] > select stringnull(name) from employee order by name asc;
Query: select stringnull(name) from employee order by name asc
+--------------------------+
| default.stringnull(name) |
+--------------------------+
| |
| Dan |
| Jack |
| Jan |
| Magnus |
| Sam |
| Sam | ## Expected NULL, but the UDF returns Sam.
| Sam | ## Expected NULL, but the UDF returns Sam.
| Sam | ## Expected NULL, but the UDF returns Sam.
+--------------------------+
Fetched 9 row(s) in 0.53s
Created 12-23-2016 06:42 AM
Hi RPAT,
The values of .ptr and .len are invalid if .is_null is true. For a NULL string value, Impala in some cases sets only the is_null field and doesn't overwrite the ptr and len fields, so they may still hold stale values.
You should rewrite the condition as:
if (sInput.is_null) {
... } else { ... }
This isn't explicitly documented so we should improve that: https://issues.cloudera.org/browse/IMPALA-4711
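To make the suggested condition concrete, here is a minimal sketch that can be compiled without Impala's udf.h. The StringVal struct below is a hypothetical stand-in reduced to the three fields discussed in this thread, and stringnull_sketch, render, and stale_null_is_reported_as_null are illustrative names, not part of the Impala API. It only shows that deciding on is_null alone gives NULL even when ptr and len still carry an earlier row's data.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for impala_udf::StringVal (assumption: only the
// three fields discussed in this thread are modelled).
struct StringVal {
    bool is_null;
    int len;
    uint8_t* ptr;

    StringVal(uint8_t* p = nullptr, int n = 0) : is_null(false), len(n), ptr(p) {}

    static StringVal null() {
        StringVal v;
        v.is_null = true;
        return v;
    }
};

// Pass-through that decides NULL-ness from is_null alone.
// ptr and len are read only after is_null has been found false.
StringVal stringnull_sketch(const StringVal& input) {
    if (input.is_null) {
        return StringVal::null();
    }
    return StringVal(input.ptr, input.len);
}

// Illustrative helper: renders a value roughly the way a shell would.
std::string render(const StringVal& v) {
    if (v.is_null) return "NULL";
    return std::string(reinterpret_cast<const char*>(v.ptr), v.len);
}

// Simulates a NULL row whose ptr/len still hold the previous row's "Sam".
bool stale_null_is_reported_as_null() {
    uint8_t stale[] = {'S', 'a', 'm'};
    StringVal in(stale, 3);
    in.is_null = true;  // only the flag is set; ptr and len are left stale
    return render(stringnull_sketch(in)) == "NULL";
}
```

In a real UDF the same shape should apply with the types from udf.h: test sInput.is_null first, return StringVal::null() for the NULL case, and touch ptr and len only in the non-NULL branch.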
Created 12-30-2016 04:42 AM
Thanks Tim, it was really helpful.