Explorer
Posts: 9
Registered: ‎12-22-2016

Impala daemon getting crashed using UDF.


Hi,

 

Below is the UDF:-

 

StringVal my_udf(
    FunctionContext* context,
    StringVal& sInput
)
{
    sInput.ptr[sInput.len]='\0';
    if( sInput.len == 2 )
    {
        context->SetError( "Got Error " );
        return "";
    }
    StringVal iResult(sInput);
    return iResult;
}

 

DDL:-
CREATE FUNCTION my_udf(STRING) RETURNS STRING LOCATION '/opt/impala/udfs/testudf.so' SYMBOL = 'my_udf' ;

 

Below is the output:-
[node3:21000] > select my_udf('SS');
Query: select my_udf('SS')
Query submitted at: 2018-03-13 06:08:07 (Coordinator: http://node3.localdomain:25000)
Query progress can be monitored at: http://node3.localdomain:25000/query_plan?query_id=474fbb32b544b11d:945f6c1000000000
ERROR: UDF ERROR: Got Error

[node3:21000] > select my_udf('SS') from TBL1;
Query: select my_udf('SS') from TBL1
Query submitted at: 2018-03-13 06:08:19 (Coordinator: http://node3.localdomain:25000)
Query progress can be monitored at: http://node3.localdomain:25000/query_plan?query_id=c54c0178d9aa1dc4:de4696d800000000
Error communicating with impalad: TSocket read 0 bytes
[Not connected] >

 

Whenever the statement context->SetError( "Got Error " ); is executed, Impala retries the query, which is expected behavior. On the second retry, however, the Impala daemon crashes on the line sInput.ptr[sInput.len]='\0';
and /var/log/impalad/impalad.INFO shows a SIGSEGV:


# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f95bec20608, pid=13901, tid=140281109161728
#
# JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [pepimpala23_RHEL.13901.0.so+0x3a608] my_udf+0x25
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /run/cloudera-scm-agent/process/2875-impala-IMPALAD/hs_err_pid13901.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
eued=200 max_mem=-1.00 B


Note: the crash occurs only when we call this UDF (my_udf) in a query against a table and pass a static value instead of a column name.

Cloudera Employee
Posts: 332
Registered: ‎07-29-2015

Re: Impala daemon getting crashed using UDF.

The line below is invalid for two reasons:


  sInput.ptr[sInput.len]='\0';

 

1. In general, it's invalid to modify input strings to the UDF.

2. sInput.ptr[sInput.len] is one byte past the end of the string's memory, so you're overwriting some unrelated memory.
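
If the goal of that line was to get a NUL-terminated string, one way to avoid both problems is to copy the bytes into a buffer the UDF owns. The following is only a minimal sketch: `StringValLike` and `ToOwnedString` are hypothetical names, and the struct is a stand-in for Impala's `StringVal` (a pointer plus a length) so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for Impala's StringVal: a pointer plus a length,
// with no guarantee that a '\0' follows the last byte.
struct StringValLike {
    const std::uint8_t* ptr;
    int len;
};

// Returns a private, NUL-terminated copy of the input.
// The input buffer is only read, never written.
std::string ToOwnedString(const StringValLike& input) {
    assert(input.len >= 0);
    if (input.ptr == nullptr || input.len == 0) {
        return std::string();
    }
    // Copies exactly len bytes; std::string keeps its own terminator.
    return std::string(reinterpret_cast<const char*>(input.ptr),
                       static_cast<std::size_t>(input.len));
}
```

Inside a real UDF the same idea applies: take the argument as `const StringVal&`, read only the first `len` bytes, and put any result into memory obtained from the UDF API (for example the `StringVal(FunctionContext*, int)` constructor) rather than writing into the input.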

Explorer
Posts: 9
Registered: ‎12-22-2016

Re: Impala daemon getting crashed using UDF.


Thanks, Tim, for the quick reply.

 

But I am unsure why the core dump occurs only in one specific scenario.

 

Scenario 1:

        select my_udf('SS'); // Runs without a core dump.

Scenario 2:

        select my_udf(c1) from testtable; // c1 is a column name; this also runs without a core dump.

Scenario 3:

        select my_udf('AB') from testtable; // Here we pass a static value instead of a column name, and this causes a core dump.

Per your comment, a core dump should occur in all scenarios, but only scenario 3 causes one.

Cloudera Employee
Posts: 332
Registered: ‎07-29-2015

Re: Impala daemon getting crashed using UDF.


I don't see why a core dump should always occur. Sometimes you might get lucky and modify memory that is not in use. Or you might get really unlucky and modify memory in a way that doesn't cause an immediate crash but causes more subtle problems down the line.
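
To illustrate the "no immediate crash, but subtle problems" case, here is a small self-contained sketch. The layout is an assumption made purely for illustration (two 2-byte strings packed back to back in one buffer); it is not a description of how Impala actually lays out its memory, and `StringValLike`, `TerminateInPlace` and `NeighbourSurvives` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a length-delimited string value (pointer + length).
struct StringValLike {
    std::uint8_t* ptr;
    int len;
};

// Mimics the problematic statement: writes '\0' one byte past the logical end.
void TerminateInPlace(StringValLike& s) {
    assert(s.ptr != nullptr && s.len >= 0);
    s.ptr[s.len] = '\0';
}

// Packs "AB" and "CD" back to back, terminates the first string in place,
// and reports whether the second string still holds its original bytes.
bool NeighbourSurvives() {
    std::uint8_t arena[4];
    std::memcpy(arena, "ABCD", 4);        // "AB" at offset 0, "CD" at offset 2
    StringValLike first{arena, 2};
    StringValLike second{arena + 2, 2};
    TerminateInPlace(first);              // lands on arena[2]: no crash here
    return std::memcmp(second.ptr, "CD", 2) == 0;  // false: 'C' was overwritten
}
```

In this arrangement the stray write never faults, yet the neighbouring value is silently corrupted. Whether a given query crashes, corrupts data, or appears to work depends on what happens to sit after the string in that particular execution.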
