
After upgrading to CDH 5.14.2 Impala daemon stopped suddenly!


Expert Contributor

Hi,

After I successfully upgraded the cluster to the latest releases, CM 5.14.0 / CDH 5.14.2, I ran into this problem on 6 of my nodes: on the first queries, the Impala daemon suddenly stops, the query is cancelled, and I get the error messages below:

Impala-shell:

Cancelled due to unreachable impalad(s): node1.example.com:22000

ODBC:

Status: RPC Error: Client for node5.example.com:22000 hit an unexpected exception: Unknown: Interrupted system call, type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala19TTransmitDataResultE, send: done

Impala Daemon log file:

 

CancelQueryFInstances query_id= 3423055f3fda78a:a2446bea00000000 failed to connect to node2.example.com:22000 :Couldn't open transport for node2.example.com:22000 (connect() failed: Connection refused)

Statestore log file:

I0413 20:07:01.767758 64122 statestore.cc:729] Unable to send heartbeat message to subscriber impalad@node5.example.com:22000, received error: Couldn't open transport for node5.example.com:23000 (connect() failed: Connection refused)
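
All of the messages above end in "Connection refused", which is what you would expect to see once the impalad process on a node has gone away. A minimal Python sketch for confirming that from another host, assuming the ports shown in the logs above (22000 and 23000); `port_open` is a hypothetical helper name, not part of any Cloudera tooling:

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the connection cannot be opened.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_open("node5.example.com", 22000)` returning False for a node that should be running impalad would be consistent with the daemon having crashed there.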


While looking for the source of the issue, I found this crash message in the Impala Daemon logs:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x0000000000d863e5, pid=13065, tid=0x00007efc499cf700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [impalad+0x9863e5]  impala::HdfsScanNodeBase::StopAndFinalizeCounters()+0x965
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/run/cloudera-scm-agent/process/13339-impala-IMPALAD/hs_err_pid13065.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
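
For anyone triaging a similar crash: the banner above already carries the key facts (the signal, the pid, and the native frame that faulted). A minimal Python sketch that pulls them out of the banner text, assuming the HotSpot fatal-error layout shown above; `parse_crash_banner` is a hypothetical helper name, not part of any Impala or JDK tooling:

```python
import re


def parse_crash_banner(text):
    """Extract signal, pid, and problematic frame from a HotSpot crash banner."""
    info = {}

    # Example line: "#  SIGILL (0x4) at pc=0x0000000000d863e5, pid=13065, tid=..."
    sig = re.search(
        r"(SIG[A-Z]+) \((0x[0-9a-fA-F]+)\) at pc=(0x[0-9a-fA-F]+), pid=(\d+)", text
    )
    if sig:
        info["signal"] = sig.group(1)
        info["pc"] = sig.group(3)
        info["pid"] = int(sig.group(4))

    # Example line: "# C  [impalad+0x9863e5]  impala::...()+0x965"
    # The single letter is the frame type (C means a native frame).
    frame = re.search(
        r"^#\s+([CjJVv])\s+\[(.+?)\+(0x[0-9a-fA-F]+)\]\s+(.*)$", text, re.MULTILINE
    )
    if frame:
        info["frame_type"] = frame.group(1)
        info["module"] = frame.group(2)
        info["offset"] = frame.group(3)
        info["symbol"] = frame.group(4).strip()

    return info
```

Here SIGILL means the process tried to execute an illegal instruction, and the frame type `C` marks a native frame in the impalad binary, which matches the banner's note that the crash happened outside the JVM in native code.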


We run CentOS 6.9 on the 6 servers. I tried upgrading/downgrading to several CentOS 6.9 kernel releases and JDK versions, but with no result. Here are the releases used:

CentOS 6.9 kernel:
2.6.32-696.23.1.el6.x86_64
2.6.32-696.16.1.el6.x86_64
2.6.32-696.13.2.el6.x86_64
2.6.32-642.15.1.el6.x86_64
2.6.32-642.11.1.el6.x86_64

JDK:
jdk.1.8.0_144
jdk.1.8.0_121


Remark: The 6 nodes are the only nodes that do not support SSE4_2.

Thanks in advance.


Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Master Collaborator

What version of CDH were you running before the upgrade? Were you running on the same hardware?

 

Can you include the CPU info from your impalad.INFO log? It looks something like this:

I0417 17:05:31.064653  8873 init.cc:237] Cpu Info:
  Model: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
  Cores: 8
  Max Possible Cores: 8
  L1 Cache: 32.00 KB (Line: 64.00 B)
  L2 Cache: 256.00 KB (Line: 64.00 B)
  L3 Cache: 8.00 MB (Line: 64.00 B)
  Hardware Supports:
    ssse3
    sse4_1
    sse4_2
    popcnt
    avx
    avx2
    pclmulqdq
  Numa Nodes: 1
  Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->0 | 7->0 |

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Expert Contributor
Thanks for the reply, Tim.
It was CDH 5.12.0 and it was working great on the same servers.
I'll share the CPU info of those nodes ASAP.

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Expert Contributor

Hi @Tim Armstrong

Here is the CPU info from impalad.INFO :

I0417 20:54:12.845438 13375 init.cc:230] Cpu Info:
  Model: Intel(R) Xeon(R) CPU           E5405  @ 2.00GHz
  Cores: 8
  Max Possible Cores: 8
  L1 Cache: 32.00 KB (Line: 64.00 B)
  L2 Cache: 6.00 MB (Line: 64.00 B)
  L3 Cache: 0 (Line: 0)
  Hardware Supports:
    ssse3
    sse4_1
  Numa Nodes: 1
  Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->0 | 7->0 |
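
To compare this against a node that works, the `Hardware Supports:` list can be read out of the Cpu Info block programmatically. A rough Python sketch, assuming the block layout shown in this thread; the function names and the set of features checked (`sse4_2` and `popcnt`, the two mentioned in this thread) are my own choices for illustration, not an official list of Impala requirements:

```python
# Assumption: these are the two features discussed in this thread,
# not a documented list of what impalad requires.
FEATURES_TO_CHECK = ("sse4_2", "popcnt")


def supported_features(cpu_info_text):
    """Return the entries listed under 'Hardware Supports:' in a Cpu Info block."""
    features = []
    in_list = False
    for line in cpu_info_text.splitlines():
        stripped = line.strip()
        if stripped == "Hardware Supports:":
            in_list = True
            continue
        if in_list:
            # The list ends at a blank line or at the next "Key: value" line.
            if not stripped or ":" in stripped:
                break
            features.append(stripped)
    return features


def missing_features(cpu_info_text, wanted=FEATURES_TO_CHECK):
    """Return the wanted features that the Cpu Info block does not list."""
    have = set(supported_features(cpu_info_text))
    return [f for f in wanted if f not in have]
```

Run against the E5405 block above, `missing_features` would report both `sse4_2` and `popcnt` as absent.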




Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Master Collaborator

Do you have the JVM error dump file?

/var/run/cloudera-scm-agent/process/13339-impala-IMPALAD/hs_err_pid13065.log

 

I filed https://issues.apache.org/jira/browse/IMPALA-6882 to investigate the issue. I took a look at the code and it doesn't look like anything has changed, so it probably requires deeper investigation.

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Expert Contributor

Hi @Tim Armstrong

Thank you for your help.

Here is the JVM error dump file: https://ufile.io/j0zat
I have formatted 2 servers and reinstalled them with CentOS 6.9 (kernel 2.6.32-696.23.1.el6.x86_64), but the problem is still the same!


I hope we can resolve this bug ASAP. Good luck.

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

New Contributor

Hello,

 

I am running into the same problem on a fresh install of CDH 5.14.3.  According to the ticket that Tim pasted above, the issue is fixed.  Is there a timeline for when this fix will be available for general release?  Is there a workaround for this that one can utilize now? 

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Master Collaborator

I expect it will be included in the 5.14.4 maintenance release. I'm not aware of a workaround aside from avoiding running on affected hardware without popcnt support.
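
If you want to find the affected nodes before starting impalad on them, the CPU flags in /proc/cpuinfo can be checked directly. A minimal Python sketch, assuming an x86 Linux host where /proc/cpuinfo has a `flags` line; `cpu_flags` and `has_popcnt` are hypothetical helper names:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_popcnt(cpuinfo_text):
    """Return True if the 'popcnt' flag is present."""
    return "popcnt" in cpu_flags(cpuinfo_text)
```

On a node, `has_popcnt(open("/proc/cpuinfo").read())` returning False would suggest that host is one of the machines to keep impalad off until the fix is available.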

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

New Contributor

Hi,

 

I am happy to report that after updating to CDH 5.14.4, this crash bug seems to be fixed.  We can run Impala queries now!  This is the first time we've used Impala and it looks amazingly fast - glad we can use it now :)  Thank you for fixing it!

Re: After upgrading to cdh 5.14.2 Impala daemon stopped suddenly! -

Master Collaborator

@AntonyN thanks for following up - glad to hear it!
