Impala cannot start SIGBUS crash

Hi, after updating my data nodes and kernel and restarting the cluster, Impala failed to start its daemons. I tried restarting the Impala daemon, but that did not help. I also tested on CDH 5.10 and CDH 5.11.1.

I also tried installing a different version of Java, including a downgrade, but that didn't help either.

 

Running CentOS 7 and CDH 5.11.1.

 

Any suggestions on how to avoid this error?

An OS reinstall is my last option, but I do not want to clean up the whole cluster.

 

 

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007fa6b9f80c18, pid=3819, tid=0x00007fa6cfdb4900
#
# JRE version:  (8.0_131-b11) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# j  java.lang.Object.<clinit>()V+0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x0000000004c94000):  JavaThread "Unknown thread" [_thread_in_Java, id=3819, stack(0x00007ffc40b88000,0x00007ffc40c88000)]

siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x00007ffc40c77420

Registers:
RAX=0x00007fa6b50f5a68, RBX=0x00007fa6b5047ca8, RCX=0x0000000000000008, RDX=0x00007fa6cf0fff30
RSP=0x00007ffc40c7f420, RBP=0x00007ffc40c7f460, RSI=0x0000000000000004, RDI=0x0000000004c94000
R8 =0x0000000000000000, R9 =0x0000000000000003, R10=0x0000000000000000, R11=0x0000000000000002
R12=0x0000000000000000, R13=0x00007fa6b5047c98, R14=0x00007ffc40c7f468, R15=0x0000000004c94000
RIP=0x00007fa6b9f80c18, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000006
  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007ffc40c7f420)
0x00007ffc40c7f420:   00007ffc40c7f420 00007fa6b5047c98
0x00007ffc40c7f430:   00007ffc40c7f468 00007fa6b50f1040
0x00007ffc40c7f440:   0000000000000000 00007fa6b5047ca8
0x00007ffc40c7f450:   0000000000000000 00007ffc40c7f470
0x00007ffc40c7f460:   00007ffc40c7f4d0 00007fa6b9f6e4e7
0x00007ffc40c7f470:   00007ffc00001fa0 0000000000000000
0x00007ffc40c7f480:   0000000004c94000 00007ffc40c7f550
0x00007ffc40c7f490:   00007fa6b5047ca8 00007ffc40c7f510
0x00007ffc40c7f4a0:   00007ffc40c7f510 00007ffc40c7f6e8
0x00007ffc40c7f4b0:   00007fa60000000a 00007fa6b5047ca8
0x00007ffc40c7f4c0:   00007fa6b9f809c0 00007ffc40c7f658
0x00007ffc40c7f4d0:   00007ffc40c7f640 00007fa6ce7cfd16
0x00007ffc40c7f4e0:   0000000000000000 0000000004c94000
0x00007ffc40c7f4f0:   00007ffc40c7f650 00007ffc40c7f6e0
0x00007ffc40c7f500:   00007fa6b9f809c0 00007fa60000000a
0x00007ffc40c7f510:   0000000004c94000 0000000004b78140
0x00007ffc40c7f520:   00007fa6b5047ca8 0000000000000000
0x00007ffc40c7f530:   0000000000000000 0000000000000000
0x00007ffc40c7f540:   0000000000000000 00007ffc40c7f6e0
0x00007ffc40c7f550:   0000000004c94000 0000000004b65b40
0x00007ffc40c7f560:   0000000004b5c5a0 0000000004b5c5c0
0x00007ffc40c7f570:   0000000004b5c688 00000000000000d8
0x00007ffc40c7f580:   00007ffc40c7f830 0000000004c94000
0x00007ffc40c7f590:   00007fa6b5047ca8 0000000004c94000
0x00007ffc40c7f5a0:   0000000004b618d0 00007fa6b5049648
0x00007ffc40c7f5b0:   00007fa6b5047ca8 0000000004c94000
0x00007ffc40c7f5c0:   00007ffc40c7f720 00007fa6ce910043
0x00007ffc40c7f5d0:   0000000004c94000 00007fa6ce9f1e67
0x00007ffc40c7f5e0:   00007fa6b5047ca8 0000000004c94000
0x00007ffc40c7f5f0:   00007ffc40c7f6d0 0000000000000000
0x00007ffc40c7f600:   00007fa6b5047ca8 0000000004c94000
0x00007ffc40c7f610:   0000000004b5c5a0 00007ffc40c7f650

Instructions: (pc=0x00007fa6b9f80c18)
0x00007fa6b9f80bf8:   00 d0 ff ff 89 84 24 00 c0 ff ff 89 84 24 00 b0
0x00007fa6b9f80c08:   ff ff 89 84 24 00 a0 ff ff 89 84 24 00 90 ff ff
0x00007fa6b9f80c18:   89 84 24 00 80 ff ff 89 84 24 00 70 ff ff 89 84
0x00007fa6b9f80c28:   24 00 60 ff ff 89 84 24 00 50 ff ff 89 84 24 00

Register to memory mapping:

RAX=0x00007fa6b50f5a68 is pointing into metadata
RBX={method} {0x00007fa6b5047ca8} '<clinit>' '()V' in 'java/lang/Object'
RCX=0x0000000000000008 is an unknown value
RDX=0x00007fa6cf0fff30: <offset 0xfc1f30> in /usr/java/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so at 0x00007fa6ce13e000
RSP=0x00007ffc40c7f420 is pointing into the stack for thread: 0x0000000004c94000
RBP=0x00007ffc40c7f460 is pointing into the stack for thread: 0x0000000004c94000
RSI=0x0000000000000004 is an unknown value
RDI=0x0000000004c94000 is a thread
R8 =0x0000000000000000 is an unknown value
R9 =0x0000000000000003 is an unknown value

VM Arguments:
jvm_args: -Djava.library.path=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/bin/../lib/impala/lib
java_command: <unknown>
java_class_path (initial): /usr/share/java/mysql-connector-java.jar:/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar:/usr/share/java/oracle-connector-java.jar:/var/lib/impala/*.jar:/usr/share/java/mysql-connector-java.jar:/run/cloudera-scm-agent/process/319-impala-IMPALAD/impala-conf:/run/cloudera-scm-agent/process/319-impala-IMPALAD/hadoop-conf:/run/cloudera-scm-agent/process/319-impala-IMPALAD/hive-conf:/run/cloudera-scm-agent/process/319-impala-IMPALAD/hbase-conf:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/libthrift-0.9.0.jar::/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/ST4-4.0.4.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/ant-1.5.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/ant-1.9.1.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/ant-contrib-1.0b3.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/ant-launcher-1.9.1.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/antlr-2.7.7.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/antlr-runtime-3.3.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/aopalliance-1.0.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/apache-log4j-extras-1.2.17.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/apacheds-i18n-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/api-asn1-api-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/api-util-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/asm-3.1.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib/asm-commons-3.1.jar:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/
Launcher Type: generic

Environment Variables:
JAVA_HOME=/usr/java/jdk1.8.0_131
JAVA_TOOL_OPTIONS=
PATH=/sbin:/usr/sbin:/bin:/usr/bin
LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/lib:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/impala/sbin-retail:/usr/java/jdk1.8.0_131/jre/lib/amd64:/usr/java/jdk1.8.0_131/jre/lib/amd64:/usr/java/jdk1.8.0_131/jre/lib/amd64/server:
SHELL=/bin/bash

Signal Handlers:
SIGSEGV: [libjvm.so+0xac8af0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_ONSTACK|SA_SIGINFO
SIGBUS: [libjvm.so+0xac8af0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGFPE: [impalad+0x178a0e0], sa_mask[0]=00010111001000000000000000000000, sa_flags=SA_ONSTACK|SA_SIGINFO
SIGPIPE: SIG_IGN, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGXFSZ: SIG_IGN, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGILL: [impalad+0x178a0e0], sa_mask[0]=00010111001000000000000000000000, sa_flags=SA_ONSTACK|SA_SIGINFO
SIGUSR1: [impalad+0x79a640], sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGUSR2: [libjvm.so+0x923610], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO
SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none


---------------  S Y S T E M  ---------------

OS:CentOS Linux release 7.3.1611 (Core)

uname:Linux 3.10.0-514.21.2.el7.x86_64 #1 SMP Tue Jun 20 12:24:47 UTC 2017 x86_64
libc:glibc 2.17 NPTL 2.17
rlimit: STACK 8192k, CORE 0k, NPROC 65536, NOFILE 32768, AS infinity
load average:0.56 0.21 0.08

/proc/meminfo:
MemTotal:        7231176 kB
MemFree:          323736 kB
MemAvailable:    3480696 kB
Buffers:            4060 kB
Cached:          3365420 kB
SwapCached:            0 kB
Active:          3480808 kB
Inactive:        3267080 kB
Active(anon):    3379768 kB
Inactive(anon):    16648 kB
Active(file):     101040 kB
Inactive(file):  3250432 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               224 kB
Writeback:             0 kB
AnonPages:       3378180 kB
Mapped:            48992 kB

 

1 ACCEPTED SOLUTION

avatar

Found out that it is related to this issue

https://issues.apache.org/jira/browse/DAEMON-363

 

Setting the following in CM, in the Impala Daemon properties, fixed the problem:

Impala Daemon Environment Advanced Configuration Snippet (Safety Valve)

JAVA_TOOL_OPTIONS=-Xss2m
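One way to confirm that the setting actually reached the daemon after a restart is to read the environment of the running process. This is a minimal sketch, assuming Linux with /proc mounted. `env_value` is a hypothetical helper name, and the usage line assumes the daemon process is named impalad, as the signal handler entries in the crash log suggest.

```shell
# Hypothetical helper: print the value of one variable from a NUL-separated
# environment block (the format of /proc/<pid>/environ) read on stdin.
env_value() {
    tr '\0' '\n' | sed -n "s/^$1=//p"
}

# Example usage (assumes the process is named impalad; usually needs root):
#   env_value JAVA_TOOL_OPTIONS < "/proc/$(pgrep -o impalad)/environ"
```

The JVM also prints a "Picked up JAVA_TOOL_OPTIONS: ..." line on stderr at startup, so the daemon's stderr log is another place to look.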

 


10 REPLIES

avatar

This is https://issues.apache.org/jira/browse/IMPALA-5578

 

I think you will probably also need to update "Impala Catalog Server Environment Advanced Configuration Snippet (Safety Valve)" before you restart the catalog daemon.

avatar
Champion

@Tim Armstrong 

As suggested, I set JAVA_TOOL_OPTIONS=-Xss2m.

Do you have any rationale for -Xss2m? Can we increase it further, and does it depend on any parameter, such as the number of queries hitting the Impala daemon?

 

avatar

"-Xss1280k" seems to be sufficient. The default is 1024k I believe, and previously that was always sufficient in our testing.

 

The crash was caused by a change to the Linux kernel that modified the memory layout around thread stacks. As a result, with the default Java stack size, the JVM somehow ends up accessing invalid memory. Increasing the stack size mitigates this.
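To check which thread stack size a given JVM ends up with, HotSpot can print its final flag values. The stack size appears as ThreadStackSize, in KB. This is a sketch rather than an official procedure: `thread_stack_kb` is a hypothetical helper, and it assumes the JDK 8 output layout (type, name, `=` or `:=`, value).

```shell
# Hypothetical helper: pull the ThreadStackSize value (in KB) out of
# `java -XX:+PrintFlagsFinal -version` output supplied on stdin.
thread_stack_kb() {
    awk '$2 == "ThreadStackSize" { print $4 }'
}

# Example usage (flag listing goes to stdout, version banner to stderr):
#   java -XX:+PrintFlagsFinal -version 2>/dev/null | thread_stack_kb
#   java -Xss1280k -XX:+PrintFlagsFinal -version 2>/dev/null | thread_stack_kb
```

Running both usage lines on the affected host shows the default next to the value after an -Xss override.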

avatar
Champion

Thanks for the detailed information, much appreciated.

avatar
Champion

@Tim Armstrong one more question.

Could you let me know which kernel versions (CentOS / RHEL) are affected?

Since we have only tested this in a test environment, what should be done on the production box?

 

avatar

I don't have a list of affected kernels, particularly since so many different kernel versions were patched. I know the problem was the initial fix for CVE-2017-1000364, so you could check to see if the kernel version has that in it.

 

I believe many Linux vendors are working on a fix for the fix, e.g.  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=865549 so this advice may become incorrect once the problem is resolved.
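On RPM-based systems such as CentOS and RHEL, one rough way to check is to search the changelog of the running kernel package for the CVE ID. This is a sketch under assumptions: `changelog_mentions` is a hypothetical helper, and it relies on the vendor listing CVE IDs in the kernel package changelog. A match only shows that some fix for the CVE is present. It does not tell you whether the kernel carries the initial fix or a later corrected one.

```shell
# Hypothetical helper: succeed if the changelog text on stdin mentions
# the given CVE ID (matched as a fixed string).
changelog_mentions() {
    grep -q -F -- "$1"
}

# Example usage (assumes the changelog lists CVE IDs):
#   if rpm -q --changelog "kernel-$(uname -r)" | changelog_mentions CVE-2017-1000364; then
#       echo "running kernel changelog mentions CVE-2017-1000364"
#   fi
```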

avatar
Expert Contributor

I am seeing a similar issue with Service Monitor and Host Monitor when using Red Hat 6.8 (Santiago).

CM/CDH is 5.11.1.

After adding JAVA_TOOL_OPTIONS=-Xss2m to the Host Monitor and Service Monitor configuration, it works fine.

Is this a known issue with Red Hat 6.7 as well? (The link you mentioned is for CentOS, and it's 6.9.)

avatar
The catalog did not fail, but you are right: it is just a matter of time until the kernel on my master is updated, at which point it would probably crash too.