Explorer
Posts: 8
Registered: ‎09-28-2016
Accepted Solution

Impalad exits when executing 'compute stats' on a table whose schema is not compatible with its Parquet file


Recently we found that many Impala daemons in our cluster exit when 'compute stats' is executed on certain tables. Steps to reproduce:

 

Impala version: 2.7.0-cdh5.10.0

 

  1. Create table col_str_int
     create table sample.col_str_int(
       s STRING,
       i INT
     ) stored as parquet;
  2. Generate a Parquet file with an incompatible schema
     create table sample.col_str_str (
       s string,
       i string
     ) stored as parquet;
     insert into sample.col_str_str values("some_str", "false");
  3. Copy the Parquet file to the table
     hadoop fs -cp /user/hive/warehouse/sample.db/col_str_str/* /user/hive/warehouse/sample.db/col_str_int/
  4. Compute stats
     refresh sample.col_str_int;
     compute stats sample.col_str_int;

 

Here is the message logged before impalad exits:

Wrote minidump to /data1/impala/logs/minidumps/impalad/57d8e9ec-a075-5f0c-54dbe818-09c7889c.dmp
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=25400, tid=0x00007f4c2c458700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  0x0000000000000000
#
# Core dump written. Default location: /var/lib/impala/core or core.25400
#
# An error report file with more information is saved as:
# /var/lib/impala/hs_err_pid25400.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
"impalad.node153-84-98-jylt.qiyi.hadoop.impala.log.INFO.20180209-203116.25400" 501L, 28262C

So I want to ask:

  • Is this a known issue, and has it been fixed in the latest version?
  • Is there a workaround for this problem?

 

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: Impalad exits when executing 'compute stats' on a table whose schema is not compatible with parquet

Thanks for the report! I suspect you are hitting this issue:

https://issues.apache.org/jira/browse/IMPALA-5186

 

As a workaround you can set the following query option:

SET MT_DOP=0;
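
For context, here is a minimal sketch of how the option might be applied to the reproduction case from the original post. It assumes an interactive impala-shell session and reuses the table name sample.col_str_int from the report. The option is set in the same session that then runs the statement.

```sql
-- Sketch only: assumes an impala-shell session connected to the affected cluster.
-- Set the query option first so it applies to the statements that follow
-- in this session.
SET MT_DOP=0;

-- Pick up the copied Parquet file, then gather statistics.
REFRESH sample.col_str_int;
COMPUTE STATS sample.col_str_int;
```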

 

Please let us know if that worked or not.

 

Explorer
Posts: 8
Registered: ‎09-28-2016

Re: Impalad exits when executing 'compute stats' on a table whose schema is not compatible with parquet

Thanks for your reply. I ran SET MT_DOP=0; before compute stats and it works! The impalad no longer crashes, although compute stats still fails due to the incompatible schema.
