Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Impala Queries which were previously working are now throwing "IllegalStateException: null"

avatar
Contributor

Hello Team,

 

We are using Impala 3.2.0 there are some queries which were working 3-4 weeks back and now they have suddenly started throwing "IllegalStateException: null" exception. In last 3-4 nothing has changed on our Impala cluster. Error logs also does not suggest the cause of the error. 

 

Can anybody help us as to why this issue arising.

Query which failing looks something like this

 

WITH banner_test AS (WITH layout_id AS
            (
            SELECT <some columns>
            FROM table_v1
            WHERE
            ( layout_component_id  = <some_id>)
            ),
            layouts AS
              (SELECT <some columns>,
                      coalesce() AS col1,
                      <some columns>
               FROM table_v2 lch
               WHERE lch.some_id IN
                   (SELECT some_id
                    FROM table_v3
                    WHERE some_field_v1 IN
                        (SELECT some_field from layout_id))
                      AND lch.start_time >= '2019-01-01T00:00:00.000+00:00'
                ),
        
                 full_data AS
              (SELECT lctd.some_field_v1,
                      l.*,
                      lctd.some_field_v2,
                      min(l.start_time)over(PARTITION BY some_field_v1) AS min_start_time,
                      max(l.end_time)over(PARTITION BY some_field_v1) AS max_end_time
               FROM tables_v4 l
               INNER JOIN tables_v5 lctd ON l.some_id = lctd.some_id),
               tables_v5 AS
               (
               SELECT <some_fields>,
                  case when some_field like 'val1%' then 'val1' else 'val2' end as business
                FROM tables_v6
                WHERE date_str >= '2019-01-01'
               )
            SELECT <some_fields>,
                   case when x3.some_field=x2.some_field then x3.some_field end as some_field,
                   x3.some_id as x3.some_id
            FROM
            
              (SELECT DISTINCT some_filed,
                               more_fields
               FROM tables_v7 sve
               INNER JOIN tables_v8 lfd ON sve.field1 = lfd.field2
               AND sve.type = sve.type
               AND coalesce(sve.field,"somestring") = lfd.field
               AND sve.created_at BETWEEN lfd.min_start_time AND coalesce(lfd.max_end_time,'2030-12-31T00:00:00.000+00:00')
               INNER JOIN tables_v9 aau ON sve.some_id = aau.some_id
               WHERE aau.some_id IS NOT NULL) x1
                INNER JOIN tables_v10 ild on ild.hash_value=x1.hash_value
            LEFT JOIN
              (SELECT DISTINCT some_filed,
                               <some_fields>
               FROM tables_v10 sve
               INNER JOIN full_data lfd ON sve.some_id = lfd.some_id)
              x2 ON x1.some_id_1 = x2.some_id_1
            AND x1.created_at < x2.created_at
             and ild.lcid = x2.id
            LEFT JOIN
              (SELECT <some_fileds>
               FROM table_v11  where status='SUCCESS')
              x3 ON x2._id = x3._id
            AND x2.created_at < x3.checked_out_at  )
SELECT
	field_1,
	field_2,
  field_3,
	TO_DATE(FROM_UTC_TIMESTAMP(started_date ,'Asia/Bangkok')) AS started_date_1,
	TO_DATE(FROM_UTC_TIMESTAMP(ended_date ,'Asia/Bangkok')) AS ended_date_1,
	(((DATEDIFF(FROM_UTC_TIMESTAMP(banner_ended ,'Asia/Bangkok'), '1970-01-04')%7 + 7)%7 - 1 + 7)%(7)) AS ended_day_of_week_index_1,
	DAYNAME(FROM_UTC_TIMESTAMP(banner_ended ,'Asia/Bangkok')) AS _ended_day_of_week_1,
	_present_on_screen_id  AS _test_screen_id_1,
	count(distinct _test.x2_session_id)/count(distinct _test.x1_session_id)  AS test_ctr_sessions_1,
	count(distinct _test.x2_anonymous_id)/count(distinct _test.x1_anonymous_id)  AS test_ctr_aid_1
FROM banner_test
WHERE ((_test.banner_started  >= TO_UTC_TIMESTAMP('1900-01-01 00:00:00.000000','Asia/Bangkok'))) AND (banner_test.region REGEXP '$') AND (banner_test.business REGEXP '$')
GROUP BY 1,2,3,4,5,6,7,8
ORDER BY _banner_started_date_1 DESC
LIMIT 500;

 

I have obfuscated some of the details here but information regarding joins, aggregations, group by etc is preserved.

 

Regards

Parth 

1 ACCEPTED SOLUTION

avatar

It looks like most likely a bug in the Impala planner with some of the estimated stats being calculated as negative and it triggering that error. I haven't seen exactly this symptom before, but I think it's most likely caused by https://issues.apache.org/jira/browse/IMPALA-7604.

 

This could cause it if the #rows estimate from the aggregation (GROUP BY) before the sort (ORDER BY) overflows and wraps around to being negative. For this to happen, the product of the distinct value count from the columns in the group by would have to be > 2^63. I.e. if you have GROUP BY a, b, c and each column has 10 distinct values, you would get a product of 10 * 10 * 10 = 1000.

It's possible that the stats changed somehow and before it wasn't being triggered.

CDH6.3.4 is the earliest release with a fix for it. You could work around by dropping stats on one of the tables, but that can have some pretty bad performance implications. You might also be able to work around by tweaking the query, e.g. grouping by different columns.

View solution in original post

7 REPLIES 7

avatar

Are you sure there isn't a stack trace for IllegalStateException in the impala daemon logs or catalog daemon logs? That would help match it to a bug.

avatar
Contributor

@Tim Armstrong  This is the stacktrace which I got from the impala demon INFO logs

I1216 18:24:31.520996  5823 jni-util.cc:256] 5348abd63d19b026:621ffa8a00000000] java.lang.IllegalStateException
	at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
	at org.apache.impala.planner.SortNode.computeNodeResourceProfile(SortNode.java:258)
	at org.apache.impala.planner.PlanFragment.computeResourceProfile(PlanFragment.java:234)
	at org.apache.impala.planner.Planner.computeResourceReqs(Planner.java:388)
	at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1178)
	at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:1466)
	at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1345)
	at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1252)
	at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1222)
	at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:167)
I1216 18:24:31.521031  5823 status.cc:124] 5348abd63d19b026:621ffa8a00000000] IllegalStateException: null

 

 

Parth 

avatar
Contributor

@Tim Armstrong Did the logs help?

avatar

It looks like most likely a bug in the Impala planner with some of the estimated stats being calculated as negative and it triggering that error. I haven't seen exactly this symptom before, but I think it's most likely caused by https://issues.apache.org/jira/browse/IMPALA-7604.

 

This could cause it if the #rows estimate from the aggregation (GROUP BY) before the sort (ORDER BY) overflows and wraps around to being negative. For this to happen, the product of the distinct value count from the columns in the group by would have to be > 2^63. I.e. if you have GROUP BY a, b, c and each column has 10 distinct values, you would get a product of 10 * 10 * 10 = 1000.

It's possible that the stats changed somehow and before it wasn't being triggered.

CDH6.3.4 is the earliest release with a fix for it. You could work around by dropping stats on one of the tables, but that can have some pretty bad performance implications. You might also be able to work around by tweaking the query, e.g. grouping by different columns.

avatar
Contributor

@Tim Armstrong  CDH 6.3.4 packaging  information suggests the Impala is still at 3.2.0 .

Am I looking at wrong page? We are eagerly waiting for Impala 3.3.0 & 3.4.0 to available in CDH package since September 2019 as they have very good features like Ranger Support, caching for remote filesystems liike S3.

 

Is there a way to run the Impala 3.4.0 with cloudera manager 6.3.1 and by passing the Impala which is provided by CDH 6.3.2?

avatar

The versions of Apache Impala in Cloudera are "based on" Apache Impala releases but will often include substantial additional features and fixes. We do a lot of work beyond just repackaging the upstream releases. Please consult the Cloudera release notes for the version of CDH/CDP you're using if you want to understand what features are present. Just looking at the version string won't give you that information.

 

CDH 6.3.4 includes that fix. See https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_634_fixed_issues.ht...

 

Most new features have been going into CDP only - we have been rolling out new Impala features in CDP public cloud on a fairly continual basis and then these are trickling down into CDP private cloud releases.

 

Some limited features have gone into minor CDH release - i.e. 6.3.0, but we've generally been prioritizing stability there and focusing on CDP. We shipped a preview version of the remote read cache in CDH 6.3.0 - https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_630_new_features.ht... . Ranger support is not in CDH - the general idea there is to allow migrating from Sentry to Ranger as part of the CDH->CDP upgrade process.

avatar
Contributor

@Tim Armstrong  Thanks for helping out here.

My apologies for mis-understanding w.r.t packing information.