Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Inconsistency by 1 * (NULL), and causes AnalysisException

avatar
Explorer
0: jdbc:hive2://localhost:21050/defalt> select 1 * (NULL);
null
+-------------+--+
| 1 * (null)  |
+-------------+--+
+-------------+--+

vs.

0: jdbc:hive2://localhost:21050/defalt> select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END);
Error: AnalysisException: Arithmetic operation requires numeric operands: 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)

Is this a bug in Impala?

Impala version is 2.11 (CDH 5.14.4)

 

Same statement works with Hive.

1 ACCEPTED SOLUTION

avatar

Yeah I agree there is some inconsistency in behaviour here - the casting rules, especially around NULL, are too complex and inconsistent.

View solution in original post

6 REPLIES 6

avatar

I see what you're saying there. The different from Hive isn't deliberate, but I believe this only happens in the very special case when all of the values in the CASE are NULL, without any cast. If any of them have a type, the query will execute. E.g. the below query runs and returns NULL.

 

select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE cast(NULL as integer) END);

Interestingly postgres behaves the same way as Impala

 

psql (9.5.17, server 9.4.8)
Type "help" for help.

postgres=# select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END);
ERROR:  operator does not exist: integer * text
LINE 1: select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END);
                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
postgres=#  select 1 * (NULL);
 ?column? 
----------
         
(1 row)

postgres=# select 1 * (NULL);
 ?column? 
----------
         
(1 row)

 

We could probably file a bug if this is causing issues for you, there no specific reason it shouldn't work that I know of.

avatar
Explorer

Thank you for your reply, @Tim Armstrong !

If Impala doesn't throw Exception by CASE, that would be very nice.

 

But now I found more interesting issue 🙂

 

Old Impala 2.5 => No error

[root@quickstart ~]# beeline -u "jdbc:hive2://`hostname -f`:21050/;auth=noSasl"
2019-06-12 07:27:40,131 WARN  [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present.  Continuing without it.
scan complete in 2ms
Connecting to jdbc:hive2://quickstart.cloudera:21050/;auth=noSasl
Connected to: Impala (version 2.5.0-cdh5.7.0)
Driver: Hive JDBC (version 1.1.0-cdh5.7.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.7.0 by Apache Hive
0: jdbc:hive2://quickstart.cloudera:21050/> SELECT (-1 * CASE WHEN CAST(
0: jdbc:hive2://quickstart.cloudera:21050/>    SUM(
0: jdbc:hive2://quickstart.cloudera:21050/>            CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
0: jdbc:hive2://quickstart.cloudera:21050/>        ) AS DECIMAL(32, 2)
0: jdbc:hive2://quickstart.cloudera:21050/> ) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST(
0: jdbc:hive2://quickstart.cloudera:21050/>    CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE (
0: jdbc:hive2://quickstart.cloudera:21050/>            CAST(
0: jdbc:hive2://quickstart.cloudera:21050/>                    SUM(
0: jdbc:hive2://quickstart.cloudera:21050/>                            CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
0: jdbc:hive2://quickstart.cloudera:21050/>                        ) AS DECIMAL(32, 2)
0: jdbc:hive2://quickstart.cloudera:21050/>                ) / CAST(NULL AS DOUBLE)
0: jdbc:hive2://quickstart.cloudera:21050/>        ) END AS DOUBLE
0: jdbc:hive2://quickstart.cloudera:21050/> ) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t;
null
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| (-1 * 1 * case when cast(sum(case when dummy_t.col1 = 1 then dummy_t.col2 else 0 end) as decimal(32,2)) is null then null when coalesce(cast(null as double), 0) = 0 then null else cast(case when cast(null as double) = 0 then null else (cast(sum(case when dummy_t.col1 = 1 then dummy_t.col2 else 0 end) as decimal(32,2)) / cast(null as double)) end as double) end)  |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
No rows selected (0.093 seconds)
0: jdbc:hive2://quickstart.cloudera:21050/>

Newer Impala 2.11 and 3.0 doesn't like above:

[root@test /]# beeline -u "jdbc:hive2://`hostname -f`:21050/;auth=noSasl"
scan complete in 2ms
Connecting to jdbc:hive2://test.impala.com:21050/;auth=noSasl
Connected to: Impala (version 2.11.0-cdh5.14.4)
Driver: Hive JDBC (version 1.1.0-cdh5.14.4)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.14.4 by Apache Hive
0: jdbc:hive2://hadoop-node-01.docker.infra.a> SELECT (-1 * CASE WHEN CAST(
. . . . . . . . . . . . . . . . . . . . . . .>    SUM(
. . . . . . . . . . . . . . . . . . . . . . .>            CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
. . . . . . . . . . . . . . . . . . . . . . .>        ) AS DECIMAL(32, 2)
. . . . . . . . . . . . . . . . . . . . . . .> ) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST(
. . . . . . . . . . . . . . . . . . . . . . .>    CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE (
. . . . . . . . . . . . . . . . . . . . . . .>            CAST(
. . . . . . . . . . . . . . . . . . . . . . .>                    SUM(
. . . . . . . . . . . . . . . . . . . . . . .>                            CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
. . . . . . . . . . . . . . . . . . . . . . .>                        ) AS DECIMAL(32, 2)
. . . . . . . . . . . . . . . . . . . . . . .>                ) / CAST(NULL AS DOUBLE)
. . . . . . . . . . . . . . . . . . . . . . .>        ) END AS DOUBLE
. . . . . . . . . . . . . . . . . . . . . . .> ) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t;
Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN CAST(sum(CASE WHEN dummy_t.col1 = 1 THEN dummy_t.col2 ELSE 0 END) AS DECIMAL(32,2)) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)

Did I find a bug this time? 😉

avatar
Explorer

I mean, the output/result of the following query is different by Impala version:

SELECT (-1 * CASE WHEN CAST(
   SUM(
           CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
       ) AS DECIMAL(32, 2)
) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST(
   CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE (
           CAST(
                   SUM(
                           CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END
                       ) AS DECIMAL(32, 2)
               ) / CAST(NULL AS DOUBLE)
       ) END AS DOUBLE
) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t;

 

avatar

We have made some changes to decimal type resolution behaviour in recent versions. Most of them are controlled by the DECIMAL_V2 option and the default was changed starting in Impala 3.0.

 

The best way forward is to report a bug in the Impala issue tracker - https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala with the steps required to reproduce the symptoms you see. Then it can be tracked and triaged properly. This forum is not great for doing that.

avatar
Explorer

Hi @Tim Armstrong 

 

Thanks again for your reply. 

I submitted https://issues.apache.org/jira/browse/IMPALA-8664 . Hope it makes sense.

 

After reading your response, I thought DECIMAL would be the issue, but it seems not.

Using Impala 2.11 (CDH 5.14.4), no DECIMAL, no CAST, but still fails:

 

0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN COALESCE(NULL, 0) = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t;
Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN sum(1) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)

No COALESCE, works!

 

 

0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN 1 = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t;
0

 

 

BTW, because of the inconsistency I mentioned in the first comment, faliing below MAKES SENSE (because CASE returns NULL):

 

0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN 0 = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t;
Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN sum(1) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)

 

HOWEVER, removing SUM returns different result:

 

0: jdbc:hive2://test> SELECT (-1 * CASE WHEN COALESCE(NULL, 0) = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t;
null

 

 

So this is a bit too confusing for me, and I think this would be an Impala bug.

 

avatar

Yeah I agree there is some inconsistency in behaviour here - the casting rules, especially around NULL, are too complex and inconsistent.