Created on 06-11-2019 06:54 PM - edited 09-16-2022 07:26 AM
0: jdbc:hive2://localhost:21050/defalt> select 1 * (NULL); null +-------------+--+ | 1 * (null) | +-------------+--+ +-------------+--+
vs.
0: jdbc:hive2://localhost:21050/defalt> select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END); Error: AnalysisException: Arithmetic operation requires numeric operands: 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)
Is this a bug in Impala?
Impala version is 2.11 (CDH 5.14.4)
Same statement works with Hive.
Created 06-13-2019 10:42 AM
Yeah I agree there is some inconsistency in behaviour here - the casting rules, especially around NULL, are too complex and inconsistent.
Created 06-11-2019 11:27 PM
I see what you're saying there. The different from Hive isn't deliberate, but I believe this only happens in the very special case when all of the values in the CASE are NULL, without any cast. If any of them have a type, the query will execute. E.g. the below query runs and returns NULL.
select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE cast(NULL as integer) END);
Interestingly postgres behaves the same way as Impala
psql (9.5.17, server 9.4.8) Type "help" for help. postgres=# select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END); ERROR: operator does not exist: integer * text LINE 1: select 1 * (CASE WHEN '1' IS NULL THEN NULL ELSE NULL END); ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. postgres=# select 1 * (NULL); ?column? ---------- (1 row) postgres=# select 1 * (NULL); ?column? ---------- (1 row)
We could probably file a bug if this is causing issues for you, there no specific reason it shouldn't work that I know of.
Created on 06-12-2019 12:35 AM - edited 06-12-2019 12:42 AM
Thank you for your reply, @Tim Armstrong !
If Impala doesn't throw Exception by CASE, that would be very nice.
But now I found more interesting issue 🙂
Old Impala 2.5 => No error
[root@quickstart ~]# beeline -u "jdbc:hive2://`hostname -f`:21050/;auth=noSasl" 2019-06-12 07:27:40,131 WARN [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it. scan complete in 2ms Connecting to jdbc:hive2://quickstart.cloudera:21050/;auth=noSasl Connected to: Impala (version 2.5.0-cdh5.7.0) Driver: Hive JDBC (version 1.1.0-cdh5.7.0) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 1.1.0-cdh5.7.0 by Apache Hive 0: jdbc:hive2://quickstart.cloudera:21050/> SELECT (-1 * CASE WHEN CAST( 0: jdbc:hive2://quickstart.cloudera:21050/> SUM( 0: jdbc:hive2://quickstart.cloudera:21050/> CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END 0: jdbc:hive2://quickstart.cloudera:21050/> ) AS DECIMAL(32, 2) 0: jdbc:hive2://quickstart.cloudera:21050/> ) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST( 0: jdbc:hive2://quickstart.cloudera:21050/> CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE ( 0: jdbc:hive2://quickstart.cloudera:21050/> CAST( 0: jdbc:hive2://quickstart.cloudera:21050/> SUM( 0: jdbc:hive2://quickstart.cloudera:21050/> CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END 0: jdbc:hive2://quickstart.cloudera:21050/> ) AS DECIMAL(32, 2) 0: jdbc:hive2://quickstart.cloudera:21050/> ) / CAST(NULL AS DOUBLE) 0: jdbc:hive2://quickstart.cloudera:21050/> ) END AS DOUBLE 0: jdbc:hive2://quickstart.cloudera:21050/> ) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t; null +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+ | (-1 * 1 * case when cast(sum(case when dummy_t.col1 = 1 then dummy_t.col2 else 0 end) as decimal(32,2)) is null then null when coalesce(cast(null as double), 0) = 0 then null else cast(case when cast(null as double) = 0 then null else (cast(sum(case when dummy_t.col1 = 1 then dummy_t.col2 else 0 end) as decimal(32,2)) / cast(null as double)) end as double) end) | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+ +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+ No rows selected (0.093 seconds) 0: jdbc:hive2://quickstart.cloudera:21050/>
Newer Impala 2.11 and 3.0 doesn't like above:
[root@test /]# beeline -u "jdbc:hive2://`hostname -f`:21050/;auth=noSasl" scan complete in 2ms Connecting to jdbc:hive2://test.impala.com:21050/;auth=noSasl Connected to: Impala (version 2.11.0-cdh5.14.4) Driver: Hive JDBC (version 1.1.0-cdh5.14.4) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 1.1.0-cdh5.14.4 by Apache Hive 0: jdbc:hive2://hadoop-node-01.docker.infra.a> SELECT (-1 * CASE WHEN CAST( . . . . . . . . . . . . . . . . . . . . . . .> SUM( . . . . . . . . . . . . . . . . . . . . . . .> CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END . . . . . . . . . . . . . . . . . . . . . . .> ) AS DECIMAL(32, 2) . . . . . . . . . . . . . . . . . . . . . . .> ) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST( . . . . . . . . . . . . . . . . . . . . . . .> CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE ( . . . . . . . . . . . . . . . . . . . . . . .> CAST( . . . . . . . . . . . . . . . . . . . . . . .> SUM( . . . . . . . . . . . . . . . . . . . . . . .> CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END . . . . . . . . . . . . . . . . . . . . . . .> ) AS DECIMAL(32, 2) . . . . . . . . . . . . . . . . . . . . . . .> ) / CAST(NULL AS DOUBLE) . . . . . . . . . . . . . . . . . . . . . . .> ) END AS DOUBLE . . . . . . . . . . . . . . . . . . . . . . .> ) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t; Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN CAST(sum(CASE WHEN dummy_t.col1 = 1 THEN dummy_t.col2 ELSE 0 END) AS DECIMAL(32,2)) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)
Did I find a bug this time? 😉
Created on 06-12-2019 12:39 AM - edited 06-12-2019 12:43 AM
I mean, the output/result of the following query is different by Impala version:
SELECT (-1 * CASE WHEN CAST( SUM( CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END ) AS DECIMAL(32, 2) ) IS NULL THEN NULL WHEN COALESCE(CAST(NULL AS DOUBLE), 0) = 0 THEN NULL ELSE CAST( CASE WHEN CAST(NULL AS DOUBLE) = 0 THEN NULL ELSE ( CAST( SUM( CASE when dummy_t.col1 = 1 then dummy_t.col2 else 0 END ) AS DECIMAL(32, 2) ) / CAST(NULL AS DOUBLE) ) END AS DOUBLE ) END) FROM (SELECT 1 as col1, 2 as col2) dummy_t;
Created 06-12-2019 10:48 AM
We have made some changes to decimal type resolution behaviour in recent versions. Most of them are controlled by the DECIMAL_V2 option and the default was changed starting in Impala 3.0.
The best way forward is to report a bug in the Impala issue tracker - https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala with the steps required to reproduce the symptoms you see. Then it can be tracked and triaged properly. This forum is not great for doing that.
Created 06-12-2019 06:55 PM
Thanks again for your reply.
I submitted https://issues.apache.org/jira/browse/IMPALA-8664 . Hope it makes sense.
After reading your response, I thought DECIMAL would be the issue, but it seems not.
Using Impala 2.11 (CDH 5.14.4), no DECIMAL, no CAST, but still fails:
0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN COALESCE(NULL, 0) = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t;
Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN sum(1) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)
No COALESCE, works!
0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN 1 = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t; 0
BTW, because of the inconsistency I mentioned in the first comment, faliing below MAKES SENSE (because CASE returns NULL):
0: jdbc:hive2://test> SELECT (-1 * CASE WHEN SUM(1) IS NULL THEN NULL WHEN 0 = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t; Error: AnalysisException: Arithmetic operation requires numeric operands: (-1 * CASE WHEN sum(1) IS NULL THEN NULL ELSE NULL END) (state=HY000,code=0)
HOWEVER, removing SUM returns different result:
0: jdbc:hive2://test> SELECT (-1 * CASE WHEN COALESCE(NULL, 0) = 0 THEN NULL ELSE 0 END) FROM (SELECT 1 as ytd_id, 2 as derv_alw_amt) dummy_t; null
So this is a bit too confusing for me, and I think this would be an Impala bug.
Created 06-13-2019 10:42 AM
Yeah I agree there is some inconsistency in behaviour here - the casting rules, especially around NULL, are too complex and inconsistent.