Explorer
Posts: 13
Registered: ‎10-25-2015

NULL in results if GROUP BY query

I have an Impala query like this:

Select col_1, sum(col_2)
from my_table
where col_1 ('a', 'b')
group by col_a;

The result shows values for 'a' and 'b' but also for 'NULL'. Why would this be?

Thx.

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: NULL in results if GROUP BY query

Are you sure that's the exact query you ran? It looks semantically invalid because col_1 is not part of the group by clause. Can you please paste the exact query just to be sure there are no misunderstandings? Thanks!

Explorer
Posts: 13
Registered: ‎10-25-2015

Re: NULL in results if GROUP BY query

Sorry - I made a typo when I sanitised the query. It should be:

Select col_1, sum(col_2)
from my_table
where col_1 in ('a', 'b')
group by col_1;

The result looks something like:

a	17
b	59
NULL	426

Thx.

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: NULL in results if GROUP BY query

That definitely looks wrong. What Impala version are you running?
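
For anyone following along, here is a minimal sketch of why that output is unexpected. Under standard SQL three-valued logic, `NULL IN ('a', 'b')` evaluates to NULL rather than true, so rows with a NULL `col_1` should be dropped by the WHERE clause before grouping. The sketch uses an in-memory SQLite database as a stand-in, not Impala, and the rows are made up for illustration; only the query shape comes from this thread.

```python
import sqlite3

# Stand-in engine: in-memory SQLite, NOT Impala. The table contents below
# are hypothetical and exist only to illustrate the expected semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col_1 TEXT, col_2 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 10), ("a", 7), ("b", 59), (None, 400), (None, 26), ("c", 5)],
)

# Same shape as the corrected query posted in the thread.
rows = conn.execute(
    """
    SELECT col_1, SUM(col_2)
    FROM my_table
    WHERE col_1 IN ('a', 'b')
    GROUP BY col_1
    ORDER BY col_1
    """
).fetchall()

# NULL IN ('a', 'b') is NULL (not true), so the NULL rows are filtered out
# and no NULL group appears.
print(rows)  # [('a', 17), ('b', 59)]
```

If Impala returns an extra NULL group for an equivalent table, that points to a problem in the engine or the data files rather than in the query itself.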

Explorer
Posts: 13
Registered: ‎10-25-2015

Re: NULL in results if GROUP BY query

impalad version 2.5.0-cdh5.7.1 RELEASE (build 27a4325c18c2a01c7a8097681a0eccf6d4335ea1)

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: NULL in results if GROUP BY query

Any chance you can provide the table schema and data for us to reproduce?

Explorer
Posts: 13
Registered: ‎10-25-2015

Re: NULL in results if GROUP BY query

The query was heavily sanitised for public consumption.

I will see what I can do in terms of putting together a reproducible example. That will make it easier to raise a support case anyway.

Thx. Scott

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: NULL in results if GROUP BY query

We take this issue seriously since it is a correctness problem. Any additional non-sensitive info you can provide would be helpful, like the column types, table format, etc. Thanks for helping!

Explorer
Posts: 13
Registered: ‎10-25-2015

Re: NULL in results if GROUP BY query

I tried to reproduce the problem from scratch on a QuickStart VM, using a trivial table that matches the sanitised structure I posted here, but was unable to.

I will get someone to spend a bit more time on this to see what they can reproduce on the cluster that experiences the issue.

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: NULL in results if GROUP BY query

Thanks for looking into it! Please do let us know if you find anything since it sounds like a serious issue.