Created 11-14-2017 07:26 PM
Hi guys,
I have a unique issue. Grouping__id function doesnt seem to be worked as expected or as shown in the hive manual.
I am executing the same example as shown in the hive guide.
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
I could find any open bugs but I have tested it out in 3 different version of hdp
HDP2.4
The statement executed are here below and the results are as expected and everything works fine
-- create table create table grp_tst( col1 int,col2 int); -- insert query insert into table grp_tst values (1, NULL); insert into table grp_tst values (1, 1); insert into table grp_tst values (2, 2); insert into table grp_tst values (3, 3); insert into table grp_tst values (3, NULL); insert into table grp_tst values (4, 5); -- select query SELECT col1, col2, GROUPING__ID, count(*) from grp_tst GROUP BY col1, col2 WITH ROLLUP
Results
col1 | col2 | grouping_id | count |
NULL | NULL | 0 | 6 |
1 | NULL | 1 | 2 |
1 | NULL | 3 | 1 |
1 | 1 | 3 | 1 |
2 | NULL | 1 | 1 |
2 | 2 | 3 | 1 |
3 | NULL | 1 | 2 |
3 | NULL | 3 | 1 |
3 | 3 | 3 | 1 |
4 | NULL | 1 | 1 |
4 | 5 | 3 | 1 |
HDP2.5 and HDP2.6.0 -
Both the resutls seems to be wrong but consistently wrong. So i am wondering if there was a bug introduced in Hive from 2.4 to 2.5 upgrade.
col1 | col2 | grouping_id | count |
NULL | NULL | 3 | 6 |
1 | NULL | 0 | 1 |
1 | NULL | 1 | 2 |
1 | 1 | 0 | 1 |
2 | NULL | 1 | 1 |
2 | 2 | 0 | 1 |
3 | NULL | 0 | 1 |
3 | NULL | 1 | 2 |
3 | 3 | 0 | 1 |
4 | NULL | 1 | 1 |
4 | 5 | 0 | 1 |
Hive setting: I am happy to share any property files if required. It would be great if you could help me out here.
Created 11-20-2017 09:10 PM
@Abhijit Nayak, thanks for your message. This is intentional (see https://issues.apache.org/jira/browse/HIVE-16102). Grouping id function is now compliant with behavior of other SQL engines. SQL grouping function has been added to Hive too for convenience (see https://issues.apache.org/jira/browse/HIVE-15409 and https://issues.apache.org/jira/browse/HIVE-15996). I have updated the documentation in the Hive site accordingly:
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
Created 11-20-2017 09:10 PM
@Abhijit Nayak, thanks for your message. This is intentional (see https://issues.apache.org/jira/browse/HIVE-16102). Grouping id function is now compliant with behavior of other SQL engines. SQL grouping function has been added to Hive too for convenience (see https://issues.apache.org/jira/browse/HIVE-15409 and https://issues.apache.org/jira/browse/HIVE-15996). I have updated the documentation in the Hive site accordingly:
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
Created 11-24-2017 12:28 AM
Thanks a lot