Support Questions

Find answers, ask questions, and share your expertise

Hive built-in function Grouping_id not working as per the hive manual HDP 2.5 and above

avatar

Hi guys,

I have a unique issue. Grouping__id function doesnt seem to be worked as expected or as shown in the hive manual.

I am executing the same example as shown in the hive guide.

https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup

I could find any open bugs but I have tested it out in 3 different version of hdp

HDP2.4

The statement executed are here below and the results are as expected and everything works fine

-- create table 
create table grp_tst( col1 int,col2 int);
-- insert query 
insert into table grp_tst values (1, NULL);
insert into table grp_tst values (1, 1);
insert into table grp_tst values (2, 2);
insert into table grp_tst values (3, 3);
insert into table grp_tst values (3, NULL);
insert into table grp_tst values (4, 5);
-- select query 
SELECT col1,
col2, GROUPING__ID, count(*) from grp_tst GROUP BY col1, col2 WITH ROLLUP


 








 




Results

col1col2grouping_idcount
NULLNULL06
1NULL12
1NULL31
1131
2NULL11
2231
3NULL12
3

NULL

31
3331
4NULL11
4531

HDP2.5 and HDP2.6.0 -

Both the resutls seems to be wrong but consistently wrong. So i am wondering if there was a bug introduced in Hive from 2.4 to 2.5 upgrade.

col1col2grouping_idcount
NULLNULL36
1NULL01
1NULL12
1101
2NULL11
2201
3NULL01
3NULL12
3301
4NULL11
4501

Hive setting: I am happy to share any property files if required. It would be great if you could help me out here.

1 ACCEPTED SOLUTION

avatar
Cloudera Employee

@Abhijit Nayak, thanks for your message. This is intentional (see https://issues.apache.org/jira/browse/HIVE-16102). Grouping id function is now compliant with behavior of other SQL engines. SQL grouping function has been added to Hive too for convenience (see https://issues.apache.org/jira/browse/HIVE-15409 and https://issues.apache.org/jira/browse/HIVE-15996). I have updated the documentation in the Hive site accordingly:

https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup

View solution in original post

2 REPLIES 2

avatar
Cloudera Employee

@Abhijit Nayak, thanks for your message. This is intentional (see https://issues.apache.org/jira/browse/HIVE-16102). Grouping id function is now compliant with behavior of other SQL engines. SQL grouping function has been added to Hive too for convenience (see https://issues.apache.org/jira/browse/HIVE-15409 and https://issues.apache.org/jira/browse/HIVE-15996). I have updated the documentation in the Hive site accordingly:

https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup

avatar

Thanks a lot