Hive QL - Aggregating within a group

maeve_ryan226 — Fri, 25 Mar 2016 03:33:59 GMT

Hi there,

I am new to the Hive QL language and am trying to solve the following problem.

I have a set of data with user Id's, each with a corresponding score. An example of the kind of data I have is below:

stackdata_clean.owneruserid	stackdata_clean.score
1	5
2	6
3	5
1	4
2	4

I want to find the top 10 users by score. In other words, I want code to make a table like the below and then pick the top 10 users with the highest aggregate score from it:

stackdata_clean.owneruserid	stackdata_clean.score
2	10
1	9
3	5

My table name is stackdata_clean and the code I am trying to use is:

SELECT stackdata_clean.owneruserid,
SUM(stackdata_clean.score) over(PARTITION BY stackdata_clean.owneruserid)
FROM stackdata_clean
GROUP BY stackdata_clean.owneruserid
ORDER BY sum(stackdata_clean.score)DESC LIMIT 10;

I am being returned the following error:

Error while compiling statement: FAILED: 
SemanticException Failed to breakup Windowing invocations into Groups. 
At least 1 group must only depend on input columns. Also check for 
circular dependencies.
Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: 
Line 2:20 Invalid column reference 'score' [ERROR_STATUS]

Can anyone help solve this problem?

Any help is greatly appreciated!

Thanks in advance 🙂

Re: Hive QL - Aggregating within a group

sluangsay — Fri, 25 Mar 2016 16:15:41 GMT

You're doing a window function and a group by on the same column, and that seems to be your error.

Try this:

SELECT stackdata_clean.owneruserid, 
SUM(stackdata_clean.score) as sumscore
FROM stackdata_clean
GROUP BY stackdata_clean.owneruserid
ORDER BY sumscore DESC LIMIT 10;

Re: Hive QL - Aggregating within a group

maeve_ryan226 — Sat, 26 Mar 2016 16:34:18 GMT

That worked - thanks a lot!

question Re: Hive QL - Aggregating within a group in Archives of Support Questions (Read Only)

Hive QL - Aggregating within a group

Re: Hive QL - Aggregating within a group

Re: Hive QL - Aggregating within a group