Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to find nth highest salary in hive.

avatar
Expert Contributor

Hi guys,

I know hive is Data Warehousing tool which is not suited as OLTP. In regular RDBMS Structure i have way like "Correlated queries" helps to find nth highest salary from a table as following.

Is there any way in hive also.The query is related to true/false conditions.

SELECT * /*This is the outer query part */
FROM Employee Emp1
WHERE (N-1) = ( /* Subquery starts here */
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary)

1 ACCEPTED SOLUTION

avatar
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login
8 REPLIES 8

avatar
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login

avatar

Awesome! Marked as favorite post now. 🙂

avatar

Thanks 🙂

avatar
New Contributor

Ho to find 2nd(nth) highest salary in each department? Please find below sample dataset.

1,423,depA
2,23,depA
3,67878,depB
11,84,depB
5,9999999,depA
9,4445,depA
8,8877,depB
10,756,depC
4,84,depC
7,578,depA
6,1,depC

output-

10,756,depC
4,84,depC
6,1,depC

avatar

Hi @LetsHadoop

As this thread is older and was marked 'Solved' in Dec 2015 you would have a better chance of receiving a helpful answer by starting a new thread. This will also provide you with an opportunity to provide details about your table design that could aid others in providing an answer more directly tailored to your question.

 

 

Bill Brooks, Community Moderator
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
New Contributor

hi i tried this but got the below error. pls do help me out. I am quite new to hive.

hive> select salary, ROW_NUMBER() over (ORDER BY salary) as row_no from emp group by salary; FAILED: ParseException line 1:33 mismatched input '(' expecting FROM near 'over' in from clause

avatar
New Contributor

Use DENSE_RANK instead of ROW_NUMBER() with GROUP BY , as it will take only 1 map-reduce job to achieve it as compared to 2 map-reduce jobs by latter one.


select * from (

select salary, DENSE_RANK() over (ORDER BY salary DESC) as row_no from emp) res

where res.row_no = 2;

avatar
New Contributor

Please watch the below video:-

https://www.youtube.com/watch?v=fBeN8VQm0_Q

Please like comment share and subscribe my channel, If you loved it.