Handling Multiple joins creating duplicates

Re: Handling Multiple joins creating duplicates

@Sonny Heer

I think you can do that.

Instead of this:

Select B.b,B.key,ROW_NUMBER() OVER (partition by key) AS row_num from B)where row_num=1

You can use

Select B.b,B.key,ROW_NUMBER() OVER (count by key) AS row_num from B)where row_num=1

I am not entirely sure, but the Hive documentation says you can use standard aggregate functions with the OVER clause. Check the link below:

Hive Documentation

Cheers,

Sagar

Re: Handling Multiple joins creating duplicates

@Sagar Morakhia

That doesn't seem to work. Based on the docs, the syntax would look like the query below, but that also requires a GROUP BY. I might be missing something.

Select COUNT(B.b),B.key,ROW_NUMBER() OVER (partition by key) AS row_num from B)where row_num=1
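
For reference, a minimal sketch of the point under discussion: an aggregate such as COUNT generally does not need a GROUP BY when it carries its own OVER (PARTITION BY ...) clause, because it is then evaluated as a window function. The sketch below is an assumption-laden illustration, not a tested Hive query: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Hive, the rows in B are made up, the alias t and the ORDER BY b inside ROW_NUMBER are added only to make the result deterministic, and "key" is quoted because it is a keyword in SQLite.

```python
import sqlite3

# SQLite (3.25+) stands in for Hive here; table B and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE B (b TEXT, "key" INTEGER)')
conn.executemany(
    "INSERT INTO B VALUES (?, ?)",
    [("x", 1), ("y", 1), ("z", 2)],
)

# COUNT(b) has its own OVER clause, so no GROUP BY is needed.
# ROW_NUMBER() then keeps exactly one row per key.
query = """
SELECT b, "key", cnt
FROM (
    SELECT b,
           "key",
           COUNT(b) OVER (PARTITION BY "key") AS cnt,
           ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY b) AS row_num
    FROM B
) t
WHERE row_num = 1
ORDER BY "key"
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each key comes back once, together with the number of rows that shared it, so the deduplication and the count happen in the same pass over B.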
