How to get the desired grouping result pig

Rising Star

Hi guys, I have been struggling with my Pig code and haven't arrived at the desired result, which is why I'm reaching out to you. I have a file with some information, and my idea is to get a count per reference number. As an overview, this is what I have done:

[Screenshot: 16011-pig.png]

The third step worked, but the problem is that it generates a huge tuple: the reference number is repeated for every tuple in the grouping bag that contains that number, so the output looks like:

[Screenshot: 16012-pig.png]

Then I tried the fourth step. Although I got the list of counts, I lost the reference_number. I would like to get the same list, but with the reference code shown just once per group.

Thanks so much for your help team. @Lester Martin

1 ACCEPTED SOLUTION

Re: How to get the desired grouping result pig

Hi @Andres Urrego,

You need to modify your third step as follows:

october_total_station = FOREACH october_station_gr GENERATE FLATTEN(group) , COUNT(october.s_station);
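Since the original screenshots are not reproduced in the thread, here is a minimal sketch of how the whole pipeline might look. Only `october`, `october_station_gr`, `october_total_station` and `s_station` come from the thread; the input path `october.csv`, the delimiter, and the `trip_id` field are assumptions made up for illustration, as is the guess that the grouping key is `s_station`.

```pig
-- Hypothetical input path and schema (not from the thread).
october = LOAD 'october.csv' USING PigStorage(',')
          AS (trip_id:chararray, s_station:chararray);

-- GROUP produces one tuple per key:
--   (group, {bag of every october tuple that has that key})
-- Assumed here: the reference number being grouped on is s_station.
october_station_gr = GROUP october BY s_station;

-- Emit the key once per group, plus the size of its bag,
-- instead of projecting the whole bag into the output.
october_total_station = FOREACH october_station_gr
                        GENERATE FLATTEN(group),
                                 COUNT(october.s_station) AS total;

DUMP october_total_station;
```

Running `DESCRIBE october_station_gr;` before the `FOREACH` is a quick way to confirm that the relation has exactly two fields, `group` and a bag named `october`, which is why the key only needs to be generated once.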

Re: How to get the desired grouping result pig

Rising Star

Got it, I'm starting to understand how grouping works in Pig. Actually, to be sure, I did:

october_gr_counting = FOREACH october_station_gr GENERATE group, COUNT(october);
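The two variants in this thread should give the same counts on clean data, but they can differ when nulls are present: Pig's `COUNT` skips a tuple whose first field is null, while `COUNT_STAR` counts every tuple in the bag. A rough side-by-side sketch, assuming `october_station_gr` was built with `GROUP october BY s_station` (the output aliases are hypothetical names chosen for this example):

```pig
-- Assumes: october_station_gr = GROUP october BY s_station;
october_count_compare = FOREACH october_station_gr GENERATE
    group                    AS s_station,
    -- skips tuples whose first field in october is null
    COUNT(october)           AS n_first_field_not_null,
    -- skips tuples whose s_station is null
    COUNT(october.s_station) AS n_station_not_null,
    -- counts every tuple in the bag, nulls included
    COUNT_STAR(october)      AS n_all_rows;
```

When the group key is a single field, as assumed here, `group` is already a scalar, so generating `group` or `FLATTEN(group)` should produce the same column; `FLATTEN` matters when grouping by several fields, because `group` is then a tuple.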

thanks so much buddy.
