How to get the desired grouping result pig

Hi guys, I have been struggling with my Pig code and haven't arrived at the desired result, so I'm turning to you. I have a file with some information, and my idea is to get a count by reference number. As an overview, this is what I have done:

[Image: 16011-pig.png]

The third step worked, but the problem is that it generates a huge tuple: the grouping bag repeats the reference number in every tuple that contains it, so the output looks like this:

[Image: 16012-pig.png]

Then I tried the fourth step. Although I got the list of counts, I lost the reference_number, so I would like to get the same list with each reference number shown just once.

Thanks so much for your help, team. @Lester Martin

1 ACCEPTED SOLUTION

Hi @Andres Urrego,

You need to modify your third step like this:

october_total_station = FOREACH october_station_gr GENERATE FLATTEN(group) , COUNT(october.s_station);
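
For anyone reading without the screenshots, here is a minimal sketch of how the whole flow might look. The input path, the delimiter, and the two-column schema are assumptions made for illustration (the original script was posted as an image); only the alias names come from this thread.

```pig
-- Hypothetical input path, delimiter, and schema; adjust to the real file.
october = LOAD 'october.csv' USING PigStorage(',')
          AS (s_station:chararray, reference_number:chararray);

-- Each output tuple is (group, october), where october is a bag holding
-- every full input tuple that shares that s_station value.
october_station_gr = GROUP october BY s_station;

-- Emit the key once per group and aggregate the bag instead of projecting it.
october_total_station = FOREACH october_station_gr
                        GENERATE group AS s_station, COUNT(october) AS total;

DUMP october_total_station;
```

Projecting the bag itself in GENERATE is what repeats the key in every nested tuple; aggregating it with COUNT collapses it to one number per group. FLATTEN(group) becomes important when grouping by several fields, because group is then a tuple and FLATTEN unpacks it into top-level columns.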

2 REPLIES

Got it, I'm starting to understand how grouping works in Pig. To be sure, I actually ran:

october_gr_counting = FOREACH october_station_gr GENERATE group, COUNT(october);
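
A side note on the two COUNT variants used in this thread: they usually agree, but they can differ when the data contains nulls, because COUNT skips a tuple whose first field is null. The sketch below puts them side by side with COUNT_STAR. As before, the LOAD path and schema are hypothetical, and the column aliases are made up for illustration.

```pig
-- Hypothetical input path, delimiter, and schema; adjust to the real file.
october = LOAD 'october.csv' USING PigStorage(',')
          AS (s_station:chararray, reference_number:chararray);

october_station_gr = GROUP october BY s_station;

october_gr_counting = FOREACH october_station_gr GENERATE
    group                    AS s_station,
    COUNT(october)           AS first_field_not_null, -- skips tuples whose first field is null
    COUNT(october.s_station) AS station_not_null,     -- skips tuples whose s_station is null
    COUNT_STAR(october)      AS all_rows;             -- counts every tuple in the bag

DUMP october_gr_counting;
```

If the file has no nulls in those columns, all three counts should match, so either form from the thread works.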

Thanks so much, buddy.