Pig flatten and cogroup clarification

I have two questions about FLATTEN and COGROUP.

Doubt 1:

student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
employee_details.txt
001,Robin,22,newyork 
002,BOB,23,Kolkata 
003,Maya,23,Tokyo 
004,Sara,25,London 
005,David,23,Bhuwaneshwar 
006,Maggy,22,Chennai

grunt> cogroup_data = COGROUP student_details1 by age, employee_details by age;

grunt> dump cogroup_data;
(21,{(1,Rajiv,Reddy,21,9848022337,Hyderabad),(4,Preethi,Agarwal,21,9848022330,Pune)},{})
(22,{(2,siddarth,Battacharya,22,9848022338,Kolkata),(3,Rajesh,Khanna,22,9848022339,Delhi)},{(1,Robin,22,newyork ),(6,Maggy,22,Chennai)})
(23,{(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),(6,Archana,Mishra,23,9848022335,Chennai)},{(2,BOB,23,Kolkata ),(3,Maya,23,Tokyo ),(5,David,23,Bhuwaneshwar )})
(24,{(7,Komal,Nayak,24,9848022334,trivendram),(8,Bharathi,Nambiayar,24,9848022333,Chennai)},{})
(25,{},{(4,Sara,25,London )})

grunt> join_cogroup = FOREACH cogroup_data GENERATE group, FLATTEN(student_details1);

I also need the record (25,{},{(4,Sara,25,London )}) in the output of join_cogroup. How can I get that record?
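For context, Pig's FLATTEN emits one output row per tuple in a bag, so a group whose bag is empty emits nothing. That is why key 25, whose student_details1 bag is {}, would not appear. The sketch below is a Python model of that behavior, not Pig code: the rows are copied from the dump above and shortened to (id, name), and the helper names `flatten_students` and `flatten_students_keep_empty` are hypothetical. The second helper substitutes a single all-None placeholder tuple for an empty bag. In Pig the analogous step would presumably be a conditional on the built-in IsEmpty, but that port is an assumption and is not tested here.

```python
# Rows copied from the dump above, shortened to (id, name) per tuple.
# Shape of each row: (group, student_bag, employee_bag).
cogroup_data = [
    (21, [(1, "Rajiv"), (4, "Preethi")], []),
    (22, [(2, "siddarth"), (3, "Rajesh")], [(1, "Robin"), (6, "Maggy")]),
    (23, [(5, "Trupthi"), (6, "Archana")], [(2, "BOB"), (3, "Maya"), (5, "David")]),
    (24, [(7, "Komal"), (8, "Bharathi")], []),
    (25, [], [(4, "Sara")]),
]


def flatten_students(rows):
    """Model of GENERATE group, FLATTEN(student bag): one row per tuple in the bag."""
    return [(group, *student) for group, students, _ in rows for student in students]


def flatten_students_keep_empty(rows, width=2):
    """Same model, but an empty bag is replaced by one all-None tuple (assumed workaround)."""
    placeholder = [(None,) * width]
    return [
        (group, *student)
        for group, students, _ in rows
        for student in (students or placeholder)
    ]


plain = flatten_students(cogroup_data)
kept = flatten_students_keep_empty(cogroup_data)

plain_groups = sorted({row[0] for row in plain})
kept_groups = sorted({row[0] for row in kept})
print(plain_groups)  # group 25 is absent: its student bag is empty
print(kept_groups)   # group 25 survives with None in the student fields
```

In this model the employee bag is ignored entirely, exactly as in the join_cogroup statement above, so the surviving row for key 25 carries only the group key and null student fields.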

Doubt 2:

grunt> coustomer_orders = JOIN customers BY id, orders BY customer_id;

I want to do the same thing using COGROUP plus FLATTEN. According to the Pig textbook, cogroup plus foreach, where each bag is flattened, is equivalent to a join, as long as there are no null values in the keys.

I tried the following, but it does not give the required output:

join_cogroup = FOREACH cogroup_data GENERATE group,FLATTEN(student_details1);
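Note that the statement above flattens only one bag, while the textbook equivalence needs every bag flattened in the same GENERATE (in Pig terms, something like `FLATTEN(customers), FLATTEN(orders)` over a COGROUP of those two relations). The Python sketch below models why that matches an inner join. It is an illustration of the semantics, not Pig code: the sample `customers` and `orders` rows are hypothetical, since the post does not show them, and the helpers `cogroup`, `flatten_both`, and `inner_join` are invented names. Null keys are not modeled.

```python
from collections import defaultdict

# Hypothetical sample data; the post does not show these relations.
customers = [(1, "Asha"), (2, "Ben"), (3, "Chen")]  # (id, name)
orders = [  # (order_id, customer_id, amount)
    (101, 1, 250),
    (102, 1, 90),
    (103, 3, 40),
    (104, 9, 15),
]


def cogroup(left, left_key, right, right_key):
    """Model of COGROUP: one row per key, holding a bag of matching rows from each input."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[left_key]][0].append(row)
    for row in right:
        groups[row[right_key]][1].append(row)
    return [(key, bags[0], bags[1]) for key, bags in sorted(groups.items())]


def flatten_both(cogrouped):
    """Model of flattening both bags: the cross product of the two bags per key."""
    return [l + r for _, lbag, rbag in cogrouped for l in lbag for r in rbag]


def inner_join(left, left_key, right, right_key):
    """Reference nested-loop inner join for comparison."""
    return [l + r for l in left for r in right if l[left_key] == r[right_key]]


cogrouped = cogroup(customers, 0, orders, 1)
via_cogroup = flatten_both(cogrouped)
via_join = inner_join(customers, 0, orders, 1)

print(sorted(via_cogroup) == sorted(via_join))
```

Keys that appear on only one side (customer 2 and customer_id 9 in this sample) have one empty bag, so the cross product for that key is empty and the row vanishes, which is the inner-join behavior.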