Pig flatten and cogroup clarification

I have two questions about FLATTEN and COGROUP in Pig.

Doubt 1:

student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai

employee_details.txt
001,Robin,22,newyork 
002,BOB,23,Kolkata 
003,Maya,23,Tokyo 
004,Sara,25,London 
005,David,23,Bhuwaneshwar 
006,Maggy,22,Chennai

grunt> cogroup_data = COGROUP student_details1 by age, employee_details by age;

grunt> dump cogroup_data;
(21,{(1,Rajiv,Reddy,21,9848022337,Hyderabad),(4,Preethi,Agarwal,21,9848022330,Pune)},{})
(22,{(2,siddarth,Battacharya,22,9848022338,Kolkata),(3,Rajesh,Khanna,22,9848022339,Delhi)},{(1,Robin,22,newyork ),(6,Maggy,22,Chennai)})
(23,{(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),(6,Archana,Mishra,23,9848022335,Chennai)},{(2,BOB,23,Kolkata ),(3,Maya,23,Tokyo ),(5,David,23,Bhuwaneshwar )})
(24,{(7,Komal,Nayak,24,9848022334,trivendram),(8,Bharathi,Nambiayar,24,9848022333,Chennai)},{})
(25,{},{(4,Sara,25,London )})

join_cogroup = FOREACH cogroup_data GENERATE group, FLATTEN(student_details1);

The group (25,{},{(4,Sara,25,London )}) is missing from the output of join_cogroup. I need this record in the output as well. How can I get it?
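
For context on why that group disappears: FLATTEN produces no output row when the bag it is applied to is empty, and the age-25 group has an empty student_details1 bag. A minimal sketch of one way to see which groups get dropped, assuming the cogroup_data relation from above (the alias no_students is hypothetical, introduced only for illustration):

```pig
-- FLATTEN emits nothing for an empty bag, so select those groups explicitly.
no_students = FILTER cogroup_data BY IsEmpty(student_details1);
DUMP no_students;
```

With the data shown above, only the age-25 group has an empty student_details1 bag. Another option described in Pig references is to replace the empty bag with a constant placeholder bag through a bincond (IsEmpty(...) ? ... : ...) before flattening; the placeholder tuple would have to match the schema of student_details1, which is not shown in the post.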

Doubt 2:

grunt> coustomer_orders = JOIN customers BY id, orders BY customer_id;

I want to do the same thing using COGROUP and FLATTEN. According to the Pig textbook, "cogroup plus foreach, where each bag is flattened, is equivalent to a join, as long as there are no null values in the keys."

I tried the statement below, but it does not produce the required output.

join_cogroup = FOREACH cogroup_data GENERATE group,FLATTEN(student_details1);
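
The statement above flattens only one of the two bags, so it returns student rows per age rather than joined rows. A minimal sketch of the cogroup-plus-flatten form of an inner join on age, assuming the same student_details1 and employee_details relations. Flattening both bags in one FOREACH yields every student/employee pairing within each group, and groups where either bag is empty produce no rows:

```pig
-- Intended to match: JOIN student_details1 BY age, employee_details BY age;
cogroup_data = COGROUP student_details1 BY age, employee_details BY age;
-- Flatten both bags so each student tuple is paired with each employee tuple of the same age.
join_cogroup = FOREACH cogroup_data GENERATE FLATTEN(student_details1), FLATTEN(employee_details);
DUMP join_cogroup;
```

With the sample data, only ages 22 and 23 have rows on both sides, so this should give 2 x 2 + 2 x 3 = 10 rows. The customers and orders case should follow the same pattern: cogroup customers BY id and orders BY customer_id, then flatten both bags.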