Created 11-23-2016 08:49 AM
I have two questions about FLATTEN and COGROUP.
Doubt 1:
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
employee_details.txt
001,Robin,22,newyork
002,BOB,23,Kolkata
003,Maya,23,Tokyo
004,Sara,25,London
005,David,23,Bhuwaneshwar
006,Maggy,22,Chennai
grunt> cogroup_data = COGROUP student_details1 by age, employee_details by age;
grunt> dump cogroup_data;
(21,{(1,Rajiv,Reddy,21,9848022337,Hyderabad),(4,Preethi,Agarwal,21,9848022330,Pune)},{})
(22,{(2,siddarth,Battacharya,22,9848022338,Kolkata),(3,Rajesh,Khanna,22,9848022339,Delhi)},{(1,Robin,22,newyork ),(6,Maggy,22,Chennai)})
(23,{(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),(6,Archana,Mishra,23,9848022335,Chennai)},{(2,BOB,23,Kolkata ),(3,Maya,23,Tokyo ),(5,David,23,Bhuwaneshwar )})
(24,{(7,Komal,Nayak,24,9848022334,trivendram),(8,Bharathi,Nambiayar,24,9848022333,Chennai)},{})
(25,{},{(4,Sara,25,London )})
join_cogroup = FOREACH cogroup_data GENERATE group,FLATTEN(student_details1);
I also need the record (25,{},{(4,Sara,25,London )}) in the output of join_cogroup. How can I get it?
Doubt 2:
grunt> coustomer_orders = JOIN customers BY id, orders BY customer_id;
I want to do the same thing using COGROUP plus FLATTEN. According to the Pig textbook: "In fact, cogroup plus foreach, where each bag is flattened, is equivalent to a join, as long as there are no null values in the keys."
I tried the statement below but did not get the required output.
join_cogroup = FOREACH cogroup_data GENERATE group,FLATTEN(student_details1);
Created 11-24-2016 09:26 AM
Hi Vamsi,
Please refer to the links below.
http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig
https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html
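In case it helps, here is a minimal sketch for both doubts, reusing the relation names from the question. FLATTEN discards a row when the bag it is applied to is empty, which is why the age 25 group disappears from join_cogroup. Replacing an empty bag with null before flattening follows the outer-join idiom shown for IsEmpty in the Pig UDF documentation. I have not run this against these exact files, and it assumes both relations were loaded with schemas. The aliases outer_cogroup, grouped and customer_orders_cg are names chosen only for illustration.

```pig
-- Same COGROUP as in the question.
cogroup_data = COGROUP student_details1 BY age, employee_details BY age;

-- Doubt 1: keep groups whose bag is empty (for example age 25).
-- An empty bag is swapped for null before FLATTEN, so the row is not discarded.
outer_cogroup = FOREACH cogroup_data GENERATE
    group,
    FLATTEN((IsEmpty(student_details1) ? null : student_details1)),
    FLATTEN((IsEmpty(employee_details) ? null : employee_details));

-- Doubt 2: the COGROUP form of
--   JOIN customers BY id, orders BY customer_id;
-- Both bags must be flattened. Flattening only one bag is not a join.
grouped = COGROUP customers BY id, orders BY customer_id;
customer_orders_cg = FOREACH grouped GENERATE
    FLATTEN(customers),
    FLATTEN(orders);
```

With the first statement, the age 25 row should be kept, with the student side coming out as null rather than the row being dropped. The second statement behaves like an inner join: any key whose bag is empty on either side is dropped by FLATTEN, and the two forms differ for null keys, as the textbook passage notes.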
Thanks,
Mahesh