Pig converting tuple to bag

I have grouped data as shown below:

group_data = GROUP student_details by age;
dump group_data;
(21,{(4,Preethi,Agarwal,21,9848022330,Pune),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
(22,{(3,Rajesh,Khanna,22,9848022339,Delhi),(2,siddarth,Battacharya,22,9848022338,Kolkata)})
(23,{(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram)})


The required output from group_data keeps only the Chennai students, still wrapped in a bag per age:

(23,{(6,Archana,Mishra,23,9848022335,Chennai)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})

I tried the following.

First try (this flattens the bag, so each row comes back as a plain tuple instead of a bag):

t = foreach group_data generate group, FLATTEN(student_details);
filter_data = FILTER t BY student_details::city == 'Chennai';
(23,6,Archana,Mishra,23,9848022335,Chennai)
(24,8,Bharathi,Nambiayar,24,9848022333,Chennai)


Second try (this keeps the bags, but ages with no Chennai students come back as empty bags):

t_1 = FOREACH group_data {
    t_2 = FILTER student_details BY city == 'Chennai';
    GENERATE group,t_2 AS failTime;
};

(21,{})
(22,{})
(23,{(6,Archana,Mishra,23,9848022335,Chennai)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
1 ACCEPTED SOLUTION

I got the required output by filtering out the rows whose bag is empty:

isempty_data_1 = filter t_1 by SIZE(failTime)>0;

(23,{(6,Archana,Mishra,23,9848022335,Chennai)})

(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
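
An alternative that avoids the empty bags altogether is to filter on city before grouping. This is only a minimal sketch: the input path and every field name other than age and city are assumptions, since the original LOAD statement is not shown in the question.

```pig
-- Hypothetical load step: the path and the field names id, firstname,
-- lastname and phone are assumed; only age and city appear in the question.
student_details = LOAD 'student_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray,
        age:int, phone:chararray, city:chararray);

-- Keep only the Chennai rows first, so no empty bags are ever created.
chennai_students = FILTER student_details BY city == 'Chennai';

-- Grouping the filtered relation yields one bag per age that has a match.
chennai_by_age = GROUP chennai_students BY age;

DUMP chennai_by_age;
```

With the sample data above, this should leave only the groups for ages 23 and 24. If the nested FOREACH from the second try is kept instead, the built-in IsEmpty function can replace the size check, for example filter t_1 by NOT IsEmpty(failTime).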
