Created 04-23-2016 02:44 AM
Hi,
This is my schema after COGROUP:
C: {group: chararray,A: {(Name: chararray,Team: chararray,Positions: {T: (position: chararray)},Role: map[])},A1: {(Name1: chararray,Team1: chararray,Points: int)}}
I want to filter out the records of C whose A1 is empty, like the record below:
(Jake Fox,{(Jake Fox,Chicago Cubs,{(Infielder),(Catcher),(Outfielder),(First_baseman)},[hit_by_pitch#5,games#89,on_base_percentage#0.305,grand_slams#1,home_runs#11,sacrifice_flies#6,at_bats#230,gdb#6,ibbs#1,base_on_balls#15,hits#58,rbis#45,slugging_percentage#0.457,batting_average#0.252,doubles#14,runs#26,strikeouts#49])},{})
I tried a nested FOREACH, but it did not help; the output was an empty bag.
Could someone post the query? Many thanks!
Created 04-25-2016 11:43 PM
You can use IsEmpty to check whether A1 is empty. Try something like this:
grouped = COGROUP ..... ;
filtered = FILTER grouped BY NOT IsEmpty($2);
DUMP filtered;
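For the schema in the question, the bag can also be referenced by name rather than by position. A minimal sketch, assuming the cogrouped relation keeps the alias C and the second bag is named A1 as in the posted schema (the alias C_filtered is made up for illustration):

```pig
-- Keep only the groups whose A1 bag has at least one tuple.
C_filtered = FILTER C BY NOT IsEmpty(A1);
DUMP C_filtered;
```

With this filter, a record such as the Jake Fox one, whose last field is {}, should no longer appear in the output.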
Here's an example that shows how this works for something similar:
cat > owners.csv
adam,cat
adam,dog
alex,fish
david,horse
alice,cat
steve,dog
cat > pets.csv
nemo,fish
fido,dog
rex,dog
paws,cat
wiskers,cat
owners = LOAD 'owners.csv' USING PigStorage(',') AS (owner:chararray,animal:chararray);
pets = LOAD 'pets.csv' USING PigStorage(',') AS (name:chararray,animal:chararray);
grouped = COGROUP owners BY animal, pets BY animal;
filtered = FILTER grouped BY NOT IsEmpty($2);
DUMP grouped;
(cat,{(alice,cat),(adam,cat)},{(wiskers,cat),(paws,cat)})
(dog,{(steve,dog),(adam,dog)},{(rex,dog),(fido,dog)})
(horse,{(david,horse)},{})
(fish,{(alex,fish)},{(nemo,fish)})
DUMP filtered;
(cat,{(alice,cat),(adam,cat)},{(wiskers,cat),(paws,cat)})
(dog,{(steve,dog),(adam,dog)},{(rex,dog),(fido,dog)})
(fish,{(alex,fish)},{(nemo,fish)})
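The positional reference $2 can be replaced by the bag's name, since COGROUP names each bag after its input relation. The sketch below reuses owners, pets, and grouped from the example above; the aliases filtered_by_name, unmatched, and grouped_inner are hypothetical. The last statement relies on the INNER keyword of COGROUP, which, as far as I know, drops groups with an empty pets bag at grouping time, so it is worth checking against your Pig version.

```pig
-- Same filter as above, written with the bag name instead of $2.
filtered_by_name = FILTER grouped BY NOT IsEmpty(pets);

-- The opposite case: keep only the groups whose pets bag is empty.
unmatched = FILTER grouped BY IsEmpty(pets);
DUMP unmatched;
-- (horse,{(david,horse)},{})

-- Alternative: ask COGROUP itself to skip groups where pets is empty.
grouped_inner = COGROUP owners BY animal, pets BY animal INNER;
DUMP grouped_inner;
```

If the INNER variant behaves as assumed, grouped_inner should contain the same cat, dog, and fish groups as filtered, without a separate FILTER step.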
Created 05-05-2016 05:46 PM
Hi @Revathy Mourouguessane, have you tried this solution?
Created 05-06-2016 12:54 AM
Hi Abdel, I haven't tried this one; I used a JOIN instead. I will give it a try. Thank you.