Created 07-26-2017 10:11 AM
I have two relations:
Relation A
1 w
2 x
3 y
Relation B
1 z
2 x
3 y
4 K
5 L
I want to merge these relations into a third relation with no duplicates, using a Pig script:
Relation C
1 w
2 x
3 y
4 K
5 L
Created 07-26-2017 06:35 PM
You could use a UNION followed by a DISTINCT:
C = UNION A, B;
D = DISTINCT C;
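For reference, here is a minimal sketch of what this produces on the sample data from the question, assuming A and B are already loaded as two-column relations. DISTINCT compares whole tuples, so (1,w) and (1,z) count as different rows and both are kept.

```pig
-- Assumes A and B are already loaded with two fields each.
C = UNION A, B;      -- rows of A and B together, duplicates included
D = DISTINCT C;      -- drops rows that are identical in every field
DUMP D;
-- Rows in D for the sample data (output order is not guaranteed):
-- (1,w) (1,z) (2,x) (3,y) (4,K) (5,L)
```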
If there is a schema attached to the input relations, you may need something like the following:
A = LOAD 'file1.txt' USING PigStorage(',') AS (c1:chararray, c2:chararray);
B = LOAD 'file2.txt' USING PigStorage(',') AS (c1:chararray, c2:chararray);
C = UNION A, B;
D = GROUP C BY (c1, c2);
E = FOREACH D GENERATE group.c1, group.c2;
DUMP E;
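Grouping on (c1, c2) removes rows that match in both columns, so 1 w and 1 z would both remain. If the goal is one row per c1 value, with A's row winning when both relations have that key (as Relation C in the question suggests), one possible sketch is below. It assumes comma-separated input files; the src field and the names A_tagged, B_tagged, U, G, by_src and best are made up for illustration.

```pig
A = LOAD 'file1.txt' USING PigStorage(',') AS (c1:chararray, c2:chararray);
B = LOAD 'file2.txt' USING PigStorage(',') AS (c1:chararray, c2:chararray);

-- Tag each row with its source so that rows from A sort first.
-- src is a hypothetical helper field, not part of the original data.
A_tagged = FOREACH A GENERATE c1, c2, 0 AS src;
B_tagged = FOREACH B GENERATE c1, c2, 1 AS src;

U = UNION A_tagged, B_tagged;
G = GROUP U BY c1;

-- Within each key, keep the single row with the lowest src,
-- which is A's row whenever A has that key.
E = FOREACH G {
    by_src = ORDER U BY src;
    best   = LIMIT by_src 1;
    GENERATE FLATTEN(best.(c1, c2));
};
DUMP E;
```

Note that if A itself holds several rows for the same c1, this keeps only one of them, and which one is not defined.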