PIG: CONCAT A relation OUTPUT to another RELATION


Sorry if the question is badly phrased. I am new to Cloudera and completely new to Pig, and I am experimenting on my own.

 

I have a scenario where I need to process a words.txt file and a data.txt file.

 

words.txt

word1
word2
word3
word4

data.txt

{"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}

I need to get the output as

(word1_epochtime){complete data which matched in text attribute}

i.e.

(word1_1234567890){"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}

So far I have got the output as

(word1){"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}

by using this script.

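-- Loaders and schema are assumptions; adjust them to match how the files are actually loaded.
words = LOAD 'words.txt' AS (word:chararray);
data = LOAD 'data.txt' USING JsonLoader('created_at:chararray, text:chararray, user_id:long, id:long');
-- Pair every word with every record, then keep the pairs whose text contains the word.
c = CROSS words, data;
d = FILTER c BY (data::text MATCHES CONCAT(CONCAT('.*',words::word),'.*'));
e = FOREACH (GROUP d BY words::word) GENERATE group, d;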

and I got the epoch time appended to the words with

-- CurrentTime() takes no arguments; it returns the time at which the script runs.
time = FOREACH words GENERATE CONCAT(CONCAT(word,'_'),(chararray)ToUnixTime(CurrentTime()));

But I am unable to combine the two, so that the word with the time appended becomes the key of the grouped output.

How can I get the output as

(word1_epochtime){data}

Any suggestions are welcome.

 

Thank you.

1 ACCEPTED SOLUTION


I think I got it on my own.

Here is the script that I have written:

 

res = FILTER c BY (data::text MATCHES CONCAT(CONCAT('.*',words::word),'.*'));
-- Keep the data fields next to the new key so they survive the GROUP (field names assume the schema implied by data.txt).
epoch = FOREACH res GENERATE CONCAT(CONCAT(word,'_'),(chararray)ToUnixTime(CurrentTime())) AS epochtime, created_at, text, user_id, id;
res1 = GROUP epoch BY epochtime;
DUMP res1;
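
The script above appends the time at which the job runs. If the epoch is instead meant to come from each record's own created_at value, something along these lines might work. This is only a sketch: the loaders, the schema, the aliases keyed and res2, and the date pattern are assumptions based on the sample line in data.txt, and the resulting epoch depends on the default time zone of the cluster.

```pig
-- Assumed loaders and schema; adjust to however words.txt and data.txt are actually loaded.
words = LOAD 'words.txt' AS (word:chararray);
data  = LOAD 'data.txt' USING JsonLoader('created_at:chararray, text:chararray, user_id:long, id:long');

-- Pair every word with every record and keep the pairs whose text contains the word.
c   = CROSS words, data;
res = FILTER c BY (data::text MATCHES CONCAT(CONCAT('.*', words::word), '.*'));

-- Parse created_at (e.g. "18:47:31,Sun Sep 30 2012") with a Joda-style pattern,
-- convert it to epoch seconds, and build the word_epoch key next to the data fields.
keyed = FOREACH res GENERATE
            CONCAT(CONCAT(words::word, '_'),
                   (chararray)ToUnixTime(ToDate(data::created_at, 'HH:mm:ss,EEE MMM dd yyyy'))) AS word_epoch,
            data::created_at AS created_at,
            data::text AS text,
            data::user_id AS user_id,
            data::id AS id;

-- One group per word_epoch key, each holding the matching records.
res2 = GROUP keyed BY word_epoch;
DUMP res2;
```

Note that CROSS compares every word with every record, so this approach is only practical while words.txt stays small.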

 

 

