
PIG: CONCAT A relation OUTPUT to another RELATION

Sorry if the question is phrased badly. I am new to Cloudera and completely new to Pig, and I am experimenting on my own.

I have a scenario where I need to process two files, words.txt and data.txt.

words.txt

word1
word2
word3
word4

data.txt

{"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}

I need to get the output as

(word1_epochtime){complete data which matched in text attribute}

i.e.

(word1_1234567890){{"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}}

So far I have got this output

(word1){"created_at":"18:47:31,Sun Sep 30 2012","text":"RT @Joey7Barton: ..give a word1 about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...","user_id":450990391,"id":252479809098223616}

by using this script.

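words = LOAD 'words.txt' AS (word:chararray);
-- JsonLoader and this schema are assumed from the sample record; the LOAD statements were not shown in full.
data = LOAD 'data.txt' USING JsonLoader('created_at:chararray, text:chararray, user_id:long, id:long');
-- Pair every word with every record, then keep the pairs whose text contains the word.
c = CROSS words, data;
d = FILTER c BY (data::text MATCHES CONCAT(CONCAT('.*', words::word), '.*'));
e = FOREACH (GROUP d BY words::word) GENERATE group, d;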

Separately, I got each word with the epoch time appended using

time = FOREACH words GENERATE CONCAT(CONCAT(word, '_'), (chararray)ToUnixTime(CurrentTime()));

But I am unable to combine the word_epochtime value with the grouped data from the first script.

How can I get the output as

(word1_epochtime){data}

Any suggestions are welcome.

 

Thank you.

Accepted Solutions

Re: PIG: CONCAT A relation OUTPUT to another RELATION

I think I got it on my own.

Here is the script that I wrote:

res = FILTER c BY (data::text MATCHES CONCAT(CONCAT('.*', words::word), '.*'));
-- Build the word_epochtime key and keep the matched record's fields alongside it.
epoch = FOREACH res GENERATE CONCAT(CONCAT(words::word, '_'), (chararray)ToUnixTime(CurrentTime())) AS epochtime, data::created_at, data::text, data::user_id, data::id;
res1 = FOREACH (GROUP epoch BY epochtime) GENERATE group, epoch;
DUMP res1;
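
CurrentTime() gives the time the job runs, so every key gets the same suffix. If the epoch part is instead meant to come from each record's created_at, a variant along these lines might work. This is only a sketch: it assumes created_at was loaded as a chararray in the form shown in data.txt (18:47:31,Sun Sep 30 2012), that the Joda-style pattern 'HH:mm:ss,EEE MMM dd yyyy' matches that form, and that the default time zone is acceptable. The aliases epoch_by_tweet and res2 are made up for illustration, and the script has not been run against the original data.

```pig
-- Assumes `res` is the filtered cross product from the script above and that
-- created_at is a chararray such as '18:47:31,Sun Sep 30 2012'.
epoch_by_tweet = FOREACH res GENERATE
    CONCAT(CONCAT(word, '_'),
           (chararray)ToUnixTime(ToDate(created_at, 'HH:mm:ss,EEE MMM dd yyyy'))) AS epochtime,
    text AS text;

-- Collect the matched text under each word_epochtime key.
res2 = FOREACH (GROUP epoch_by_tweet BY epochtime) GENERATE group, epoch_by_tweet.text;
DUMP res2;
```

ToDate parses the string into a datetime and ToUnixTime converts that to seconds since the epoch, so records with different created_at values end up under different keys even when they match the same word.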