Apache Pig script alias name usage

I am working through a basic Pig script that counts the number of words:

A = load 'Desktop/wordcount.txt' as (col1:chararray);
B = foreach A generate flatten(TOKENIZE(col1)) as word;
grouped = group B by word;
cnt = foreach grouped generate B, COUNT(B);
dump cnt;

The input file wordcount.txt has the following contents:

This is pig test    
This is pig test

In the code above, word is used as an alias for foreach A generate flatten(TOKENIZE(col1)).
I do not fully understand the purpose of the line below
grouped = group B by word;
or the role the alias plays in it.

1 REPLY

A = load 'Desktop/wordcount.txt' as (col1:chararray);
B = foreach A generate flatten(TOKENIZE(col1)) as (word:chararray);
C = group B by word;
cnt = foreach C generate flatten(group), COUNT(B.word);
dump cnt;
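
To see what the group step does, it can help to inspect the intermediate relation. The following is a minimal sketch, assuming the same two-line wordcount.txt and the same relation names as the script above. The results shown in the comments are what I would expect for that sample; they are not captured from a real run, and row order may differ.

```pig
-- Load each line of the file as a single chararray field.
A = load 'Desktop/wordcount.txt' as (col1:chararray);

-- 'word' names the field produced by flattening the bag of tokens.
-- B is the alias of the relation that holds those one-field rows.
B = foreach A generate flatten(TOKENIZE(col1)) as (word:chararray);

-- group collects the rows of B that share the same value of 'word'.
-- Each output row has two fields: 'group' (the key) and 'B' (a bag of the matching rows).
C = group B by word;
describe C;
-- expected schema, roughly: C: {group: chararray,B: {(word: chararray)}}
dump C;
-- expected for the sample file:
-- (is,{(is),(is)})
-- (pig,{(pig),(pig)})
-- (test,{(test),(test)})
-- (This,{(This),(This)})

-- COUNT(B) counts the tuples in each bag, giving one (word, count) row per key.
cnt = foreach C generate group, COUNT(B);
dump cnt;
-- expected: (is,2) (pig,2) (test,2) (This,2), one per line
```

In other words, word is a field alias that lets the group statement refer to the token column by name (without it you would write group B by $0), while B and C are relation aliases. In the script from the question, generate B, COUNT(B) prints the whole bag next to the count, which is why the group key is normally used as the first output field instead.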