Pig Script to insert new column based on filename


Re: Pig Script to insert new column based on filename

@João Souza

Can you please check the home directory of the user who is running the pig script?

Pig usually writes a log file there, with a name like /root/pig_1465986765983.log.

Thanks and Regards,

Sindhu

Re: Pig Script to insert new column based on filename

Hi Sindhu, yes, the directory is right. The files are placed in: user -> cloudera -> Analytics (folder created) -> source (folder created). Do you think the script does what I want?

Re: Pig Script to insert new column based on filename

@João Souza

PigStorage accepts a -tagFile option. When it is specified, PigStorage prepends a column, INPUT_FILE_NAME, holding the name of the file each record was read from.

i/p:

1,a

2,b

3,c

o/p:

pigtest.txt,1,a

pigtest.txt,2,b

pigtest.txt,3,c

A = load '/pigtest/pigtest.txt' using PigStorage(',', '-tagFile'); /* The first column of the output will be INPUT_FILE_NAME */

B = FOREACH A GENERATE $0,$1,$2;

DUMP B;
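
If you want to refer to the file name by name rather than by position, something along these lines might work. This is only a sketch: the field names (filename, id, val) and the regular expression are my own assumptions and not part of the original script, and it assumes a Pig version where PigStorage supports -tagFile.

```pig
-- Sketch only: filename, id and val are assumed field names.
-- With -tagFile, the file name is the first field, so it comes first in the schema.
A = LOAD '/pigtest/pigtest.txt' USING PigStorage(',', '-tagFile')
    AS (filename:chararray, id:int, val:chararray);

-- Keep the data columns and append a new column derived from the file name.
-- REGEX_EXTRACT returns capture group 1, here the name without the .txt extension.
B = FOREACH A GENERATE id, val, REGEX_EXTRACT(filename, '(.*)\\.txt', 1) AS source;

DUMP B;
```

You can adjust the regular expression to pull out whatever part of the file name should become the new column.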

Please let me know if it is helpful.

Regards,

Ravi.
