Extract timestamp from filename and add it as a new column (say, date) using Pig

New Contributor

I have a file named YYYYMMDD_claims_portal.csv. I need only the YYYYMMDD part, stored in a new column (say, date). The file currently has three columns: Claim, User, ID. I now need to add one more column, date, holding the YYYYMMDD value taken from the file name. Please help, it's a bit urgent.

Thanks in advance for any help you guys can provide.

Accepted Solutions

Re: Extract timestamp from filename and add it in new column(say,date) by using Pig

Expert Contributor

@Sumee singh

Please try this:

A = LOAD 'YYYYMMDD_claims_portal.csv' using PigStorage(',','-tagFile');

y = FOREACH A GENERATE SUBSTRING($0,0,8),$1..;

DUMP y;

(With '-tagFile', the input file name comes as the first field in each tuple, so $0 is the file name and $1 onward are the original columns.) You can modify the rest as you wish.
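As a rough sketch of why the tag helps: the same statement should also work over a glob, with each row tagged with the name of the file it came from. The glob pattern, the column names (taken from the question), and the chararray types are assumptions for illustration. As far as I know, '-tagFile' needs Pig 0.12 or later.

```pig
-- Hypothetical glob: matches every dated claims file in the input directory.
A = LOAD '*_claims_portal.csv' USING PigStorage(',', '-tagFile')
    AS (filename:chararray, claim:chararray, user:chararray, id:chararray);

-- filename is the field added by -tagFile; its first 8 characters are YYYYMMDD.
y = FOREACH A GENERATE SUBSTRING(filename, 0, 8) AS file_date, claim, user, id;

DUMP y;
```

Because the tag is applied per row, rows from different files keep their own date prefix after the load.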

Re: Extract timestamp from filename and add it in new column(say,date) by using Pig

New Contributor

@tsharma

Thanks for your prompt reply. I'll try this approach, but with -tagFile the file name is simply tagged onto the rows alongside the existing columns. What I want is to create a new column, such as date, and store the file name in it.

Thank you.

Re: Extract timestamp from filename and add it in new column(say,date) by using Pig

Expert Contributor

OK, try this:

A = LOAD 'YYYYMMDD_claims_portal.csv' USING PigStorage(',','-tagFile') AS (filename:chararray, claim:chararray, user:chararray, id:chararray); -- columns as in the question; adjust names and types to your schema

y = FOREACH A GENERATE $1..,SUBSTRING(filename,0,8) AS day;

DESCRIBE y;

DUMP y;
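If the new column should be a real date rather than a string, one possible follow-up is to parse the prefix with ToDate and write the result out with STORE. This is only a sketch: the column names and types, the output directory 'claims_with_day', and the filter on the file name are assumptions, and the literal YYYYMMDD in the path stands for an actual date such as the one in your file name.

```pig
A = LOAD 'YYYYMMDD_claims_portal.csv' USING PigStorage(',', '-tagFile')
    AS (filename:chararray, claim:chararray, user:chararray, id:chararray);

-- Keep only rows whose file name starts with eight digits and an underscore,
-- so ToDate below is never handed a prefix that is not a date.
B = FILTER A BY filename MATCHES '[0-9]{8}_.*';

-- ToDate parses the 8-character prefix with a Joda-Time style pattern.
y = FOREACH B GENERATE claim, user, id,
        ToDate(SUBSTRING(filename, 0, 8), 'yyyyMMdd') AS day;

-- Hypothetical output directory; it must not already exist.
STORE y INTO 'claims_with_day' USING PigStorage(',');
```

Keeping the plain SUBSTRING version is fine if a YYYYMMDD string is all that downstream jobs need.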

Re: Extract timestamp from filename and add it in new column(say,date) by using Pig

New Contributor

Thanks @tsharma. This works. Thank you :)