Created 09-20-2017 02:39 PM
I have a file named YYYYMMDD_claims_portal.csv. I need only the YYYYMMDD part, stored as the value of a new column (say, date). The file currently has three columns: Claim, User, ID. Now I need to add one more column, date, whose value is the YYYYMMDD taken from the file name. Please help, it's a bit urgent.
Thanks in advance for any help you guys can provide.
Created 09-20-2017 02:59 PM
Please try this:
A = LOAD 'YYYYMMDD_claims_portal.csv' using PigStorage(',','-tagFile');
y = FOREACH A GENERATE SUBSTRING($0,0,8),$1..;
DUMP y;
(With '-tagFile', the input file name comes as the first field in each tuple, and SUBSTRING($0,0,8) keeps its first eight characters.) You can modify it from there as you wish.
Created 09-21-2017 01:43 AM
Thanks for your prompt reply. I'll try this approach, but -tagFile tags the file name in with all the other columns. What I want is to create a new column, such as date, and store the file name in it.
Thank you.
Created 09-21-2017 06:59 AM
OK, do this:
A = LOAD 'YYYYMMDD_claims_portal.csv' using PigStorage(',','-tagFile') AS (filename:chararray, {other columns as per your schema});
y = FOREACH A GENERATE $1..,SUBSTRING(filename,0,8) AS day;
describe y;
DUMP y;
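For the three columns mentioned in the question, the script might look like the sketch below. The field names claim, user_name, id and file_date and the chararray types are only assumptions for illustration; replace them with your real schema.

```pig
-- Assumed schema: the three columns from the question, all read as chararray.
-- '-tagFile' prepends the input file name as the first field of each tuple.
A = LOAD 'YYYYMMDD_claims_portal.csv'
    USING PigStorage(',', '-tagFile')
    AS (filename:chararray, claim:chararray, user_name:chararray, id:chararray);

-- Keep the original columns and append the first 8 characters of the file name.
y = FOREACH A GENERATE claim, user_name, id, SUBSTRING(filename, 0, 8) AS file_date;

DESCRIBE y;
DUMP y;
```

The file name is dropped from the front and only its first eight characters are kept, as the last column.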
Created 09-24-2017 04:12 AM
Thanks @tsharma.. This works.. Thank you 🙂