Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Extract timestamp from filename and add it in new column(say,date) by using Pig

avatar
Explorer

I have a file with name YYYYMMDD_claims_portal.csv, i need only YYYYMMDD part and store this value in new column(say,date). Earlier we have 3 column like Claim,User,ID. now i need to add one more column date having value as YYYYMMDD as per file. Please help, its bit urgent.

Thanks in advance for any help you guys can provide.

1 ACCEPTED SOLUTION

avatar
Super Collaborator
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login
4 REPLIES 4

avatar
Super Collaborator
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login

avatar
Explorer

@tsharma

Thanks for your prompt reply..i'll try this approach but by-tagFile we tagged file name with all the column name, here what i want is to create a new column like date and store the file name in it..

Thank you.

avatar
Super Collaborator

Ok, do this:-

A = LOAD 'YYYYMMDD_claims_portal.csv' using PigStorage(',','-tagFile') AS (filename:chararray, {other columns as per your schema})

y = FOREACH A GENERATE $1..,SUBSTRING(filename,0,8) AS day;

describe y;

DUMP y;

avatar
Explorer

Thanks @tsharma.. This works.. Thank you 🙂