Apache Pig - One Script per Table for Data Cleansing


Hi, I have four tables stored as .csv files. All of them can be connected through a fact table (also a .csv file). I want to do some data cleansing on these files and then load them into one big table in Hive. In Apache Pig, should I create a separate script for each table, or is it better to join the tables in Pig first and then apply the data cleansing to the resulting joined table? Thanks!

1 ACCEPTED SOLUTION

Hi @João Souza

Personally, I'd create one script per table. That way, if something changes, I can focus on the one affected table rather than modifying a larger script that covers all the tables. A single large script would also mean more code and a steeper learning curve for another developer.
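As a rough illustration of the one-script-per-table approach, a cleansing script for a single table might look like the sketch below. The script name `clean_customers.pig`, the field names, and the cleansing rules are all hypothetical, since the thread doesn't describe the actual schemas. It also assumes the CSV has no header row and no quoted fields containing commas.

```pig
-- clean_customers.pig: cleanses ONE table only (hypothetical name and schema).
-- $input and $output are supplied on the command line with -param.
raw = LOAD '$input' USING PigStorage(',')
      AS (customer_id:chararray, name:chararray, city:chararray);

-- Drop rows that have no usable key
with_key = FILTER raw BY customer_id IS NOT NULL AND TRIM(customer_id) != '';

-- Trim stray whitespace and normalize the case of the city column
cleaned = FOREACH with_key GENERATE
              TRIM(customer_id) AS customer_id,
              TRIM(name)        AS name,
              UPPER(TRIM(city)) AS city;

-- Remove exact duplicate rows
deduped = DISTINCT cleaned;

STORE deduped INTO '$output' USING PigStorage(',');
```

It could be run with something like `pig -param input=/data/raw/customers.csv -param output=/data/clean/customers clean_customers.pig` (the paths are placeholders). Each of the other tables would get its own similar script, and the join into the big Hive table would happen afterwards on the cleaned outputs.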
