Apache PIG - Script per Table to data cleansing

Hi, I have four tables stored as .csv files. All of them can be connected through a fact table (which is also a .csv file). I want to do some data cleansing on these files and then load them into one big table in Hive. In Apache Pig, should I create a separate script for each table, or is it better to join the tables in Pig first and then apply the data cleansing to the joined table? Thanks!

1 ACCEPTED SOLUTION

Re: Apache PIG - Script per Table to data cleansing

Hi @João Souza

Personally, I'd create one script per table. That way, if something changes, I can focus on the one affected table rather than modifying a larger script that covers all the tables. A single large script would also mean more code, and a steeper learning curve for another developer.
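
As a rough illustration of the one-script-per-table approach, a cleansing script for a single table might look like the sketch below. The file name, the column names, the `input` and `output` parameters, and the specific cleansing rules are all hypothetical, since the thread does not describe the actual tables.

```pig
-- customers_clean.pig (hypothetical name): cleans one table only.
-- Run with, for example:
--   pig -param input=/raw/customers.csv -param output=/clean/customers customers_clean.pig

-- Load the raw CSV; the schema here is an assumed example.
raw = LOAD '$input' USING PigStorage(',')
      AS (customer_id:chararray, name:chararray, country:chararray);

-- Drop rows that have no usable key.
with_key = FILTER raw BY customer_id IS NOT NULL AND TRIM(customer_id) != '';

-- Normalize whitespace and letter case.
trimmed = FOREACH with_key GENERATE
              TRIM(customer_id)    AS customer_id,
              TRIM(name)           AS name,
              UPPER(TRIM(country)) AS country;

-- Remove exact duplicate rows.
deduped = DISTINCT trimmed;

STORE deduped INTO '$output' USING PigStorage(',');
```

Each of the other tables would get its own similar script, and the join into the big Hive table would happen afterwards on the cleaned outputs.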
