Hi,
I have four tables in .csv. All of them can be conected through a fact table (that are in .csv too). I wanna to do some data cleansing to this files and next put them into a Big Table in Have. But in Apache PIG should I've to create a script by table individually, or is better to join in PIG and then aplly some data cleansing in this normalized table?
Thanks!