<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Apache PIG - Script per Table to data cleansing in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Apache-PIG-Script-per-Table-to-data-cleansing/m-p/168835#M131153</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2985/antonio-scp125.html" nodeid="2985"&gt;@João Souza&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Personally, I'd create a separate script for each table. That way, if something changes, I can focus on the one affected table rather than modifying a larger script that covers all the tables. A single large script would also mean more code, and a steeper learning curve for another developer.&lt;/P&gt;</description>
    <pubDate>Tue, 09 Aug 2016 02:20:15 GMT</pubDate>
    <dc:creator>RyanCicak</dc:creator>
    <dc:date>2016-08-09T02:20:15Z</dc:date>
    <item>
      <title>Apache PIG - Script per Table to data cleansing</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-PIG-Script-per-Table-to-data-cleansing/m-p/168834#M131152</link>
      <description>&lt;P&gt;Hi,

I have four tables in .csv format. All of them can be connected through a fact table (which is also a .csv file). I want to do some data cleansing on these files and then load them into one big table in Hive. In Apache Pig, should I create a separate script for each table, or is it better to join the tables in Pig first and then apply the data cleansing to the combined table?

Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 08 Aug 2016 22:37:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-PIG-Script-per-Table-to-data-cleansing/m-p/168834#M131152</guid>
      <dc:creator>prodgers125</dc:creator>
      <dc:date>2016-08-08T22:37:04Z</dc:date>
    </item>
    <item>
      <title>Re: Apache PIG - Script per Table to data cleansing</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-PIG-Script-per-Table-to-data-cleansing/m-p/168835#M131153</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2985/antonio-scp125.html" nodeid="2985"&gt;@João Souza&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Personally, I'd create a separate script for each table. That way, if something changes, I can focus on the one affected table rather than modifying a larger script that covers all the tables. A single large script would also mean more code, and a steeper learning curve for another developer.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 02:20:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-PIG-Script-per-Table-to-data-cleansing/m-p/168835#M131153</guid>
      <dc:creator>RyanCicak</dc:creator>
      <dc:date>2016-08-09T02:20:15Z</dc:date>
    </item>
  </channel>
</rss>

