<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: ReplaceTextWithMapping processor. De-duplicate only specific columns in a file in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ReplaceTextWithMapping-processor-De-duplicate-only-specific/m-p/103859#M46252</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/13945/rbalakrishnantce.html" nodeid="13945"&gt;@bala krishnan &lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Not a solution, but just to let you know that with the next version of NiFi (coming soon) you will be able to use the ValidateCSV processor to achieve what you are looking for. In the meantime, I don't think that splitting the file is going to help. You could try something custom with the ExecuteScript processor, but that is probably not ideal.&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
    <pubDate>Wed, 16 Nov 2016 01:09:15 GMT</pubDate>
    <dc:creator>pvillard</dc:creator>
    <dc:date>2016-11-16T01:09:15Z</dc:date>
    <item>
      <title>ReplaceTextWithMapping processor. De-duplicate only specific columns in a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ReplaceTextWithMapping-processor-De-duplicate-only-specific/m-p/103858#M46251</link>
      <description>&lt;P&gt;I have a CSV file with 9 columns. How can I remove duplicates among columns 4 through 9?&lt;/P&gt;&lt;P&gt;What we tried:&lt;/P&gt;&lt;P&gt;1. Split columns 1-4 into one file&lt;/P&gt;&lt;P&gt;2. Split columns 4-9 into a second file -&amp;gt; deduplicate its records&lt;/P&gt;&lt;P&gt;Next, I tried using 'ReplaceTextWithMapping' to merge the two files on the 4th column (common to both files), but I am not sure whether this approach is right.&lt;/P&gt;&lt;P&gt;Is there any other way to achieve this?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Nov 2016 18:26:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/ReplaceTextWithMapping-processor-De-duplicate-only-specific/m-p/103858#M46251</guid>
      <dc:creator>rbalakrishnantc</dc:creator>
      <dc:date>2016-11-15T18:26:30Z</dc:date>
    </item>
    <item>
      <title>Re: ReplaceTextWithMapping processor. De-duplicate only specific columns in a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ReplaceTextWithMapping-processor-De-duplicate-only-specific/m-p/103859#M46252</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/13945/rbalakrishnantce.html" nodeid="13945"&gt;@bala krishnan &lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Not a solution, but just to let you know that with the next version of NiFi (coming soon) you will be able to use the ValidateCSV processor to achieve what you are looking for. In the meantime, I don't think that splitting the file is going to help. You could try something custom with the ExecuteScript processor, but that is probably not ideal.&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2016 01:09:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/ReplaceTextWithMapping-processor-De-duplicate-only-specific/m-p/103859#M46252</guid>
      <dc:creator>pvillard</dc:creator>
      <dc:date>2016-11-16T01:09:15Z</dc:date>
    </item>
  </channel>
</rss>

