<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Help me proceed - which way to go? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Help-me-proceed-which-way-to-go/m-p/313524#M225573</link>
    <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;I need your help because I do not know how to proceed.&lt;/P&gt;&lt;P&gt;I have a PostgreSQL database with the following table, named domains:&lt;/P&gt;&lt;P&gt;domain, source, timestamp&lt;/P&gt;&lt;P&gt;domainA, yourdomains, 128989372&lt;/P&gt;&lt;P&gt;domainB, yourdomains, 128923892&lt;/P&gt;&lt;P&gt;domainA, cyberclub, 13934829&lt;/P&gt;&lt;P&gt;domainD, cyberclub, 184994420&lt;/P&gt;&lt;P&gt;domainA, securityTeam, 118382938&lt;/P&gt;&lt;P&gt;My goal is to run some comparisons and add a column to the table. The most important step is to find the rows that share the same value in the "domain" column (such as the first, third, and last rows above) and compare their timestamps. Within each group of duplicates, the row with the lowest timestamp gets the value 1 in a new column, the row with the next lowest timestamp gets 2, and so on. In the end I want to see how many 1s each source has.&lt;/P&gt;&lt;P&gt;Which tool should I use? Apache Flink and Spark were recommended to me. Or should I use another SQL tool, or plain SQL with scripts? I am grateful for any tips!&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;Maurice&lt;/P&gt;</description>
    <pubDate>Sun, 21 Mar 2021 20:07:09 GMT</pubDate>
    <dc:creator>Mandrill</dc:creator>
    <dc:date>2021-03-21T20:07:09Z</dc:date>
  </channel>
</rss>

