Help me proceed - which way to go?

Explorer

Hello everyone, 

 

I need your help because I do not know how to proceed. 

 

Right now I have a PostgreSQL database with the following table:

 

Table "domains":

domain, source, timestamp
domainA, yourdomains, 128989372
domainB, yourdomaisn, 128923892
domainA, cyberclub, 13934829
domainD, cyberclub, 184994420
domainA, securityTeam, 118382938

 

 

My goal is to make some comparisons and alter the table. The most important one is to check the "domain" column for duplicates (for example, the first, third and last rows all contain domainA) and compare their timestamps. Within each group of duplicates, the row with the lowest timestamp gets the number 1 in a new column, the row with the next lowest timestamp gets 2, and so on. At the end I want to see how many 1s each source has.
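The ranking described above maps onto a SQL window function, so here is a minimal sketch of that idea. It uses Python's built-in sqlite3 module with an in-memory database only so that it runs without a server; ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) also exists in PostgreSQL. The column name rank_in_domain and the variable names are made up for illustration, and the sample rows assume that "yourdomaisn" in the second row is a typo for "yourdomains".

```python
import sqlite3

# Sample rows from the question. Assumption: "yourdomaisn" in the second
# row is a typo for "yourdomains".
ROWS = [
    ("domainA", "yourdomains", 128989372),
    ("domainB", "yourdomains", 128923892),
    ("domainA", "cyberclub", 13934829),
    ("domainD", "cyberclub", 184994420),
    ("domainA", "securityTeam", 118382938),
]

# In-memory SQLite stands in for PostgreSQL here.
# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE domains (domain TEXT, source TEXT, timestamp INTEGER)")
conn.executemany("INSERT INTO domains VALUES (?, ?, ?)", ROWS)

# Number the rows of each domain by ascending timestamp (1 = lowest).
# "rank_in_domain" is a hypothetical column name.
RANKED = """
    SELECT domain, source, timestamp,
           ROW_NUMBER() OVER (PARTITION BY domain ORDER BY timestamp) AS rank_in_domain
    FROM domains
"""

ranked = conn.execute(RANKED + " ORDER BY domain, rank_in_domain").fetchall()

# Count how many rank-1 rows each source has; sources with none show up as 0.
firsts = dict(conn.execute(
    "SELECT source, SUM(CASE WHEN rank_in_domain = 1 THEN 1 ELSE 0 END) "
    "FROM (" + RANKED + ") AS r GROUP BY source"
).fetchall())

for row in ranked:
    print(row)
print(firsts)
```

If this fits the use case, the same two queries could probably be run directly against the existing PostgreSQL table, which would avoid introducing Flink or Spark for a table of this shape.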

 

 

Which tool should I use? Apache Flink and Spark were recommended to me. Or should I use another SQL tool, or plain SQL with scripts? I am grateful for every tip!

 

Best regards 

 

Maurice 

 
1 REPLY

Expert Contributor

This looks like a programming question. You can try a HashMap that uses the domain value as the key, then make your comparison and update the count.

 

Here is one example: https://stackoverflow.com/questions/46169229/how-to-count-the-number-of-unique-values-in-hashmap/461...
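To make the map-based suggestion concrete, here is a rough sketch in Python, where a dict plays the role of the HashMap. It groups rows by domain, ranks each group by timestamp, and counts the rank-1 rows per source. The function names rank_by_domain and count_firsts are made up for illustration, and the sample rows assume that "yourdomaisn" in the question is a typo for "yourdomains".

```python
from collections import Counter, defaultdict

# Sample rows from the question. Assumption: "yourdomaisn" is a typo
# for "yourdomains".
ROWS = [
    ("domainA", "yourdomains", 128989372),
    ("domainB", "yourdomains", 128923892),
    ("domainA", "cyberclub", 13934829),
    ("domainD", "cyberclub", 184994420),
    ("domainA", "securityTeam", 118382938),
]


def rank_by_domain(rows):
    """Return (domain, source, timestamp, rank) tuples, rank 1 = lowest timestamp."""
    # The dict is keyed by domain, like the HashMap in the suggestion.
    groups = defaultdict(list)
    for domain, source, timestamp in rows:
        groups[domain].append((timestamp, source))

    ranked = []
    for domain, entries in groups.items():
        # Sorting the (timestamp, source) pairs orders each group by timestamp.
        for rank, (timestamp, source) in enumerate(sorted(entries), start=1):
            ranked.append((domain, source, timestamp, rank))
    return ranked


def count_firsts(ranked):
    """Count how many rank-1 rows each source has."""
    return Counter(source for _, source, _, rank in ranked if rank == 1)


ranked = rank_by_domain(ROWS)
firsts = count_firsts(ranked)
print(ranked)
print(firsts)
```

This keeps everything in memory, so it is only a reasonable fit while the table is small enough to load into a single process.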