Created 06-19-2023 03:06 AM
Hi All
@mattclarke, @mattburgess, @markpayne.
We are doing a Postgres-to-Postgres data migration using NiFi: we fetch data from a source PostgreSQL database and write it into a destination PostgreSQL database. While transferring records from source to destination, we were getting duplicate key errors and only about half of the rows were transferred (for example, out of 300 million records only 150 million made it across before the duplicate key error appeared). After researching the issue, we found this was because the sequence was out of sync, so, as suggested by the online community, we set the column name in the Maximum-value Columns property of the GenerateTableFetch processor (it automatically orders by that column in ascending order and enables incremental fetching of records). After this change we no longer get the duplicate key error, and NiFi successfully transferred all 300 million records.

However, in production we have around 2k tables. How do we achieve this across all of the tables? We also need more throughput in terms of the size of the data transferred (at least 1 GB per minute). Please suggest good practices for solving this at the organisational level.
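For reference, this is roughly how we are thinking of enumerating a maximum-value column for every table, so that each table's GenerateTableFetch configuration can be driven from metadata rather than set by hand. It is only a minimal sketch, assuming each table has a single-column integer primary key and that psycopg2 is available; the connection details below are placeholders, not our real environment.

```python
# Minimal sketch: list every table's primary-key column so it could be used
# as the Maximum-value Column for that table's incremental fetch.
# Assumes psycopg2 is installed and each table has a single-column integer PK.
import psycopg2

# Placeholder connection details -- replace with the real source database.
conn = psycopg2.connect(
    host="source-db-host", dbname="sourcedb", user="nifi", password="secret"
)

PK_QUERY = """
SELECT tc.table_schema,
       tc.table_name,
       kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name
 AND tc.table_schema = kcu.table_schema
WHERE tc.constraint_type = 'PRIMARY KEY'
ORDER BY tc.table_schema, tc.table_name;
"""

with conn, conn.cursor() as cur:
    cur.execute(PK_QUERY)
    for schema, table, pk_column in cur.fetchall():
        # Each row is a candidate Maximum-value Column for that table.
        print(f"{schema}.{table} -> {pk_column}")

conn.close()
```

We are also aware that ListDatabaseTables can feed GenerateTableFetch (whose table name and related properties accept incoming attributes), which might avoid building a separate flow per table, but we are not sure whether that pattern scales to 2k tables at the throughput we need.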
This is the flow we implemented: https://drive.google.com/file/d/1kGYe-H3Qpd5z3LBp7N31Zc1P7REDuoqa/view?usp=drivesdk
Thanks in advance.
Created 06-19-2023 11:56 PM
@iso8583, Welcome to our community! To help you get the best possible answer, I have tagged in our NiFi experts @MattWho @mburgess @SAMSAL who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur
Created 06-20-2023 12:16 AM
@cotopaul, tagging myself because I am struggling with a similar issue and was not quite able to figure it out myself ... and maybe I will get some hints from some of the answers.