
NiFi delete optimization

Explorer

Hello,

I have one big table in a source database that I have to sync to a destination.

First I built a flow with an incremental insert, and it works fine. Then, in another flow, I have an incremental delete by ID column.

I am trying to delete 3,179,958 records from the table. I have one table on the source and I have to do an incremental delete on the destination. This is my flow:

[Screenshot of the flow: LejlaKM_0-1644317370284.png]

 

I am sending the configuration of each processor separately (in plain SQL terms, the flow boils down to the sketch after this list):

  1. GenerateTableFetch: [screenshot LejlaKM_1-1644314568905.png]
  2. ExecuteSQL: [screenshot LejlaKM_2-1644314619744.png]
  3. PutDatabaseRecord (delete): [screenshot LejlaKM_3-1644314687573.png]
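
In plain SQL terms, and with placeholder table and column names (the real names differ, and this assumes the records carry only a single ID column), the flow amounts to roughly this:

    -- GenerateTableFetch + ExecuteSQL: fetch the IDs to delete from the source
    SELECT ID FROM SOURCE_TABLE;

    -- PutDatabaseRecord (Statement Type = DELETE): one parameterized delete per
    -- record on the destination, with the value bound from the fetched ID
    DELETE FROM DESTINATION_TABLE WHERE ID = ?;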

 

When there are far fewer records it all works fine, but now the flow can't finish... Is there any other solution?

 

Thanks in advance


5 REPLIES

Master Collaborator

What's the underlying database? Could you please share the full configuration of the PutDatabaseRecord processor?

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

Hello,

 

It is an Oracle database.

[Screenshot: LejlaKM_0-1644390603316.png]

 

Master Collaborator
Can you please provide the full configuration of PutDatabaseRecord?
--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Master Collaborator

Hi, @LejlaKM ,

 

Here are some guidelines that should help with performance:

  • On the database side, ensure that both tables (source and target) have a primary key enforced by an index, to guarantee uniqueness and delete performance (see the sketch after this list)
  • GenerateTableFetch
    • Columns to Return: specify only the primary key column (if the primary key is a composite key, specify the list of key columns separated by commas)
    • Maximum-value Columns: if the primary key increases monotonically, list the primary key column in this property too. If not, look for an alternative column that increases monotonically to specify here. Without this, performance can be terrible.
    • Column for Value Partitioning: same as for "Maximum-value Columns"
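
As a rough sketch of what these settings translate to (table and column names here are hypothetical, and the exact queries GenerateTableFetch produces will differ):

    -- Database side: in Oracle, a primary key constraint is backed by a
    -- unique index, which is what the per-record delete needs
    ALTER TABLE SRC_TABLE ADD CONSTRAINT SRC_TABLE_PK PRIMARY KEY (ID);
    ALTER TABLE DST_TABLE ADD CONSTRAINT DST_TABLE_PK PRIMARY KEY (ID);

    -- With Columns to Return = ID and ID used for Maximum-value Columns /
    -- Column for Value Partitioning, GenerateTableFetch emits bounded range
    -- queries instead of full-table scans, e.g.:
    SELECT ID FROM SRC_TABLE WHERE ID > 1000000 AND ID <= 1010000;

    -- And because each record now contains only the key, PutDatabaseRecord
    -- (Statement Type = DELETE) issues a simple indexed delete per record:
    DELETE FROM DST_TABLE WHERE ID = ?;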

Please let me know if this helps.

 

Regards,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

Hi,

 

Thank you very, very much!

The delete for 3,179,958 records now completes in 6 minutes.
