@Miro Ka
The issue I can see in your flow is that the QueryDatabaseTable processor is running on all nodes, but it is supposed to run only on the primary node. When it runs on all nodes, each node pulls the same data from the source table, which results in duplicate data once it is stored in HDFS.
To fix this, change Execution to Primary node in the Scheduling tab of the processor configuration.

Once you have changed it to Primary node, the processor will show a small P (as shown in the screenshot below), which indicates that the processor is running on the primary node only.

After making this change, run the flow again and you should no longer get any duplicate data.
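
If it helps to see why the execution setting matters, here is a rough sketch in plain Python. It is only a toy model, not NiFi code: the table contents, node names and the `run_query` helper are all made up for illustration. It assumes that every node which runs the processor queries the source table independently of the others.

```python
from typing import Dict, List

# Hypothetical source table with three rows (illustration only).
SOURCE_TABLE: List[Dict] = [{"id": 1}, {"id": 2}, {"id": 3}]


def run_query(nodes: List[str], execution: str, primary: str) -> List[Dict]:
    """Return the rows that would land in HDFS after one scheduled run.

    execution is "all" (every node runs the query) or "primary"
    (only the primary node runs it). Both names are assumptions of this sketch.
    """
    active = [primary] if execution == "primary" else nodes
    landed: List[Dict] = []
    for node in active:
        # Each active node fetches the full table independently of the others.
        for row in SOURCE_TABLE:
            landed.append({"node": node, **row})
    return landed


nodes = ["node1", "node2", "node3"]
all_rows = run_query(nodes, "all", primary="node1")
primary_rows = run_query(nodes, "primary", primary="node1")

print(len(all_rows))      # 9 rows: every source row appears once per node
print(len(primary_rows))  # 3 rows: every source row appears exactly once
```

With three nodes, the "all" run writes each source row three times, while the "primary" run writes it once, which matches the duplication you are seeing.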