03-19-2018 07:58 AM
Hey Vincent,

Your script should be responsible for building those 4 KB events. You could allocate a 4 KB buffer and place each event inside it (see the sketch below). This is far from optimal, though: (1) if the event is small you waste space, and (2) you can't have events larger than 4 KB without more complex logic (splitting before sending and joining afterwards).

I said that synchronising batch generation (managed by your Python script) with batch ingestion (handled by the ExecuteProcess processor in NiFi) is tricky because each belongs to a different process with its own internal timer triggering those timed batches. Synchronising the two processes would need additional artefacts that make things even more complicated. Definitely don't go this way; I only mention it as an argument for applying your "workaround", which, to me, is the way to go.

Regarding data size, NiFi is optimised to handle data of any kind, from small to large pieces, as outlined in the documentation. I highly recommend you go over our documentation to understand how data is managed internally; you'll see there are smart design decisions around handling content and the metadata linked to it 🙂

Kafka makes sense when you want to avoid overwhelming a destination system that can't process data as fast as it is generated, for example. If you think Postgres won't cope with the ingestion pace (you've probably already realised that NiFi is very performant), then add Kafka to the equation (see the second sketch below). Anyway, keep in mind that NiFi's queues and back-pressure capabilities could be enough for your project on their own.

Best 😉
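A minimal sketch of that fixed-buffer idea, assuming your events arrive as bytes; the names RECORD_SIZE, to_fixed_record, and the sample payload are hypothetical, and oversized events are rejected rather than split:

```python
import sys

RECORD_SIZE = 4096  # the fixed 4 KB record size discussed above

def to_fixed_record(event: bytes) -> bytes:
    """Pad one event to exactly RECORD_SIZE bytes; the reader must strip the padding."""
    if len(event) > RECORD_SIZE:
        # Splitting before sending and joining afterwards is the "more complex logic"
        # mentioned above; this sketch simply refuses such events.
        raise ValueError(f"event exceeds {RECORD_SIZE} bytes")
    return event.ljust(RECORD_SIZE, b"\x00")

# Example: emit one fixed-size record on stdout for NiFi's ExecuteProcess to pick up.
sys.stdout.buffer.write(to_fixed_record(b'{"id": 1, "msg": "hello"}'))
```

Note how this makes the two drawbacks concrete: a small event wastes almost the whole 4 KB in NUL padding, and anything above 4 KB simply cannot be represented as one record.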
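And a minimal sketch of putting Kafka in front of Postgres, assuming the kafka-python client; the broker address and the "events" topic name are placeholders:

```python
from kafka import KafkaProducer

# The producer buffers and batches internally, so the generating script
# never has to wait for Postgres: a NiFi ConsumeKafka flow (or any other
# consumer) drains the topic at whatever pace the database can sustain.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b'{"id": 1, "msg": "hello"}')
producer.flush()  # block until buffered records are actually delivered
producer.close()
```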