I am just curious: in your case, how is NiFi listening for the incoming transactional data?
If your requirement is to validate the incoming data before storing it in HDFS, I am wondering why you need Kafka. Is the ValidateRecord processor in NiFi not sufficient for this task?
Hello @ramarov,
Of course I cannot speak to your specific situation, but in these architectures Kafka is typically used as a buffer.
You use NiFi to move the data, but before you start building complicated stream processing you want the ability to easily buffer and replay messages (for instance after something fails, or simply after updating your analytics logic).
This is why NiFi typically pushes the messages to Kafka, where they can be read once or multiple times by an engine like Spark Streaming.
(You mention validation, but that is not my best guess for why the data is moved through Kafka.)
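
To make the buffer-and-replay idea a bit more concrete, here is a minimal sketch in plain Python. It is only a toy model, not the Kafka API: `ToyLog`, `read_from`, and `count_large` are hypothetical names made up for illustration, and the list simply stands in for a topic partition that keeps messages after they have been read.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Hypothetical stand-in for a single topic partition: an append-only list."""
    records: list = field(default_factory=list)

    def append(self, record):
        # Records are never removed on read; the index acts as the offset.
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset):
        # A consumer can start at any earlier offset, which is what makes replay possible.
        return self.records[offset:]


def count_large(records, threshold):
    # Stand-in for the "analytics logic" that might change later.
    return sum(1 for r in records if r["amount"] > threshold)


# The producer side (NiFi in the architecture above) pushes messages into the buffer.
log = ToyLog()
for amount in (5, 50, 500):
    log.append({"amount": amount})

# First run of the streaming job.
first_pass = count_large(log.read_from(0), threshold=100)

# The logic is updated, so the job rewinds to offset 0 and reads the same messages again.
second_pass = count_large(log.read_from(0), threshold=10)

print(first_pass, second_pass)
```

The point of the sketch is that reading does not consume the data, so the producer does not have to resend anything when the downstream job fails or changes. In a real setup the equivalent step would be moving the consumer back to an earlier offset rather than slicing a list.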