Kafka message validation


Hi Guys,

In my project I need to validate Kafka messages. By that I mean I want to validate the quality of the data in each Kafka message. For example, log messages are the source data for my analytics application, and I need to know whether each message I receive is exactly as it was sent through Kafka. Is there any tool or way to validate this?


Cloudera Employee

Hi Mahendiran -

Thanks for posting!

If I can restate your question, you're asking for a way to track where your data came from and how it got there. Is that correct? If so, this is called 'data lineage', which is a useful term to search for. There are tools on top of the HDP / HDF stack that support lineage.

If you were to diagram your flow:

Data producer (x) -> Kafka Server Topic (y) -> Analytics Application (z).

In this example, you're trying to verify that data that's in your analytics application (z) comes from Kafka (y)?

If not, can you correct my current understanding?
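
In case the question is instead whether a message reaches (z) unchanged from what (x) sent, here is a minimal sketch of one common approach, not a specific tool: the producer attaches a SHA-256 digest of the payload as a Kafka record header, and the consumer recomputes and compares it. The header name `sha256` and both function names are made up for illustration, and the headers are modelled as a list of `(str, bytes)` pairs, which is an assumption about how your client library exposes them.

```python
import hashlib
import hmac
from typing import List, Optional, Tuple

# Hypothetical header name; any name agreed between (x) and (z) works.
CHECKSUM_HEADER = "sha256"

Headers = List[Tuple[str, bytes]]


def with_checksum(payload: bytes) -> Headers:
    """Producer side (x): build headers carrying a digest of the payload."""
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    return [(CHECKSUM_HEADER, digest)]


def is_intact(payload: bytes, headers: Optional[Headers]) -> bool:
    """Consumer side (z): recompute the digest and compare with the header."""
    expected = dict(headers or []).get(CHECKSUM_HEADER)
    if expected is None:
        # No checksum header means the message cannot be verified.
        return False
    actual = hashlib.sha256(payload).hexdigest().encode("ascii")
    return hmac.compare_digest(actual, expected)


if __name__ == "__main__":
    payload = b"2017-01-01 12:00:00 INFO service started"
    headers = with_checksum(payload)           # send payload + headers to (y)
    print(is_intact(payload, headers))         # unchanged message
    print(is_intact(payload + b"x", headers))  # altered message
```

A plain hash like this only detects accidental corruption or truncation. If you also need to detect deliberate changes, a keyed HMAC with a secret shared between (x) and (z) would be the usual substitute.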

