
Can Kafka be used for events processing?

Super Collaborator

Hi guys,

I wanted to know how well suited Kafka is for capturing events at various stages of internal data processing. That information could be used for auditing or reporting. Suppose data consumption has started and I want to know the number of input records processed and the number of records loaded into Hive. There is also some enrichment going on in Hive, and I want to know how many records got enriched.

The plan is to load these events into HBase eventually. Message volumes would be very low at this point. I just want to decouple these tasks from the other framework jobs. Please let me know if you have ever come across the idea of using pub/sub in this kind of scenario.

4 REPLIES
Re: Can Kafka be used for events processing?

Rising Star

You can use the consumer group information (committed offsets) from Kafka to tell you how much data has been processed. This information is reliable enough to be used for reporting purposes.
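To make the offset idea concrete, here is a minimal sketch of how per-partition progress could be derived once the offsets are in hand. It assumes you have already fetched the starting, committed, and log-end offsets for a consumer group by some means (for example, from the output of the consumer group tooling); the function and field names are hypothetical, and offset differences only approximate record counts.

```python
from typing import Dict, NamedTuple


class PartitionProgress(NamedTuple):
    processed: int  # committed offset minus the offset the job started from
    lag: int        # log-end offset minus committed offset


def summarize_progress(
    start_offsets: Dict[int, int],
    committed_offsets: Dict[int, int],
    end_offsets: Dict[int, int],
) -> Dict[int, PartitionProgress]:
    """Derive processed counts and lag per partition from plain offset maps.

    All three arguments map partition id -> offset. How they are obtained
    is left to the caller (an assumption of this sketch).
    """
    progress = {}
    for partition, committed in committed_offsets.items():
        # Missing start offset: assume the job began at offset 0.
        start = start_offsets.get(partition, 0)
        # Missing end offset: assume the consumer is fully caught up.
        end = end_offsets.get(partition, committed)
        progress[partition] = PartitionProgress(
            processed=max(committed - start, 0),
            lag=max(end - committed, 0),
        )
    return progress


if __name__ == "__main__":
    report = summarize_progress(
        start_offsets={0: 100, 1: 0},
        committed_offsets={0: 150, 1: 40},
        end_offsets={0: 180, 1: 40},
    )
    for partition, p in sorted(report.items()):
        print(f"partition {partition}: processed={p.processed} lag={p.lag}")
```

Summing the `processed` values across partitions gives a rough total for a report.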

Please accept this answer if it helped you.

Re: Can Kafka be used for events processing?

Super Collaborator

Thanks @Ambud Sharma for your reply. The actual data will not be processed by or sent to the Kafka brokers; it will stay in Hive only. I want some specific information about each event, such as:

1. When did the job start?

2. Did it fail due to an exception?

3. If not, how many records were processed?

I want to send this information to Kafka. I could use simple log4j logging along with Splunk, but I would like to stay within the HDP stack. Please let me know what you think.
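As an illustration of what such an audit event might look like, here is a minimal sketch that packs the three items above (start time, failure, record count) into a JSON payload. The field names, status values, and job name are assumptions made for this example, not an HDP or Kafka convention; the resulting bytes could then be handed to whichever Kafka producer client you use as the message value.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class JobEvent:
    # Hypothetical schema for one audit event per job run.
    job_name: str
    status: str                      # "SUCCEEDED" or "FAILED"
    started_at: str                  # ISO-8601 timestamp in UTC
    records_processed: Optional[int] = None
    error: Optional[str] = None


def build_event(
    job_name: str,
    started_at: datetime,
    records_processed: Optional[int] = None,
    error: Optional[str] = None,
) -> bytes:
    """Serialize a job audit event to UTF-8 JSON bytes.

    If an error is given, the event is marked FAILED and the record
    count is omitted; otherwise it is marked SUCCEEDED.
    """
    failed = error is not None
    event = JobEvent(
        job_name=job_name,
        status="FAILED" if failed else "SUCCEEDED",
        started_at=started_at.astimezone(timezone.utc).isoformat(),
        records_processed=None if failed else records_processed,
        error=error,
    )
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    started = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
    # "hive_enrichment" is a made-up job name for illustration.
    print(build_event("hive_enrichment", started, records_processed=1200).decode())
    print(build_event("hive_enrichment", started, error="SemanticException").decode())
```

Keeping one small, self-describing message per job run fits the low message volume described above and leaves the job data itself in Hive.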

Re: Can Kafka be used for events processing?

Rising Star

You can use a log4j appender for Kafka: https://logging.apache.org/log4j/2.x/manual/appenders.html#KafkaAppender

Another option could be to use the Atlas Hive hook: http://atlas.apache.org/Bridge-Hive.html

Re: Can Kafka be used for events processing?

Super Collaborator

Hi @Rafael Coss, any comments on this?
