
I know Flume collects data and Kafka publishes data. Can Kafka work alone to collect and publish data?



Our new project is going to collect internet search data and stream it into MongoDB in real time.

We are going to install Kafka, but we are not sure whether Flume is mandatory for reading/collecting the data from the internet. Can Kafka work alone to get the data?

On the MongoDB side, it seems the MongoDB sink will take care of writing the data from Kafka to MongoDB.
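For context, here is a rough sketch of what registering such a sink could look like. It assumes the sink in question is the official MongoDB Kafka sink connector running on a Kafka Connect worker, which the post does not confirm. The helper name `mongo_sink_request` and every argument value are hypothetical; the function only builds the JSON body that would be sent to the worker's `POST /connectors` endpoint.

```python
import json


def mongo_sink_request(name, topic, uri, database, collection):
    """Return the JSON body for POST /connectors on a Kafka Connect worker.

    Assumption: the official MongoDB Kafka sink connector is installed.
    All argument values are placeholders chosen by the caller.
    """
    body = {
        "name": name,
        "config": {
            # Connector class of the MongoDB Kafka sink connector.
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            # Kafka topic(s) to read from.
            "topics": topic,
            # Where to write the records in MongoDB.
            "connection.uri": uri,
            "database": database,
            "collection": collection,
        },
    }
    return json.dumps(body, indent=2)
```

With something like this in place, Kafka Connect would move records from the topic into the collection, so only the collection side (getting data into Kafka) remains to be decided.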

Does anyone have any ideas?

2 REPLIES

Re: I know Flume collects data and Kafka publishes data. Can Kafka work alone to collect and publish data?

Hi @Robin Dong, you may want to take a look at NiFi, which is often used in front of Kafka to simplify ingest from a variety of sources into many other platforms, including Kafka.

One advantage is that NiFi flows can be built by drag and drop, rather than having to do everything at the code level with Kafka.

As a starting point, take a look at https://hortonworks.com/webinar/apache-kafka-apache-nifi-better-together/

Hope that helps!

Re: I know Flume collects data and Kafka publishes data. Can Kafka work alone to collect and publish data?


Thank you so much Dave.

After some searching, it looks like we may have to use the Kafka producer API for this complex data streaming. However, your answer gives me a new idea about using NiFi with Kafka.
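For anyone reading later, a minimal sketch of the producer-API route might look like the following. It assumes the third-party `kafka-python` client; the broker address, the topic name, and the helper names `make_producer` and `publish_events` are made up for illustration. Fetching the search data from the internet is left out, since that part depends entirely on the source.

```python
import json


def make_producer(bootstrap_servers="localhost:9092"):
    """Build a producer that serializes dicts as JSON.

    Assumption: kafka-python is installed (pip install kafka-python)
    and a broker is reachable at the given (placeholder) address.
    """
    from kafka import KafkaProducer  # imported lazily; only needed here

    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )


def publish_events(producer, topic, events):
    """Send each event to `topic` and return the number of events sent."""
    sent = 0
    for event in events:
        producer.send(topic, value=event)  # asynchronous; records are buffered
        sent += 1
    producer.flush()  # block until buffered records have been sent
    return sent
```

Usage would be along the lines of `publish_events(make_producer(), "search-events", [{"query": "kafka vs flume"}])`. So Kafka alone can receive the data, but the code that collects it and calls the producer is something we would have to write and run ourselves.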

Thank you very much for taking the time to help.

Robin