Where to install Kafka?

Rising Star

Hi,

Just wondering what the cluster topology should look like for Kafka alongside Hadoop. I presume Kafka brokers shouldn't be co-located with data nodes. Instead, they should probably be installed on nodes outside the Hadoop cluster (probably gateway / edge nodes), since Kafka serves as the landing area and the data is eventually pushed to one of the Hadoop storage engines.

Am I correct in thinking this way? Please validate my understanding.

1 ACCEPTED SOLUTION

Guru

This is a great guide to what gets installed where on HDP: https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....

You will notice that Kafka should be installed within the cluster and is best dedicated to its own nodes.
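
To make the "own nodes" advice concrete, here is a minimal sketch of how you might sanity-check a planned layout before installing. The host names, role labels and the `shared_broker_hosts` helper are all made up for illustration; they are not taken from the guide or from any Ambari API.

```python
# Hypothetical host-to-role plan. Host names and role labels are placeholders
# chosen for this example, not values from the linked guide.
layout = {
    "master1.example.com": {"NameNode", "ResourceManager"},
    "worker1.example.com": {"DataNode", "NodeManager"},
    "worker2.example.com": {"DataNode", "NodeManager"},
    "kafka1.example.com": {"KafkaBroker"},
    "kafka2.example.com": {"KafkaBroker"},
}


def shared_broker_hosts(layout):
    """Return the hosts where a Kafka broker shares the node with any other role."""
    return sorted(
        host
        for host, roles in layout.items()
        if "KafkaBroker" in roles and len(roles) > 1
    )


# An empty list means every broker in the plan sits on a dedicated node.
print(shared_broker_hosts(layout))
```

If the function returns any hosts, those are the brokers that would compete for disk and memory with other services on the same machine.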

As a side note, Hortonworks DataFlow (HDF) is a separate distribution/product provided by Hortonworks. It packages Kafka along with NiFi, Storm and Ambari, and excels at acquiring, inspecting, routing, transforming and analyzing data in motion from a diverse range of sources (from sensors to databases), which is typically delivered to Hadoop. It's exciting technology with a lot to talk about; check it out: http://hortonworks.com/products/data-center/hdf/

