Created 09-23-2016 10:52 AM
Hi,
Just wondering what the cluster topology should look like for Kafka alongside Hadoop. I presume Kafka brokers shouldn't be co-located with DataNodes. Instead, they should probably be installed on nodes outside the Hadoop cluster (perhaps gateway / edge nodes), since Kafka serves as the landing area and the data is eventually pushed to one of the Hadoop storage engines.
Am I correct in thinking this way? Please validate my understanding.
Created 09-23-2016 12:36 PM
This is a great guide to what gets installed where on HDP: https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....
You will notice that Kafka should be installed within the cluster and is best dedicated to its own nodes.
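If it helps to sanity-check a planned layout against that advice, here is a minimal sketch. The host names, the role labels and the `shared_kafka_hosts` helper are all made up for illustration; they do not come from Ambari or the linked guide. It simply flags any host where a Kafka broker would share the node with another component.

```python
# Hypothetical host-to-component layout; host names and role labels are
# illustrative assumptions, not output from Ambari or any real cluster.
layout = {
    "master1": {"NameNode", "ResourceManager"},
    "worker1": {"DataNode", "NodeManager"},
    "worker2": {"DataNode", "NodeManager", "KafkaBroker"},
    "kafka1": {"KafkaBroker"},
    "kafka2": {"KafkaBroker"},
}


def shared_kafka_hosts(layout, broker_role="KafkaBroker"):
    """Return sorted host names where a broker shares the node with any other component."""
    return sorted(
        host
        for host, roles in layout.items()
        if broker_role in roles and len(roles) > 1
    )


print(shared_kafka_hosts(layout))
```

Any host it prints is a candidate for moving the broker onto a dedicated node inside the cluster.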
As a side note, Hortonworks DataFlow (HDF) is a separate distribution/product provided by Hortonworks. It packages Kafka along with NiFi, Storm and Ambari, and excels at acquiring, inspecting, routing, transforming and analyzing data in motion from a diverse range of sources (from sensors to databases), which typically ends up in Hadoop. Exciting technology and a lot to talk about ... check it out: http://hortonworks.com/products/data-center/hdf/