Created 07-31-2017 02:26 PM
Hello,
What are the best practices for installing Kafka/NiFi?
1/ Should Kafka brokers be located on the same nodes as the data nodes, or should they be on separate nodes? Which way is better in terms of performance? Is it possible to have Kafka on a data node when Kafka is installed using HDF?
2/ Can Kafka and NiFi share the same ZooKeeper, or should Kafka have its own ZK used exclusively by Kafka?
3/ Does the installation of NiFi by HDF (Ambari) apply the needed system requirements, such as max file handles and max forked processes, or should these requirements be set before proceeding with the installation by Ambari?
4/ Is it possible to have a node that belongs to both an HDF and an HDP cluster at the same time, with the same Ambari agent running on the node?
Thanks in advance!
Created 07-31-2017 03:02 PM
Please see replies inline below:
1/ Should Kafka brokers be located on the same nodes as the data nodes, or should they be on separate nodes? Which way is better in terms of performance? Is it possible to have Kafka on a data node when Kafka is installed using HDF?
Is this for production? Before answering, I would suggest you engage someone from your local Hortonworks account team to help you answer these questions.
Depending on your data ingest volume, you might need dedicated Kafka servers. In other cases you may co-locate Kafka on data nodes (this rarely happens in production unless it's something very small). Even when you co-locate Kafka on data nodes, make sure you give it dedicated disks and its own ZooKeeper. Kafka must have its own ZooKeeper. ZooKeeper should also have its own disks. They don't need to be large-capacity disks, but they should be its own.
2/ Can Kafka and NiFi share the same ZooKeeper, or should Kafka have its own ZK used exclusively by Kafka?
Ideally you don't want ZooKeeper to be shared. Kafka should get its own ZooKeeper. That being said, in my personal opinion, sharing ZooKeeper with NiFi will be okay. Just don't add any component beyond these two to the ZooKeeper dedicated to Kafka.
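If you do end up sharing one ensemble between Kafka and NiFi, one common way to keep their znodes apart is to give Kafka a chroot path in its `zookeeper.connect` setting. The snippet below is only a sketch of how that connect string is put together. The host names, the `/kafka` chroot, and the helper name `kafka_zookeeper_connect` are all made up for illustration, not something from this thread.

```python
def kafka_zookeeper_connect(hosts, chroot="/kafka", port=2181):
    """Build a Kafka zookeeper.connect value of the form host:port,host:port/chroot.

    hosts, chroot and port are illustrative assumptions; use your own values.
    """
    if not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    servers = ",".join(f"{host}:{port}" for host in hosts)
    # The chroot is appended once, after the last host:port pair.
    return servers + chroot.rstrip("/")


if __name__ == "__main__":
    # Hypothetical three-node ensemble shared with NiFi.
    print(kafka_zookeeper_connect(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```

The resulting value would go into the Kafka broker configuration (through Ambari if Ambari manages Kafka), so Kafka keeps everything under its own path instead of the ZooKeeper root.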
3/ Does the installation of NiFi by HDF (Ambari) apply the needed system requirements, such as max file handles and max forked processes, or should these requirements be set before proceeding with the installation by Ambari?
No. When Ambari is managing NiFi, it enables you to configure NiFi. It is not going to make OS-level changes. Imagine making OS-level changes from NiFi that affect everything else on that server. You don't want that. So set these limits yourself before installing.
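Since these OS settings have to be handled outside Ambari, a quick check before installing can help. The script below is only a sketch: the thresholds (50000 open files, 10000 processes) are assumptions based on the values usually suggested in the NiFi Administration Guide, and `meets` and `check_limits` are hypothetical helper names. Adjust the numbers to whatever your environment needs.

```python
import resource

# Assumed targets (not from this thread); tune for your environment.
RECOMMENDED = {"nofile": 50000, "nproc": 10000}


def meets(limit, target):
    """Return True if a limit satisfies the target; RLIM_INFINITY means unlimited."""
    return limit == resource.RLIM_INFINITY or limit >= target


def check_limits(recommended=RECOMMENDED):
    """Compare the current soft limits of this process against the targets."""
    current = {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE)[0],  # max open files
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC)[0],    # max processes
    }
    return {
        name: (current[name], meets(current[name], target))
        for name, target in recommended.items()
    }


if __name__ == "__main__":
    for name, (value, ok) in check_limits().items():
        status = "ok" if ok else "too low"
        print(f"{name}: soft limit {value} ({status}, target {RECOMMENDED[name]})")
```

Run it as the same OS user that will run NiFi, since limits are per user. If a value comes back too low, it would typically be raised in `/etc/security/limits.conf` (or the equivalent for your distribution) before the Ambari install.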
4/ Is it possible to have a node that belongs to both an HDF and an HDP cluster at the same time, with the same Ambari agent running on the node?
Two things here. The new version of Ambari manages both HDP and HDF. And yes, you can install HDF services on an HDP cluster. Please see the following link.
Created 08-02-2017 10:12 AM
Thank you for your answer. This gives me more insight!
It's not for a production setup 🙂