can kafka multiple nodes work by itself without any cluster?

Contributor

My project needs to stream data from Kafka to MongoDB, so we set up an HDP cluster with multiple Kafka nodes in it.

Now I think we could process all the data using Kafka and ZooKeeper alone, with no need for the HDP cluster. Can anyone tell me whether Kafka streaming can work on its own, outside such a cluster?

If yes, is there anything I need to be aware of?

Accepted Solutions

Re: can kafka multiple nodes work by itself without any cluster?

Cloudera Employee

Hi @Robin Dong

Yes, you can definitely install Kafka by itself as a cluster (it still needs ZooKeeper). You can check this Kafka Multi Broker Doc as a reference.
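
To give a rough idea of what a standalone multi-broker setup involves, here is a minimal sketch in Python that generates a `server.properties` file body for each broker. The hostnames, the data directory, and the helper name `broker_properties` are made up for illustration. The keys `broker.id`, `listeners`, `log.dirs`, and `zookeeper.connect` are the usual broker settings for a ZooKeeper-based Kafka install, but check them against the docs for your Kafka version.

```python
def broker_properties(broker_id, host, zookeeper_hosts,
                      port=9092, log_dir="/var/lib/kafka/data"):
    """Build the server.properties text for one broker.

    All values here (hosts, port, log_dir) are illustrative assumptions.
    """
    # Every broker points at the same ZooKeeper ensemble (default client port 2181).
    zk_connect = ",".join(f"{zk}:2181" for zk in zookeeper_hosts)
    settings = {
        "broker.id": broker_id,  # must be unique per broker
        "listeners": f"PLAINTEXT://{host}:{port}",
        "log.dirs": log_dir,
        "zookeeper.connect": zk_connect,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Hypothetical three-node layout: each node runs one broker and one ZooKeeper server.
HOSTS = ["kafka1.example.com", "kafka2.example.com", "kafka3.example.com"]

configs = {
    host: broker_properties(broker_id, host, HOSTS)
    for broker_id, host in enumerate(HOSTS, start=1)
}

if __name__ == "__main__":
    for host, text in configs.items():
        print(f"# server.properties for {host}")
        print(text)
```

The only things that differ between brokers are `broker.id` and the listener address. The shared `zookeeper.connect` string is what ties the brokers together into one cluster, with no HDP involved.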

As for the HDP cluster, I think you have misunderstood what a Hortonworks Data Platform (HDP) cluster is.

Kafka is already a cluster, and ZooKeeper also works as a cluster.
HDP is a Hadoop distribution, and it uses Ambari to help you manage the different components in your cluster from a single page.
HDP is also highly customizable: you can choose to install only Kafka and ZooKeeper when you install the cluster. Installing those components through Ambari is very convenient.
So although you can install Kafka and ZooKeeper manually, I suggest you install them with HDP, because it is quite easy and it automatically integrates Kafka and ZooKeeper for you. With the Ambari views, you can also see many different Kafka and ZooKeeper metrics, which help you check the health of your cluster.

If your emphasis is more on data streaming, I suggest you try Hortonworks DataFlow (HDF), because its main components are Kafka, Storm, ZooKeeper, and NiFi. You can also tailor HDF yourself.

Cheers,

Re: can kafka multiple nodes work by itself without any cluster?

Contributor

Thank you very much, Wang, for confirming that ZooKeeper and Kafka can work alone.

Yes, with HDP, Kafka and ZooKeeper are better administered and monitored. I did set up a Kafka cluster and MongoDB with HDP, and the steps seemed very easy.

However, I am trying to save some money for our company, which is why I came up with this question. Thank you for confirming it.

Re: can kafka multiple nodes work by itself without any cluster?

Cloudera Employee

@Robin Dong

HDP is also free, and the Ambari agent doesn't consume many resources.

Feel free to use it.

Re: can kafka multiple nodes work by itself without any cluster?

Contributor

Yes, you are right, HDP is license-free. However, the cluster install needs master and slave infrastructure, so HDFS with its data nodes and name node, YARN, and some mandatory components such as Hive, Pig, and Tez all need to be installed and maintained.

All of these cost a lot on AWS/EC2.

On the other hand, without an HDP cluster, monitoring, upgrades, version compatibility, and security for Kafka, Spark, and ZooKeeper may cause issues in the long run, and the admin has to deal with them.

I am looking for use cases of a ZooKeeper/Kafka cluster with Spark to compare, to see whether it is worth doing.

What do you think?

Re: can kafka multiple nodes work by itself without any cluster?

Cloudera Employee

@Robin Dong

That is why I suggest you use HDF, where you can install only ZooKeeper and Kafka.

Re: can kafka multiple nodes work by itself without any cluster?

Contributor

Thank you so much.
