
HDF & HDP on the same nodes

New Contributor

Hello!

I found some interesting information:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_release-notes/content/deprecated_items.h...

Starting with the HDP 3.0 release, Kafka will be removed from HDP, so it will be supported only in HDF.

Our case:

Cluster1:

We have a Hadoop cluster on HDP.

Cluster2:

We also have a separate cluster of N nodes for Kafka.

On this separate cluster we install HDP with Kafka/YARN/Hadoop/Spark2/Oozie.

The second cluster is used as a data bus and streaming data processing cluster.

YARN and HDFS are installed only because Spark2 needs them; we do not store any data in this HDFS.

Oozie is used to configure, start, stop, and monitor jobs.

The reason we run the Spark2 streaming jobs on Cluster2 rather than Cluster1 is resources and stability: Cluster2 (the Kafka cluster) has a lot of free memory and unused CPUs, while Cluster1 has no free resources and is always at 100% utilization.

In our case the streaming jobs should be as stable as possible, but we do not want to limit the resources of the main cluster (Cluster1).

Up to version 2.6 everything was OK.

But if Kafka is moved out of HDP, I can't see how to install HDF (Kafka) and HDP (HDFS+YARN+Spark2+Oozie) on the same nodes.

The possible solutions:

#1

- install HDP 3 (HDFS+Yarn+Spark2+Oozie)

- install standalone Kafka or another platform (e.g. Confluent Platform), though we would prefer to use the Hortonworks stack

#2

- install HDF 3 (Kafka)

- manual HDFS+YARN installation (a very complicated task)

#3

- install HDF 3 (Kafka)

- run the Spark streaming jobs not on the YARN cluster, but on Spark's own standalone cluster on the same nodes as Kafka

This is a very complicated task, because we would have to reconfigure and test more than 30 streaming jobs and rewrite our monitoring engine.
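For context, the reconfiguration in #3 means changing every job's `--master` from YARN to a Spark standalone master. A minimal sketch of the difference (the hostname, class name, and jar path below are hypothetical placeholders):

```shell
# Today: submit a streaming job to Cluster2's YARN
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.StreamingJob \
  streaming-job.jar

# Solution #3: submit to a Spark standalone master co-located with Kafka
# (assumes a standalone master was started on kafka-node-1, default port 7077)
spark-submit \
  --master spark://kafka-node-1:7077 \
  --deploy-mode cluster \
  --class com.example.StreamingJob \
  streaming-job.jar
```

The flag change itself is small; the real cost is re-testing resource allocation, failure recovery, and monitoring for every job under the standalone scheduler.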

For now we would choose solution #1, but it would be great if we could combine the power of HDP and HDF on the same nodes.

4 Replies

@Alexey Vovchenko

The latest version of Ambari supports installing and managing both HDP and HDF. If you upgrade Ambari on the HDP cluster, you could install the HDF management pack and then install Kafka via Ambari that way.
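The mpack route sketched above boils down to a couple of commands on the Ambari server host (the tarball path and version are placeholders; get the actual HDF mpack URL for your release from the Hortonworks/HDF documentation):

```shell
# Install the HDF management pack into Ambari
ambari-server install-mpack \
  --mpack=/tmp/hdf-ambari-mpack-<version>.tar.gz \
  --verbose

# Restart Ambari so it picks up the new stack definition
ambari-server restart
```

After the restart, Kafka (and the other HDF services) become available in Ambari's "Add Service" wizard alongside the existing HDP services.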

New Contributor

Thanks a lot for the help!

@Michael Young, will Ambari 2.5.0.3 support both HDP and HDF (via the management pack), or is Ambari 2.5.1.0 required?

Explorer

@Michael Young, I am using Ambari 2.6.0.0. Will this Ambari support the NiFi demo if I install the mpack? I have tried to make it work but failed. Could you please help me run NiFi from the HDP sandbox? Thank you.