Confusion with HDF and HDP and their capabilities

Rising Star

Hello, so the bottom line is, we want to have the power of certain packages from HDP such as Hive, Pig and Flume, but we also want the ability to manage our dataflow using NiFi. We currently have HDP installed, which gives us no option to install NiFi. I am concerned about wiping our HDP installation and installing HDF, because I don't see Flume, Hive or Pig (among others) listed as included packages. (I see we can add services, but I have been searching and cannot find evidence that those packages are available to add.)

1. Is it possible to add NiFi to HDP and then manage it through Ambari?

2. If not, and we must have HDF, can anyone point me to a list of the services we can add to HDF? (Hopefully Flume, Pig, Hive etc. will be part of it.)

Not sure I understand why we cannot have all of these awesome services from Hortonworks working together and have to pick one or the other.

Thanks for any assistance.

5 REPLIES

Master Guru

@Eric Lloyd

At this time, HDF and HDP are two separate product lines, each managed by its own Ambari installation. So you would typically have your Ambari-managed HDP cluster up and running alongside a separate Ambari-managed HDF cluster. NiFi within the HDF cluster can communicate with the HDP services; they do not need to be part of the same cluster.

------------------

The latest version of HDF includes the following services:

NiFi

Storm

Kafka

ZooKeeper

Ambari

Ranger

--------------------

Thanks,

Matt

Master Guru

@Eric Lloyd

There is no official Hortonworks release at this time that supports having both HDF and HDP services managed under a single Ambari, but others have been successful at deploying the NiFi service within HDP clusters. There are drawbacks: doing so breaks the upgrade capability of HDP, and it would be hard to get any kind of support for that particular setup. But in case you want to play around with that...

https://github.com/abajwa-hw/ambari-nifi-service

Matt

Contributor

@Eric Lloyd

According to the Hortonworks reference material, the fundamental difference is:

HDF - To handle Data in Motion

HDP - To handle Data at Rest

Rising Star

That seems simple and reasonable but I think there is an overlap.

Kafka and Storm are clearly services to assist with Data in Motion and yet they are on both HDP and HDF.

The issue here is that NiFi, which is the superior data management tool, is not on HDP. It's unfortunate that it requires an additional HDF cluster before NiFi can be used with the processing power of HDP on another cluster. Sometimes hardware constraints prevent another cluster from being created. I suspect that, for a project that can only have one cluster, the best decision would be to keep the processing power of HDP and just use Flume instead of NiFi, the amazing tool they created.

Cloudera Employee

Hi, with HDF 3.0 and Ambari 2.5.1, both HDP and HDF can be managed under one Ambari. Have a look at

https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.2/index.html
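One way to check which services a single Ambari instance is managing is to list the cluster's services through Ambari's REST API (`/api/v1/clusters/<cluster>/services`). The following is only a rough sketch: the helper names, the base URL, cluster name, and credentials you pass in are all hypothetical, and the service identifiers `NIFI` and `HIVE` are assumptions about how those services are named in your installation.

```python
import base64
import json
import urllib.request


def service_names(payload):
    """Return the sorted service names found in an Ambari services listing."""
    return sorted(
        item["ServiceInfo"]["service_name"] for item in payload.get("items", [])
    )


def fetch_services(base_url, cluster, user, password):
    """Fetch the services listing for one cluster from Ambari's REST API.

    base_url, cluster, user and password are placeholders supplied by the caller.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/services"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def has_both_stacks(names):
    """True when NiFi and Hive both appear (assumed names: NIFI, HIVE)."""
    return "NIFI" in names and "HIVE" in names
```

For example, `has_both_stacks(service_names(fetch_services(base_url, cluster, user, password)))` would report whether NiFi and Hive are registered in the same Ambari-managed cluster, under the naming assumptions above.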

Thanks,

Avijeet