Hello, so the bottom line is: we want the power of certain packages from HDP such as Hive, Pig, and Flume, but we also want the ability to manage our dataflow using NiFi. We currently have HDP installed, which gives us no option to install NiFi. I am hesitant to wipe our HDP installation and install HDF, because I don't see Flume, Hive, or Pig (among others) listed as included packages. (I see we can add services, but I have been searching and cannot find evidence that those packages are available to add.)
1. Is it possible to add NiFi to HDP and then manage it through Ambari?
2. If not, and we must have HDF, can anyone point me to a list of the services we can add to HDF? (Hopefully Flume, Pig, Hive, etc. will be among them.)
Not sure I understand why we cannot have all of these awesome services from Hortonworks working together and instead have to pick one or the other.
Thanks for any assistance.
At this time, HDF and HDP are two separate product lines, each managed by its own Ambari installation. So you would typically have your HDP-managed cluster up and running, and a separate Ambari-managed HDF cluster up and running; NiFi within the HDF cluster can communicate with the HDP services. They do not need to be part of the same cluster.
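For example, a NiFi instance on the HDF cluster can write into the HDP cluster's HDFS through a PutHDFS processor. A minimal sketch of the relevant processor properties, assuming the HDP client configuration files have been copied to a path the NiFi node can read (the paths and directory below are illustrative, not from the original post):

```
# PutHDFS processor properties (values are illustrative)
Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Directory                      : /data/landing
Conflict Resolution Strategy   : replace
```

The key point is that NiFi only needs network access and the HDP cluster's client configs to act as a remote client; it does not need to be installed on the HDP nodes themselves.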
The latest version of HDF includes the following services:
While there is no official Hortonworks release that supports having both HDF and HDP services managed under a single Ambari instance at this time, others have been successful at deploying the NiFi service within HDP clusters. There are drawbacks: doing so breaks the upgrade capability of HDP, and it would be hard to get any kind of support for that particular setup. But in case you want to play around with that...
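One community approach (unsupported, and based on a community-maintained Ambari service definition rather than an official Hortonworks artifact) is to drop a NiFi service definition into the Ambari stack directory and restart Ambari. Roughly, assuming an HDP 2.4 stack (adjust the version to match your cluster):

```shell
# Unsupported, community-maintained approach: add a NiFi service
# definition to the HDP stack so Ambari can install and manage NiFi.
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services
git clone https://github.com/abajwa-hw/ambari-nifi-service.git NIFI
ambari-server restart
# Then add NiFi through Ambari's "Add Service" wizard.
```

Again, this breaks HDP's supported upgrade path, so treat it as an experiment rather than a production setup.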
That seems simple and reasonable, but I think there is an overlap. Kafka and Storm are clearly services that assist with data in motion, and yet they appear in both HDP and HDF.
The issue here is that NiFi is not in HDP, even though it is the superior dataflow management tool. It's unfortunate that using NiFi requires an additional cluster for HDF so that it can work alongside the processing power of HDP on another cluster; sometimes hardware constraints prevent a second cluster from being created. I suspect that for a project limited to one cluster, the best decision would be to keep the processing power of HDP and just use Flume instead of NiFi, the amazing tool Hortonworks created.