Differences between HDP (Hortonworks Data Platform) and HDF (Hortonworks DataFlow)

Contributor

Hello, I'm new to the Big Data domain and I want to know the main differences between HDF (Hortonworks DataFlow) and HDP (Hortonworks Data Platform), that is, the use cases and the architecture (components) of each one.

6 REPLIES

Master Mentor

Contributor

Thank you.

I understood from the last link that:

HDF is used to handle data in motion.

HDP is used to handle data at rest.

But HDP contains Storm (real-time message processing) and Kafka (a distributed messaging system).

So can we say that HDP can also be used to handle data in motion?

Contributor

Hi,

Yes, HDP (2.6.x) currently does contain Kafka and Storm, but according to the release notes https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_release-notes/content/deprecated_items.h... those components will be removed from HDP starting with version 3.0.0.

This means that in the near future HDP will no longer handle data in motion.

Contributor

Thank you @Andres Koitmäe

But according to this link, Storm and Kafka will not be removed from HDP version 3.0.0.

Contributor

Yes, they will be moved out of HDP starting with HDP 3.0.0.

The following information is taken from the release notes:

The following components are marked moving from HDP and will be moved in a future HDP release to an alternative Hortonworks Subscription and Offering:

Component or Capability    Status    Marked Moving as of    Target Release for Move
Apache Accumulo            Moving    HDP 2.6.0              HDP 3.0.0
Apache Kafka               Moving    HDP 2.6.0              HDP 3.0.0
Apache Storm               Moving    HDP 2.6.0              HDP 3.0.0
Cloudbreak                 Moving    HDP 2.6.0              HDP 3.0.0


What is the reason behind the separation into HDP and HDF? Companies very often need both real-time and batch data processing, so why not offer a single package?