Support Questions


What is the difference between HDP Repo and Apache tarball

Explorer

I am trying to build a big-data platform with Hadoop, Hive, HBase, Pig, Storm, and ZooKeeper. As far as I know, I can either install each tool from the tarball provided on the Apache website, or use the HDP repositories and Ambari to set things up. (I am not looking at Cloudera.) According to their respective documentation, both come under the Apache License 2.0.

  1. What is the difference between the two, i.e. Apache tarball vs. HDP repo?
  2. Is an enterprise license required to use the HDP repositories in a production system?
  3. What is the best way to do things in a production system?
1 ACCEPTED SOLUTION

Expert Contributor

@Anas A

1) HDP is a stack maintained by Hortonworks: a collection of services, at versions certified by Hortonworks to work together as a Hadoop system. With a given version of the HDP "stack", you get a recommended set of service versions installed.

You can see the growth of the HDP stack in the diagram titled "Ongoing innovation in Apache" here:

http://hortonworks.com/products/data-center/hdp/
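To make the difference concrete, here is a rough sketch of the two install paths for a single service. This is illustrative only: the exact download URLs, repo file, version numbers, and package names are assumptions, not taken from the docs above.

```shell
# Path 1: Apache tarball -- you download, unpack, and configure each
# service yourself, and YOU are responsible for picking mutually
# compatible versions of Hadoop, Hive, HBase, etc.
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -xzf hadoop-2.7.3.tar.gz -C /opt
export HADOOP_HOME=/opt/hadoop-2.7.3
# ...then hand-edit core-site.xml, hdfs-site.xml, and so on,
# and repeat the whole process for every other service.

# Path 2: HDP repo -- register the Hortonworks yum repository once,
# then install packages whose versions are certified to work together.
# (Repo URL and package names below are illustrative.)
wget -O /etc/yum.repos.d/hdp.repo \
  http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.3.0.0/hdp.repo
yum install -y hadoop hadoop-hdfs hadoop-yarn
# In practice, Ambari drives this repo-based install for you across
# the whole cluster, instead of running yum on each node by hand.
```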

2) You don't need an enterprise license to use the HDP repo. HDP is completely open source.

3) Before starting on a production system, you may want to install the sandbox and get familiar with HDP:

http://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/

and then go ahead and look at:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/ch_gett...

To get a starting point into the HDP docs, look at:

http://hortonworks.com/downloads/#data-platform

and

http://docs.hortonworks.com/index.html -- this has docs for every version of HDP and Ambari.


2 REPLIES


Super Guru

@Anas A

@sbhat is correct. I would add to her response that the HDP stack is 100% open source, based on Apache projects. It is a tested platform, which ensures that tools from the ecosystem work together and deliver enterprise-level quality. Taking the tools straight from Apache does not guarantee that they will work smoothly together.

There is no license fee associated with HDP. You can use the distribution as-is; however, enterprises often elect to purchase paid support so that they receive 24x7 support, get a chance to influence the roadmap, or receive special attention on critical issues. Hortonworks engineers are actively involved in developing the Hadoop ecosystem tools, and they can help address bugs or add features the community would like to have.

The best way for you to start would be to download the sandbox, as @sbhat suggested.

Good luck!