How can I build HDP from source?


New Contributor

Hey,

I would like to build some HDP components (e.g. Hive) from source so that I can apply local patches. I am not planning to use the support provided by HDP for these (it's understandable that HDP can't support users with patched deployments).

Can someone please point me to your build scripts (the ones that yield .deb packages) and to documentation on how I can go about doing this?

We use an Ubuntu 12.04-based environment (and eventually want to move to HDP 2.4 on Debian 7). We want our patched packages (.deb files with a higher release number) to replace the current versions as a normal package upgrade.


Re: How can I build HDP from source?

@Smit Shah The actual code for building a particular release of HDP, which includes a specific release point of the Apache code plus a few Apache-accepted patches for critical bug fixes (documented in the HDP Release Notes), is publicly available on GitHub. For Hive, for example, it is at https://github.com/hortonworks/hive-release . To find the source for a particular release of HDP, look for the HDP version identifier under the GitHub tags (not branches). For instance, the Hive source code for HDP-2.3.4.7 is at https://github.com/hortonworks/hive-release/tree/HDP-2.3.4.7-tag . The other components have similar "<component>-release" repositories with tagged release points.
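As a rough illustration of the naming scheme described above, here is a minimal Python sketch that derives the repository URL, the tag name, and a `git clone` command for a given component and HDP version. It assumes that the "HDP-<version>-tag" pattern seen for Hive HDP-2.3.4.7 also holds for the component and version you need, which should be checked against the repository's tag list. The function names and the destination directory are hypothetical.

```python
# Sketch only: builds names from the "<component>-release" / "HDP-<version>-tag"
# pattern quoted in the reply. Whether a given tag actually exists must be
# verified on GitHub; this code does not contact the network.

GITHUB_ORG = "https://github.com/hortonworks"


def release_repo(component: str) -> str:
    """URL of the '<component>-release' repository, e.g. hive-release."""
    return f"{GITHUB_ORG}/{component}-release"


def release_tag(hdp_version: str) -> str:
    """Tag name for an HDP version (assumed pattern: HDP-<version>-tag)."""
    return f"HDP-{hdp_version}-tag"


def tag_url(component: str, hdp_version: str) -> str:
    """Browser URL of the tagged source tree."""
    return f"{release_repo(component)}/tree/{release_tag(hdp_version)}"


def clone_command(component: str, hdp_version: str, dest: str) -> list:
    """Argument list for a shallow clone of the tag (git accepts tags for --branch)."""
    return [
        "git", "clone", "--depth", "1",
        "--branch", release_tag(hdp_version),
        release_repo(component) + ".git",
        dest,
    ]


if __name__ == "__main__":
    print(tag_url("hive", "2.3.4.7"))
    # "hive-src" is a hypothetical destination directory.
    print(" ".join(clone_command("hive", "2.3.4.7", "hive-src")))
```

The command is returned as a list so it can be passed to `subprocess.run` without shell quoting issues, or simply printed and pasted into a terminal.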

Cloning this repository gives you an Apache-formatted build directory. Building it gives you objects which can be safely substituted for the objects on your HDP-installed cluster of the same version, although not in the form of debs.

The build environment has a number of complex prerequisites, which are documented in Apache. I suggest you start with https://wiki.apache.org/hadoop/HowToContribute (Dev Environment Setup). After doing everything recommended there, additionally do whatever is needed for the specific Hadoop component you are going to work on. For Hive, for example, see https://cwiki.apache.org/confluence/display/Hive/GettingStarted (especially the subsection "Building Hive from Source") and https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ

Re: How can I build HDP from source?

New Contributor

Thanks for the reply!

We wanted to build debs the same way HDP builds them, so that we can publish our patched Hive to our apt repo with a bumped-up version and cleanly upgrade the installed component.

So basically, once I have built the artifact in an environment-independent form (an archive or whatever), how do I go about packaging it as a deb with the same control file and pre/post-inst/rm scripts as the one in the HDP repo?

The idea is to be able to upgrade or roll back patch releases in a predictable and well-understood way, which a package manager like apt provides. So we want to refrain from installing things on production in ad-hoc, copy-and-overwrite ways.
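To make the "higher release number" idea concrete, here is a small Python sketch that appends a local suffix to a stock package version and, where `dpkg` is installed, asks `dpkg --compare-versions` whether the result sorts higher. The version string "1.2.1-1" and the "+local" suffix are made-up placeholders, not the scheme HDP uses for its packages, and the function names are hypothetical.

```python
import shutil
import subprocess


def patched_version(stock_version: str, patch_level: int, suffix: str = "local") -> str:
    """Append '+<suffix><n>' to a stock version, e.g. 1.2.1-1 -> 1.2.1-1+local1."""
    if patch_level < 1:
        raise ValueError("patch_level must be >= 1")
    return f"{stock_version}+{suffix}{patch_level}"


def compare_command(new: str, old: str) -> list:
    """dpkg invocation that exits with status 0 when `new` is greater than `old`."""
    return ["dpkg", "--compare-versions", new, "gt", old]


def sorts_higher(new: str, old: str):
    """True/False according to dpkg, or None when dpkg is not available."""
    if shutil.which("dpkg") is None:
        return None
    return subprocess.run(compare_command(new, old)).returncode == 0


if __name__ == "__main__":
    stock = "1.2.1-1"  # hypothetical stock version, for illustration only
    ours = patched_version(stock, 1)
    print(ours, "sorts higher than", stock, "->", sorts_higher(ours, stock))
```

Checking the ordering with dpkg itself, rather than reimplementing Debian's comparison rules, keeps the sketch aligned with what apt will actually do when it decides whether a package is an upgrade.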

Re: How can I build HDP from source?

Yes, understood. Unfortunately, the tools for building these debs are not publicly available. The reason is that our support model is built on open source, and the Apache open source projects only build per-component jars, not whole-release sets of packages. Apache Bigtop was an attempt at specifying whole-release packaging, but to the best of my knowledge none of the major supported Hadoop releases use exactly the Bigtop format. Sorry.

Re: How can I build HDP from source?

New Contributor
Hi Matt,
Thanks for the clarification. I understand that these build scripts are not provided by Apache (and that you are not using Bigtop packages either). However, I see a lot of value in open-sourcing the build scripts that HDP uses internally. This would allow HDP users to keep most components on stock HDP-built package versions, patch select components, and roll them out in a way that will not break Ambari integration or integration with other components.
I am not implying HDP should support users running patched packages, but making the build scripts available provides much-needed freedom to release patches with confidence and definitely goes a long way in setting users up for success.

Re: How can I build HDP from source?

Hi Smit, you might want to post an "Idea" proposing this. If enough people vote it up, it will be considered by Product Management. Regards -- Matt

Re: How can I build HDP from source?

New Contributor

Thanks for the suggestion!

Is Hadoop Core the right sub-forum to propose the idea of open-sourcing the build scripts?

Re: How can I build HDP from source?

Yes, I think Hadoop Core would be the best "track" for the idea.
