I would like to build some HDP components (e.g. Hive) from source so that I can add local patches. I am not planning to use the support provided by HDP (it's understandable that HDP can't support users with a patched deployment).
Can someone please point me to your build scripts (the ones that yield .deb packages) and to documentation on how I can go about doing so?
We use an Ubuntu 12.04 based environment (and eventually want to move to HDP 2.4 on Debian 7) and want the patched packages (.deb with a higher release number) to replace the current version as a normal package upgrade.
@Smit Shah The actual code for building a particular release of HDP, which includes a specific release point of the Apache code plus a few Apache-accepted patches for critical bug fixes (documented in the HDP Release Notes), is publicly available on GitHub, e.g. for Hive: https://github.com/hortonworks/hive-release . To find the version for a particular release of HDP, look for the HDP version identifier under the GitHub Tags (not branches). For instance, the Hive source code from HDP-184.108.40.206 is at https://github.com/hortonworks/hive-release/tree/HDP-220.127.116.11-tag . The other components have similar "<component>-release" repositories with tagged release points.
Cloning this repository gives you an Apache-formatted build directory. Building it gives you objects which can be safely substituted for the objects on your HDP-installed cluster of the same version, although not in the form of debs.
The build environment has a number of complex prerequisites, which are documented in Apache. I suggest you start with https://wiki.apache.org/hadoop/HowToContribute (Dev Environment Setup), then, after doing everything recommended there, additionally do whatever is needed for the specific Hadoop component you are going to work on; e.g. for Hive see https://cwiki.apache.org/confluence/display/Hive/GettingStarted (especially the subsection "Building Hive from Source") and https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ .
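As a rough illustration of the "check out the tagged release point" step, here is a minimal Python sketch that assembles the `git clone` command for one tag of the Hive repository mentioned above. The helper names `clone_command` and `clone_tag` and the destination directory are made up for this example, and the tag string is a placeholder that you would replace with the real tag name shown under the GitHub Tags list.

```python
import subprocess
from typing import List

# Repository URL taken from the post above.
REPO_URL = "https://github.com/hortonworks/hive-release"


def clone_command(tag: str, dest: str) -> List[str]:
    """Build a git command that fetches only the given tag (shallow clone).

    `git clone --branch` also accepts tag names; git then checks out that
    tag with a detached HEAD.
    """
    return ["git", "clone", "--depth", "1", "--branch", tag, REPO_URL, dest]


def clone_tag(tag: str, dest: str) -> None:
    """Run the clone; raises CalledProcessError if git fails."""
    subprocess.run(clone_command(tag, dest), check=True)


if __name__ == "__main__":
    # Placeholder tag: substitute the exact tag name for your HDP release.
    print(" ".join(clone_command("HDP-<version>-tag", "hive-src")))
```

After the clone, the build itself follows the Apache instructions linked above for the component in question.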
Thanks for the reply!
We wanted to build debs the same way HDP builds them, so that we can just push a version-bumped build of our patched Hive to our apt repo and cleanly upgrade the installed component.
So basically, once I have built the artifact in an environment-independent form (archive or whatever), how do I go about packaging it as a deb with the same control file and pre/post inst/rm scripts as the one in the HDP repo?
The idea is to have the capability to upgrade or roll back patch releases in a predictable and well-understood way, which a package manager like apt provides. So we want to refrain from installing things on production in ad-hoc copy-overwrite ways.
Yes, understood. Unfortunately, the tools for building these debs are not publicly available. The reason is that our support model is built on the open source code, and the Apache open source projects only build per-component jars, not whole-release sets of packages. Apache Bigtop was an attempt at specifying whole-release packaging, but to the best of my knowledge none of the major supported Hadoop releases use exactly the Bigtop format. Sorry.
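One generic workaround, which is not how HDP builds its packages and is only a sketch under assumptions, is to unpack an existing .deb with `dpkg-deb -R` (this keeps the `DEBIAN/` directory with the control file and maintainer scripts), swap in the rebuilt files, raise the `Version` field, and repack with `dpkg-deb --build`. The Python below shows the version bump and the two dpkg-deb invocations. The function names, the `+local1` suffix, and the sample control text are all hypothetical; check the resulting ordering with `dpkg --compare-versions` before publishing to your repo.

```python
import re
from typing import List


def bump_control_version(control_text: str, suffix: str = "+local1") -> str:
    """Return control_text with `suffix` appended to the Version field.

    The suffix is an assumed local convention, not something HDP defines.
    """
    pattern = re.compile(r"^(Version:[ \t]*)(\S+)[ \t]*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: m.group(1) + m.group(2) + suffix, control_text, count=1
    )
    if count == 0:
        raise ValueError("no Version field found in control text")
    return new_text


def repack_commands(original_deb: str, stage_dir: str, output_deb: str) -> List[List[str]]:
    """Commands to unpack a .deb (including DEBIAN/) and rebuild it later."""
    return [
        ["dpkg-deb", "-R", original_deb, stage_dir],        # raw extract
        ["dpkg-deb", "--build", stage_dir, output_deb],     # repack
    ]


if __name__ == "__main__":
    # Hypothetical control file content, for illustration only.
    sample = "Package: hive\nVersion: 1.2.1-3\nArchitecture: all\n"
    print(bump_control_version(sample))
    for cmd in repack_commands("hive-orig.deb", "stage", "hive-patched.deb"):
        print(" ".join(cmd))
```

Between the two commands you would replace the jars under `stage/` with your patched build and write the bumped control text back to `stage/DEBIAN/control`. Keeping the original maintainer scripts untouched means install and removal behave as they did for the stock package, and rolling back is an ordinary apt downgrade to the previous version.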
Hi Smit, you might want to post an "Idea" proposing this. If enough people vote it up, it will be considered by Product Management. Regards -- Matt
Thanks for the suggestion!
Is Hadoop Core the right sub-forum to propose the idea of open-sourcing the build scripts?