New Contributor
Posts: 5
Registered: ‎08-15-2013

How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

Hi,

 

I recently started working with OpenTSDB, Hadoop, and HBase.

At present I have integrated OpenTSDB into my product installation and am running it on top of HBase/Hadoop from Apache.

Now I would like to move to Hadoop/HBase/ZooKeeper from Cloudera.

 

I would like to take the binary/source RPMs and integrate them into my product installation repo.

Could you please suggest which Cloudera packages I need for this?

 

Thanks & Regards,

OC.

Cloudera Employee
Posts: 62
Registered: ‎07-29-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

Hi OC,

The installation instructions are available at http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/C...

 

That's for CDH4.3.0, the latest release of CDH at this point.

 

For Hadoop, which packages you install, and on which hosts, depends on how you want to set up your cluster (distributed vs. pseudo-distributed) and which MapReduce version you want to use (MR1 vs. MR2). For example, the package for an MR1 pseudo-distributed cluster is hadoop-0.20-conf-pseudo.

The HBase package is simply called hbase.

So, once you have added the CDH4 repository on your host, a simple command like sudo yum install hadoop-0.20-conf-pseudo hbase would work.

Feel free to post back if you have any further questions.

New Contributor
Posts: 5
Registered: ‎08-15-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

Hi,

Thanks for the response.
I would like to install on a cluster whose size varies from customer to customer.
I prefer MR1, as I see MR2 is not backward compatible (if I understood correctly; please correct me).
We cannot install using "yum" because we always need to ship our solution with source/binary RPMs and install them as part of "make install".

My target OS is RHEL6.
Will it work if I just take the binary RPMs from CDH4 and install them?

Thanks & Regards,
OC
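
On installing without a yum repository: a minimal sketch, assuming the CDH4 binary RPMs have already been copied into one local directory, of how a "make install" step could hand them to rpm in a single transaction. The function name install_local_rpms and the DRY_RUN switch are hypothetical names for this example and are not part of CDH. By default it only prints the command it would run.

```shell
# Sketch only: install every RPM found in a local directory without a yum repo.
# install_local_rpms and DRY_RUN are hypothetical names used for illustration.
install_local_rpms() {
    rpm_dir="$1"

    # Expand the glob into the positional parameters.
    set -- "$rpm_dir"/*.rpm

    # An unmatched glob stays literal, so the first "file" will not exist.
    if [ ! -e "$1" ]; then
        echo "no rpm files found in $rpm_dir" >&2
        return 1
    fi

    # Passing all packages to one rpm call lets rpm resolve the
    # dependencies that exist among the packages in the directory.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo rpm -Uvh "$@"
    else
        rpm -Uvh "$@"
    fi
}
```

Note that rpm does not download anything, so any dependency that is not in the directory must already be installed on the target host.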
Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

"MR2 is not backward compatible (if I understood correctly; please correct)."

 


"The Java client APIs for MR2 are compatible with the corresponding APIs in MR1 (see for example JobClient). There is no change required to any code that was written using the old client APIs; applications using such client APIs can directly switch to use MR2.

While the Java client API is compatible in MR1 and MR2, configuration properties are generally not. Due to the major changes that were introduced in MR2, a number of old configuration properties are no longer valid and new properties were added.

Another class of incompatibility between MR2 and MR1 is the use of HTTP servlets. A number of the MR1 servlets are no longer available in MR2 and new servlets were added."

 

http://blog.cloudera.com/blog/2012/07/experimenting-with-mapreduce-2-0/

Explorer
Posts: 12
Registered: ‎07-29-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

One possibility is to look at https://github.com/cloudera/cdh-package/. With quite a bit of hacking I was able to build components of cdh4.3 using it. With some understanding of bash and make, you should be able to use it as a base and modify it for your purposes.

 

There are branches for all of the CDH releases.

Bryan Beaudreault
Senior Technical Lead, Data Ops
HubSpot, Inc
New Contributor
Posts: 5
Registered: ‎08-15-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

I tried to build Hadoop from the source RPM at http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/SRPMS/

My product installation runs Java 1.7, but the Cloudera build requires 1.6:

 

[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Hadoop Main 2.0.0-cdh4.3.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-enforcer-plugin:1.0:enforce (default) @ hadoop-main ---
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
Detected JDK Version: 1.7.0-25 is not in the allowed range [1.6.0,1.6.1000}].
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................ FAILURE [0.434s]

Could you please suggest how to build with Java 1.7?

 

Thanks & Regards,

OC.
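
To make the enforcer message above concrete, here is a simplified sketch of the check it reports: the detected JDK version must fall between 1.6.0 and 1.6.1000. The function name jdk_allowed is hypothetical, and the comparison only looks at the first three numeric fields, so it approximates Maven's real version-range rules rather than reimplementing them.

```shell
# Sketch only: approximate the range check from the enforcer message.
# jdk_allowed is a hypothetical helper, not part of Maven or CDH.
# Returns 0 when the version lies within 1.6.0 .. 1.6.1000.
jdk_allowed() {
    # Split strings such as "1.7.0-25" or "1.6.0_45" on '.', '-' and '_'.
    old_ifs=$IFS
    IFS='._-'
    set -- $1
    IFS=$old_ifs

    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    [ "$major" -eq 1 ] && [ "$minor" -eq 6 ] &&
        [ "$patch" -ge 0 ] && [ "$patch" -le 1000 ]
}

# Report the outcome for the JDK from the log above.
if jdk_allowed "1.7.0-25"; then
    echo "1.7.0-25 allowed"
else
    echo "1.7.0-25 rejected"
fi
```

As far as I know, the Maven enforcer plugin can be bypassed by passing -Denforcer.skip=true to mvn, but nothing in this thread confirms that the CDH 4.3.0 sources then build cleanly under JDK 1.7.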

 

 

Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

New Contributor
Posts: 5
Registered: ‎08-15-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

I tried 4.3.0. Still the same issue.
New Contributor
Posts: 5
Registered: ‎08-15-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

If I build the CDH Hadoop source RPM with JDK 1.6 and install it on a system running JRE/JDK 1.7, will it work fine?
Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Re: How to integrate Cloudera Hadoop/Hbase/Zookeeper packages into my product installation

It should be fine to run with a later version. It is not fine to compile with a later version and run with an earlier version.
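
The reason is the class file version: every .class file records a major version (50 for Java 6, 51 for Java 7), and a JVM refuses to load classes whose major version is newer than it supports. A minimal sketch for checking what a build produced, assuming a .class file has been extracted from one of the jars; class_major_version is a hypothetical helper name.

```shell
# Sketch only: print the class file major version (50 = Java 6, 51 = Java 7).
# class_major_version is a hypothetical helper name.
class_major_version() {
    # Bytes 0-3 are the magic number and bytes 4-5 the minor version;
    # bytes 6-7 hold the big-endian major version.
    # od prints them as two unsigned decimals, for example "0 50".
    set -- $(od -A n -t u1 -j 6 -N 2 "$1")
    echo $(( $1 * 256 + $2 ))
}
```

For example, class_major_version SomeClass.class printing 50 means the class was compiled for Java 6 and will load on both a Java 6 and a Java 7 JVM, while 51 means it needs Java 7 or later.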