
Building Spark from source with Thrift and Hive and installing to Quickstart VM


I'm using two versions of the Quickstart VM, 5.4.0 and 5.5.0, in different projects and POCs.

Both have default Spark installations (versions 1.3.0 and 1.5.0 respectively) and Hive v1.1.0. Now I want to build Spark v1.4.0 from source, in addition to what is already installed, on both VMs. In the case of Quickstart VM 5.5.0, I also want to add sbin/start-thriftserver.sh, which is missing from its default Spark.

According to the Apache Spark documentation, this is how to build from source with the Thrift server:

$ mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests clean package

 

 

However, this binds to Hive 0.13.1 by default. Since I am building from source, I want to bind it to the Hive version currently on the Quickstart VMs, v1.1.0 (or, more precisely, 1.1.0-cdh5.4.0 and 1.1.0-cdh5.5.0).

How can I set up this binding? Should I change the hive.version property in the Spark source pom.xml from 0.13.1 to 1.1.0 before building?

<hive.version>0.13.1a</hive.version>

Do I need to take the -cdh5.x.x suffixes in the Hive version into account? Do I need to configure a particular Cloudera repository?

Re: Building Spark from source with Thrift and Hive and installing to Quickstart VM


Take a look at this post from Clairvoyant:

http://blog.clairvoyantsoft.com/2015/11/how-to-upgrade-spark-on-cdh5-5/

 

You probably want something like -Dhadoop.version=2.6.0-cdh5.5.0
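
Combined with the build command from the question, that suggestion might look roughly like the sketch below. The helper name spark_build_cmd and its argument are hypothetical, and the script only prints the command so it can be reviewed before it is run. It assumes Maven can reach a repository that hosts the CDH Hadoop artifacts, and it does not address the Hive version question.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a Maven command for building
# Spark against the Hadoop version shipped with a given CDH release.
# Assumption: the CDH Hadoop version is 2.6.0 plus a -cdh<release> suffix,
# as in the -Dhadoop.version=2.6.0-cdh5.5.0 example above.
spark_build_cmd() {
  cdh_release="$1"   # e.g. 5.4.0 or 5.5.0
  echo "mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh${cdh_release} -Phive -Phive-thriftserver -DskipTests clean package"
}

# Print the command for Quickstart VM 5.5.0; run it from the Spark source root.
spark_build_cmd 5.5.0
```

For the 5.4.0 VM, call spark_build_cmd 5.4.0 instead.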