CDH 5.5 does not have Spark Thrift Server

Explorer

I upgraded from CDH 5.4.0 to 5.5.0, and I can no longer find $SPARK_HOME/sbin/start-thriftserver.sh. My use case relies on Spark's Thrift server to expose Hive tables to Tableau for visualization. Is the script located elsewhere? If not, is there a workaround?

13 REPLIES

Re: CDH 5.5 does not have Spark Thrift Server

Contributor

You can find the scripts here:

https://github.com/apache/spark/blob/master/sbin/start-thriftserver.sh

https://github.com/apache/spark/blob/master/sbin/stop-thriftserver.sh

 

However, the Spark that ships with CDH 5.5 does not include the Spark Thriftserver.

 

Take a look at this post from Clairvoyant to learn how to build from source:

http://blog.clairvoyantsoft.com/2015/11/how-to-upgrade-spark-on-cdh5-5/

Re: CDH 5.5 does not have Spark Thrift Server

Super Collaborator

The Thrift server in Spark is not tested with the Hive version that ships in CDH, and might not be compatible with it.

Hive in CDH is 1.1 (patched), while Spark uses Hive 1.2.1. Because of that mismatch, you might see API issues during compilation or failures at run time.

 

Wilfred

Re: CDH 5.5 does not have Spark Thrift Server

Contributor

Wilfred,

How can one build a Spark release that includes the thrift server and links with the patched version in CDH?

 

Re: CDH 5.5 does not have Spark Thrift Server

Super Collaborator

You will need to change the build to pull in the right versions, as documented on the Spark pages. The Maven repository information for CDH is documented in our generic docs.

 

You would probably get something like

-Dhadoop.version=2.6.0-cdh5.4.0
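
To show where that flag fits, here is a minimal sketch that assembles a full Maven invocation around it and prints the result instead of running it. The profiles are the ones named in the Spark build documentation and elsewhere in this thread. The helper name `spark_build_args` is hypothetical, and the default version string is simply the example value above, so replace it with the one matching your CDH release. The sketch also assumes the CDH Maven repository is already configured for the build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the Maven arguments for a Spark build
# against a given CDH Hadoop version. Nothing is executed here.
spark_build_args() {
  local args=(
    -Pyarn -Phadoop-2.6
    "-Dhadoop.version=$1"
    -Phive -Phive-thriftserver
    -DskipTests clean package
  )
  printf '%s\n' "${args[*]}"
}

# Assumption: default taken from the example above; set HADOOP_VERSION
# to the string that matches your cluster.
HADOOP_VERSION="${HADOOP_VERSION:-2.6.0-cdh5.4.0}"

# Print the command so it can be reviewed before running it by hand
# from the root of the Spark source tree.
echo "build/mvn $(spark_build_args "$HADOOP_VERSION")"
```

Printing the command first makes it easy to confirm that the version string is the intended one before starting a long build.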

 

Wilfred

 

 

Re: CDH 5.5 does not have Spark Thrift Server

New Contributor

Mr. Arnold, hope you're doing well...

 

I went through this on MapR. The only issues I had were with running in secure mode, though that might be specific to MapR. If you run into issues, see:

http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server 

 

Chris Horvath

Re: CDH 5.5 does not have Spark Thrift Server

Explorer

Are there any plans to include the Spark Thrift Server natively in CDH Spark?

Re: CDH 5.5 does not have Spark Thrift Server

Contributor

I have finally managed to post instructions on how I rebuild Cloudera's Spark to include the Thrift server. In summary, you would:

 

git clone https://github.com/cloudera/spark.git
cd spark
./make-distribution.sh -DskipTests \
  -Dhadoop.version=2.6.0-cdh5.7.0 \
  -Phadoop-2.6 \
  -Pyarn \
  -Phive -Phive-thriftserver \
  -Pflume-provided \
  -Phadoop-provided \
  -Phbase-provided \
  -Phive-provided \
  -Pparquet-provided

The post goes into all the details and also provides a handy Vagrant environment in which to perform the build.
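
Once the build finishes, a quick check like the following can confirm that the Thrift server script made it into the distribution and show how it might be started. This is only a sketch: `has_thriftserver` is a hypothetical helper, `SPARK_DIST` is assumed to point at the `dist` directory that make-distribution.sh produces, and port 10000 is assumed to be free. The commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the distribution at $1 contains the
# Thrift server start script.
has_thriftserver() {
  [ -f "$1/sbin/start-thriftserver.sh" ]
}

# Assumptions: build output location and listening port.
SPARK_DIST="${SPARK_DIST:-$PWD/dist}"
THRIFT_PORT="${THRIFT_PORT:-10000}"

if has_thriftserver "$SPARK_DIST"; then
  # Printed, not run, so nothing is started by accident.
  echo "$SPARK_DIST/sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=$THRIFT_PORT"
  echo "$SPARK_DIST/bin/beeline -u jdbc:hive2://localhost:$THRIFT_PORT"
else
  echo "start-thriftserver.sh not found under $SPARK_DIST/sbin" >&2
fi
```

If beeline connects, a JDBC/ODBC client such as Tableau should be able to use the same host and port.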

 

 

Re: CDH 5.5 does not have Spark Thrift Server

Explorer

There is a request to add the Spark Thrift Server: https://issues.cloudera.org/browse/DISTRO-817

Please vote it up if you want to see that in CDH.

 

 

Re: CDH 5.5 does not have Spark Thrift Server

Contributor

Thanks. I also voted for this feature. Not only Tableau but also Excel and other apps would gain from it.

But isn't Hive on Spark a possible workaround?

As of CDH 5.7 it is enabled and works like a charm.
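
For anyone who wants to try that route, here is a rough sketch of running a single query through HiveServer2 with Spark as the execution engine. The function name `hive_on_spark_cmd`, the host `hs2.example.com`, the port, and the table `my_table` are all placeholders, and the command is printed rather than executed. It assumes Hive on Spark is already enabled on the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a beeline invocation that runs one query
# with Spark as Hive's execution engine.
# $1 = HiveServer2 host, $2 = port, $3 = query (all placeholders)
hive_on_spark_cmd() {
  printf 'beeline -u jdbc:hive2://%s:%s -e "SET hive.execution.engine=spark; %s"\n' "$1" "$2" "$3"
}

# Example with placeholder values; copy the printed line and adjust it.
hive_on_spark_cmd hs2.example.com 10000 "SELECT COUNT(*) FROM my_table;"
```

BI tools would connect to HiveServer2 through the usual Hive ODBC/JDBC driver in this setup, not to a Spark Thrift server.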