jdb
Explorer
Posts: 10
Registered: ‎07-19-2015

CDH 5.5 does not have Spark Thrift Server

I moved from CDH 5.4.0 to 5.5.0, but I can no longer find $SPARK_HOME/sbin/start-thriftserver.sh. My use case relies on Spark's Thrift server to expose Hive tables to my Tableau visualizations. Is the script located elsewhere? If not, what workaround can I use?

Explorer
Posts: 21
Registered: ‎10-27-2015

Re: CDH 5.5 does not have Spark Thrift Server

You can find the scripts here:

https://github.com/apache/spark/blob/master/sbin/start-thriftserver.sh

https://github.com/apache/spark/blob/master/sbin/stop-thriftserver.sh

 

However, the Spark build that ships with CDH 5.5 does not include the Spark Thrift server, so the scripts alone will not be enough.

 

Take a look at this post from Clairvoyant to learn how to build from source:

http://blog.clairvoyantsoft.com/2015/11/how-to-upgrade-spark-on-cdh5-5/

Cloudera Employee
Posts: 241
Registered: ‎01-16-2014

Re: CDH 5.5 does not have Spark Thrift Server

The Thrift server in Spark has not been tested with, and might not be compatible with, the Hive version in CDH.

Hive in CDH is 1.1 (patched), while Spark uses Hive 1.2.1. Because of that mismatch you might see API errors during compilation or failures at run time.

Wilfred

Explorer
Posts: 21
Registered: ‎10-27-2015

Re: CDH 5.5 does not have Spark Thrift Server

Wilfred,

How can one build a Spark release that includes the Thrift server and links against the patched Hive version in CDH?

Cloudera Employee
Posts: 241
Registered: ‎01-16-2014

Re: CDH 5.5 does not have Spark Thrift Server

You will need to change the build to pull in the right versions, as documented on the Spark build pages. The Maven repository information for CDH is documented in our generic docs.

You would probably end up with something like:

-Dhadoop.version=2.6.0-cdh5.4.0
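
As a rough sketch of how that flag could fit into a full build command for a given CDH release: the helper below only prints a Maven command line in the style of the Spark build documentation, it does not run anything. The function names `cdh_hadoop_version` and `spark_build_cmd` are made up for illustration, and the pattern `2.6.0-cdh<release>` is an assumption that matches the versions quoted in this thread; check it against the CDH Maven repository before building.

```shell
# Hypothetical helper: derive the Hadoop version string for a CDH release.
# Assumption: the artifacts follow the pattern 2.6.0-cdh<release>,
# as in the 5.4.0 and 5.7.0 examples in this thread.
cdh_hadoop_version() {
  local cdh_release="$1"
  printf '2.6.0-cdh%s\n' "$cdh_release"
}

# Print (do not run) a Maven command that enables the Hive and
# Thrift server profiles for the given CDH release.
spark_build_cmd() {
  local cdh_release="$1"
  printf 'mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=%s -Phive -Phive-thriftserver -DskipTests clean package\n' \
    "$(cdh_hadoop_version "$cdh_release")"
}
```

Calling `spark_build_cmd 5.5.0` prints the command so it can be reviewed before being run from the Spark source tree. Whether the result compiles against the patched Hive in CDH is a separate question, as noted above.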

 

Wilfred

 

 

New Contributor
Posts: 1
Registered: ‎03-03-2016

Re: CDH 5.5 does not have Spark Thrift Server

Mr. Arnold, hope you're doing well...

I went through this on MapR. The only issues I had were with running in secure mode, though that might be specific to MapR. If you run into issues, see:

http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server 
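
For reference, the basic (non-secure) flow that page describes could be wrapped roughly as follows. This is a sketch only: the function names are hypothetical, `SPARK_HOME` is assumed to point at a Spark build that actually includes the Thrift server, and the defaults of localhost and port 10000 are the ones given in the Spark documentation.

```shell
# Build the JDBC URL that beeline (or another JDBC/ODBC client) would use.
# Defaults follow the Spark documentation: localhost, port 10000.
thrift_jdbc_url() {
  local host="${1:-localhost}"
  local port="${2:-10000}"
  printf 'jdbc:hive2://%s:%s\n' "$host" "$port"
}

# Start the Thrift server on the given port, then open a beeline session.
# The start script launches a daemon and returns, so the server may need
# a few seconds before beeline can connect.
start_and_connect() {
  local port="${1:-10000}"
  "$SPARK_HOME/sbin/start-thriftserver.sh" --hiveconf hive.server2.thrift.port="$port" &&
    "$SPARK_HOME/bin/beeline" -u "$(thrift_jdbc_url localhost "$port")"
}
```

`thrift_jdbc_url node1 10015` prints `jdbc:hive2://node1:10015`, which is the kind of URL a BI tool would be pointed at. A secure cluster needs extra connection settings that this sketch does not cover.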

 

Chris Horvath

Explorer
Posts: 9
Registered: ‎07-28-2015

Re: CDH 5.5 does not have Spark Thrift Server

Are there any plans to include the Spark Thrift Server natively in CDH Spark?

Explorer
Posts: 21
Registered: ‎10-27-2015

Re: CDH 5.5 does not have Spark Thrift Server

I have finally posted instructions on how I rebuild Cloudera's Spark to include the Thrift server. In summary, you would run:

 

git clone https://github.com/cloudera/spark.git
cd spark
./make-distribution.sh -DskipTests \
  -Dhadoop.version=2.6.0-cdh5.7.0 \
  -Phadoop-2.6 \
  -Pyarn \
  -Phive -Phive-thriftserver \
  -Pflume-provided \
  -Phadoop-provided \
  -Phbase-provided \
  -Phive-provided \
  -Pparquet-provided

The post goes into all the details and also provides a handy Vagrant environment in which to perform the build.
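
After a build like the one above, a quick sanity check is to confirm that the control scripts made it into the output. The helper below is a minimal sketch: `has_thriftserver` is a made-up name, and it assumes the distribution was written to a directory such as `dist/` under the Spark source tree.

```shell
# Return success only if the given Spark distribution directory contains
# both Thrift server control scripts.
has_thriftserver() {
  local dist_dir="$1"
  [ -f "$dist_dir/sbin/start-thriftserver.sh" ] &&
    [ -f "$dist_dir/sbin/stop-thriftserver.sh" ]
}
```

For example, `has_thriftserver dist && echo "thrift server scripts present"`. This only checks that the scripts exist, not that the server starts or is compatible with the Hive version in CDH.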

 

 

Explorer
Posts: 9
Registered: ‎07-28-2015

Re: CDH 5.5 does not have Spark Thrift Server

There is an open request to add the Spark Thrift Server to CDH: https://issues.cloudera.org/browse/DISTRO-817

Please vote it up if you want to see it in CDH.

 

 

Contributor
Posts: 26
Registered: ‎01-11-2016

Re: CDH 5.5 does not have Spark Thrift Server

Thanks, I have also voted for this feature. Not only Tableau but also Excel and other applications would gain from it.

But isn't Hive on Spark a possible workaround?

As of CDH 5.7 it is enabled and works like a charm.
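
A minimal sketch of that workaround, assuming clients connect to HiveServer2 rather than to a Spark Thrift server: `hive.execution.engine` is the Hive setting that selects the engine, while the function names and the JDBC URL argument are placeholders for illustration.

```shell
# Prefix a query with the session setting that selects Spark as
# Hive's execution engine.
hive_on_spark_sql() {
  local query="$1"
  printf 'set hive.execution.engine=spark; %s\n' "$query"
}

# Run the query through HiveServer2 with beeline.
# The JDBC URL is supplied by the caller, e.g. jdbc:hive2://<host>:10000
run_on_spark() {
  local jdbc_url="$1"
  local query="$2"
  beeline -u "$jdbc_url" -e "$(hive_on_spark_sql "$query")"
}
```

For example, `hive_on_spark_sql 'SELECT count(*) FROM my_table;'` prints the statement that `run_on_spark` would send, where `my_table` is a placeholder table name.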
