Created 02-04-2016 08:09 PM
I moved from CDH 5.4.0 to 5.5.0, but I cannot find $SPARK_HOME/sbin/start-thriftserver.sh. My use case relies on Spark's Thrift server to expose Hive tables to my Tableau visualizations. Is the script located elsewhere? If not, is there a workaround?
Created 02-25-2016 03:10 PM
You can find the scripts here:
https://github.com/apache/spark/blob/master/sbin/start-thriftserver.sh
https://github.com/apache/spark/blob/master/sbin/stop-thriftserver.sh
However, the Spark that ships with CDH 5.5 does not include the Spark Thriftserver.
Take a look at this post from Clairvoyant to learn how to build from source:
http://blog.clairvoyantsoft.com/2015/11/how-to-upgrade-spark-on-cdh5-5/
Created 02-25-2016 07:26 PM
The Thrift server in Spark is not tested with, and might not be compatible with, the Hive version that is in CDH.
Hive in CDH is 1.1 (patched), while Spark uses Hive 1.2.1. You might see API issues during compilation or runtime failures because of that.
Wilfred
Created 02-29-2016 10:13 AM
Wilfred,
How can one build a Spark release that includes the thrift server and links with the patched version in CDH?
Created 02-29-2016 05:12 PM
You will need to change the build to pull in the right version as documented on the Spark pages. The maven repository information for CDH is documented in our generic docs.
You would probably end up with something like:
-Dhadoop.version=2.6.0-cdh5.4.0
Wilfred
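As a rough sketch of the above, the Maven invocation could be assembled as shown here. The profiles (-Pyarn, -Phadoop-2.6, -Phive, -Phive-thriftserver) are the ones listed in the Spark 1.x build documentation; the CDH_VERSION variable and the spark_build_cmd helper are made-up names for illustration, and the 2.6.0 Hadoop base version is simply carried over from the example above. The script only prints the command. Running it for real also assumes the Cloudera Maven repository can be resolved from the build.

```shell
#!/bin/sh
# Sketch only: prints a Maven command for building Spark against CDH's Hadoop.
# CDH_VERSION and spark_build_cmd are illustrative names, not part of Spark or CDH.
CDH_VERSION="${CDH_VERSION:-5.4.0}"

spark_build_cmd() {
  # Assumes the Hadoop base version is 2.6.0, as in the example above.
  echo "build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh${CDH_VERSION} -Phive -Phive-thriftserver -DskipTests clean package"
}

# Print the command so it can be reviewed before being run from the Spark source tree.
spark_build_cmd
```

Given the Hive version mismatch mentioned earlier in the thread, a build like this may still fail to compile or misbehave at run time.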
Created 03-03-2016 11:40 AM
Mr. Arnold, hope you're doing well...
I went through this on MapR. The only issues I had were with running in secure mode, though that might be specific to MapR. If you run into issues, see:
http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server
Chris Horvath
Created 06-03-2016 02:20 PM
Are there any plans to include the Spark Thrift Server natively in CDH Spark?
Created 06-09-2016 09:12 AM
I have finally managed to post instructions on how I am rebuilding Cloudera's Spark to include the thriftserver. The summary is that you would:
git clone https://github.com/cloudera/spark.git
cd spark
./make-distribution.sh -DskipTests \
  -Dhadoop.version=2.6.0-cdh5.7.0 \
  -Phadoop-2.6 \
  -Pyarn \
  -Phive -Phive-thriftserver \
  -Pflume-provided \
  -Phadoop-provided \
  -Phbase-provided \
  -Phive-provided \
  -Pparquet-provided
The post goes into all the details as well as provides a handy Vagrant environment in which to perform the build.
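Once a distribution with the Thrift server has been built, starting it and connecting to it might look roughly like the following. The start-thriftserver.sh and beeline scripts and the hive.server2.thrift.port setting come from the Spark SQL programming guide; the /opt/spark default, port 10001 (picked here only to avoid clashing with a HiveServer2 that may already be listening on 10000), and the function names are assumptions for illustration. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: prints the commands to start the Spark Thrift server and connect to it.
# The SPARK_HOME default, THRIFT_PORT, and the function names are illustrative assumptions.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
THRIFT_PORT="${THRIFT_PORT:-10001}"

start_cmd() {
  # yarn-client is the Spark 1.x spelling of YARN client mode.
  echo "${SPARK_HOME}/sbin/start-thriftserver.sh --master yarn-client --hiveconf hive.server2.thrift.port=${THRIFT_PORT}"
}

connect_cmd() {
  # beeline ships with Spark; the JDBC URL points at the Thrift server started above.
  echo "${SPARK_HOME}/bin/beeline -u jdbc:hive2://localhost:${THRIFT_PORT}"
}

start_cmd
connect_cmd
```

A BI tool such as Tableau would then point at the same host and port through its Spark SQL or Hive connector.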
Created 06-30-2016 11:53 AM
There is a request to add the Spark Thrift Server: https://issues.cloudera.org/browse/DISTRO-817
Please vote it up if you want to see it in CDH.
Created 12-13-2016 08:07 AM
Thanks. I also voted for this feature. Not only Tableau, but also Excel and other apps would benefit from it.
But isn't using Hive on Spark a possible workaround?
It is available from CDH 5.7 on and works like a charm.
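A minimal sketch of that workaround, assuming a regular HiveServer2 on its default port 10000: the session switches the execution engine with the standard hive.execution.engine property and then runs a query. The HS2_HOST variable, the my_table table, and the hos_query_cmd helper are hypothetical placeholders, and the script only prints the beeline command.

```shell
#!/bin/sh
# Sketch only: prints a beeline command that runs a query through Hive on Spark.
# HS2_HOST, my_table, and hos_query_cmd are hypothetical placeholders.
HS2_HOST="${HS2_HOST:-localhost}"

hos_query_cmd() {
  # hive.execution.engine=spark asks Hive to run the query on Spark for this session.
  echo "beeline -u jdbc:hive2://${HS2_HOST}:10000 -e 'set hive.execution.engine=spark; select count(*) from my_table;'"
}

hos_query_cmd
```

Tools like Tableau or Excel would connect to HiveServer2 through their usual Hive ODBC driver in this setup, so no Spark Thrift server is needed.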