Is sharing a Spark RDD or context supported in HDP 2.5?

avatar
Master Guru

Is sharing a Spark RDD or context supported in HDP 2.5 via the Livy server? Everything I see goes through the Zeppelin interpreter to Livy. I want to know whether, strictly using Spark (not Zeppelin), sharing a Spark RDD or context is supported.

1 ACCEPTED SOLUTION

avatar

@Sunile Manjee It is not supported in HDP 2.5; I confirmed that yesterday with @vshukla.


13 REPLIES

avatar

@Sunile Manjee It is not supported in HDP 2.5; I confirmed that yesterday with @vshukla.

avatar
Master Guru

@azeltov Does that include the Livy server?

avatar

Correct. The Livy server is only supported as a Zeppelin integration, not via direct REST API calls to Livy.

avatar

@Sunile Manjee

Consider how Spark applications run: the driver runs either on the client or in a YARN container. If multiple users want the same Spark application instance to do multiple things, they need an interface for communicating that to the driver.

Livy is the out-of-the-box REST interface that shares a single Spark application by presenting its control interface to external users.
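To make that concrete, here is a rough sketch of driving a shared Livy session over REST with Python's requests library. Note that, per the answers above, calling Livy's REST API directly is not an officially supported path in HDP 2.5; the host, port, and file paths below are assumptions for illustration only.

```python
# Hypothetical sketch: two REST callers sharing one Livy session (one SparkContext).
import json, time, requests

LIVY = "http://livy-host:8998"                     # assumed Livy endpoint
headers = {"Content-Type": "application/json"}

# One long-lived interactive session corresponds to one shared SparkContext.
session = requests.post(LIVY + "/sessions",
                        data=json.dumps({"kind": "pyspark"}),
                        headers=headers).json()
sid = session["id"]

# Wait for the session's SparkContext to come up before submitting statements.
while requests.get("%s/sessions/%d" % (LIVY, sid), headers=headers).json()["state"] != "idle":
    time.sleep(1)

def run(code):
    """Submit a statement to the shared session and poll until it finishes."""
    st = requests.post("%s/sessions/%d/statements" % (LIVY, sid),
                       data=json.dumps({"code": code}), headers=headers).json()
    url = "%s/sessions/%d/statements/%d" % (LIVY, sid, st["id"])
    while True:
        st = requests.get(url, headers=headers).json()
        if st["state"] == "available":
            return st["output"]
        time.sleep(1)

# User A caches an RDD inside the shared context ...
run("rdd = sc.textFile('/shared/events.txt').cache(); rdd.count()")
# ... and a later call from user B reuses the very same cached RDD.
print(run("rdd.filter(lambda line: 'ERROR' in line).count()"))
```

The point of the sketch is that both run() calls land in the same Livy session, and therefore the same SparkContext, so the RDD cached by the first call is visible to the second.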

If you do not want to use Livy but still want to share a Spark context, you need to build an external means of communicating with the shared driver. One solution might be to have the driver periodically pull new queries from a database or from files on disk. This functionality is not built into Spark, but it could be implemented with a while loop and a sleep statement (see the sketch below).
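Here is a minimal sketch of that polling approach, assuming the Spark 1.6 APIs shipped with HDP 2.5; the directory layout, file naming, and poll interval are illustrative assumptions, not part of any product.

```python
# Long-running driver that holds one shared SQLContext and polls a local
# "inbox" directory for *.sql files dropped in by users.
import glob, os, time
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(conf=SparkConf().setAppName("shared-driver"))
sqlContext = SQLContext(sc)

# Cache the data once; every later query hits the same in-memory table.
sqlContext.read.json("/shared/events.json").registerTempTable("events")
sqlContext.cacheTable("events")

while True:                                        # the while loop + sleep from the text
    for path in glob.glob("/shared/queries/*.sql"):
        with open(path) as f:
            query = f.read()
        result = sqlContext.sql(query)             # runs against the shared context
        result.write.json("/shared/results/" + os.path.basename(path) + ".out")
        os.remove(path)                            # mark the query as handled
    time.sleep(5)                                  # poll every 5 seconds
```

Each dropped .sql file is executed against the one long-lived SQLContext, so every query can reuse the cached events table.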

avatar

*Edit* Realistically, questions about shared SparkContexts are often about one of two things:

1. Making shared use of cached DataFrames/DataSets

Livy and the Spark Thrift JDBC/ODBC server are decent initial solutions. Keep an eye on the Spark-LLAP integration, which will be better all around (security, efficiency, etc.).

2. Problems with Spark applications consuming all of a cluster's resources.

Spark's ability to spin up and spin down executor instances dynamically based on utilization is probably a better solution to this problem than sharing a single Spark context (see the sketch below).
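For reference, here is a sketch of enabling dynamic allocation from the application side; the property names are standard Spark settings, while the app name and values are illustrative assumptions.

```python
# Let one application grow and shrink its executors with load instead of
# sharing a single context across users.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("elastic-app")
        .set("spark.dynamicAllocation.enabled", "true")      # scale executors with load
        .set("spark.shuffle.service.enabled", "true")        # required for dynamic allocation on YARN
        .set("spark.dynamicAllocation.minExecutors", "1")
        .set("spark.dynamicAllocation.maxExecutors", "20"))  # cap the application's footprint

sc = SparkContext(conf=conf)
```

With this in place, an idle application releases executors back to YARN instead of holding the whole cluster, which addresses the resource problem without any context sharing.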

avatar
Expert Contributor

Does the Spark Thrift Server in HDP support sharing RDDs today?

avatar
Master Guru

@Randy Gelhausen Is Spark RDD and context sharing supported in HDP 2.5 via the Livy server?

avatar

yes

avatar
Super Guru

@Sunile Manjee

No without Livy; yes with Livy (per @vshukla). However, it is exposed only through Zeppelin for now.

Code examples: https://github.com/romainr/hadoop-tutorials-examples/tree/master/notebook/shared_rdd