Is sharing a Spark RDD or context supported in HDP 2.5 via the Livy server? Everything I see goes through the Zeppelin interpreter to Livy. I want to know whether sharing an RDD or context is supported when using Spark directly (not Zeppelin).
Consider how Spark applications run: the driver runs either on the client or in a YARN container. If multiple users want the same Spark application instance to do multiple things, they need an interface for communicating that to the driver.
Livy is the out-of-the-box REST interface that shares a single Spark application by exposing its control interface to external users.
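To make that concrete, Livy's REST API creates a session (a long-running Spark application) via `POST /sessions`, and then multiple clients submit code against that shared session via `POST /sessions/{id}/statements`. A minimal sketch of the request payloads (the host, port, and code snippet are illustrative, not taken from any real deployment):

```python
import json

# Illustrative Livy endpoint; Livy listens on port 8998 by default.
livy_url = "http://livy-host:8998"

# POST /sessions -- start a shared Spark application.
create_session = json.dumps({"kind": "spark"})

# POST /sessions/{id}/statements -- run code against the shared SparkContext.
# Every client that posts statements to the same session id shares the
# same driver, and therefore the same cached RDDs/DataFrames.
run_statement = json.dumps({"code": "sc.parallelize(1 to 10).count()"})

print(create_session)
print(run_statement)
```

Sending these bodies with any HTTP client (e.g. `curl -X POST -H "Content-Type: application/json"`) is all that is required; no Zeppelin involvement is needed for the REST API itself.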
If you do not want to use Livy but still want to share a Spark context, you need to build an external means of communicating with the shared driver. One solution is to have the driver periodically pull new queries from a database or from files on disk. This functionality is not built into Spark, but it could be implemented with a while loop and a sleep statement.
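A minimal sketch of that polling pattern, with the Spark call replaced by a stub (the queue file name and `handle_query` function are hypothetical; in a real shared driver `handle_query` would call into the long-lived SparkContext/SparkSession):

```python
import os
import time

QUERY_FILE = "pending_queries.txt"  # hypothetical drop-box for incoming queries

def handle_query(query: str) -> str:
    # Stub: a real implementation would run the query against the shared
    # context, e.g. spark.sql(query), and write results somewhere visible
    # to the submitting user.
    return f"ran: {query}"

def poll_for_queries(max_iterations: int = 3, interval_s: float = 0.1):
    """Driver-side loop: periodically pull new queries and execute them."""
    results = []
    for _ in range(max_iterations):
        if os.path.exists(QUERY_FILE):
            with open(QUERY_FILE) as f:
                queries = [line.strip() for line in f if line.strip()]
            os.remove(QUERY_FILE)  # consume this batch of queries
            results.extend(handle_query(q) for q in queries)
        time.sleep(interval_s)
    return results
```

Because the loop runs inside the driver process, every query it picks up sees the same cached RDDs and DataFrames.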
*Edit:* Realistically, questions about shared SparkContexts usually come down to one of two things:
1. Making shared use of cached DataFrames/Datasets.
Livy and the Spark Thrift JDBC/ODBC server are decent initial solutions. Keep an eye on Spark-LLAP integration, which should be better all around (security, efficiency, etc.).
2. Problems with Spark applications consuming all of a cluster's resources.
Spark's ability to spin executor instances up and down dynamically based on utilization (dynamic allocation) is probably a better solution to this problem than sharing a single Spark context.
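For reference, dynamic allocation is enabled through standard Spark configuration; the executor bounds below are illustrative values, not recommendations:

```
spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=20
```

With this in place, each user can run their own application and the cluster reclaims idle executors instead of one long-lived application holding all resources.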
No without Livy; yes with Livy (per @vshukla). However, it is currently exposed only through Zeppelin.