Post a Spark Job as JAR via Livy interactive (session) REST interface

I want to implement a Spark/Livy use case in which several users share a single Spark context and therefore have access to the same pre-cached RDDs. The following scenario illustrates it:

  1. A process starts a new Livy session, reads some data into an RDD, and caches it.
  2. User A wants to work in the same context to access this pre-cached RDD.
  3. User B has another application that can also use this cached RDD.
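
To make the scenario concrete, here is a rough sketch of how it might look with plain statements sent to one shared session over REST. The host and port, the HDFS path, and the Scala snippets are assumptions made up for illustration. Whether two different users may actually post to the same session also depends on how Livy's access control and impersonation are configured. The session ID would come from the `id` field of the response to `POST /sessions`, and the session has to reach the `idle` state before statements are executed.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: Livy on its default port

# Hypothetical Scala snippets; `sc` is the SparkContext that Livy predefines
# inside an interactive session. The HDFS path is invented for illustration.
CACHE_CODE = 'val shared = sc.textFile("hdfs:///data/input.txt").cache(); shared.count()'
USER_A_CODE = 'shared.filter(_.contains("error")).count()'
USER_B_CODE = "shared.map(_.length).sum()"


def create_session_request(livy_url=LIVY_URL):
    """Step 1: URL and JSON body for creating an interactive Scala session."""
    return f"{livy_url}/sessions", {"kind": "spark"}


def shared_session_plan(session_id, livy_url=LIVY_URL):
    """Steps 1-3: every caller posts to the statements URL of the SAME session."""
    url = f"{livy_url}/sessions/{session_id}/statements"
    return [
        ("loader", url, {"code": CACHE_CODE}),   # reads and caches the RDD
        ("user_a", url, {"code": USER_A_CODE}),  # reuses the cached RDD
        ("user_b", url, {"code": USER_B_CODE}),  # reuses it from another client
    ]


def post_json(url, payload):
    """Send one request to Livy; needs a reachable server, so it is not called here."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The point of the sketch is that `shared` is defined once in the session's interpreter and later statements refer to it by name. It only covers code sent as text, not code packaged in a JAR.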

The only way I have found so far is through Livy's Java/Scala API: https://livy.incubator.apache.org/examples/

The alternative, the REST API, supports only:

  • running a Spark application using a JAR file in batch mode (URL /batches)
  • running separate lines of code in session mode (URL /sessions)
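
For comparison, the request bodies for these two modes might look roughly like the sketch below. The field names (`file`, `className`, `args`, `code`) are the ones used by Livy's REST API. The JAR path, class name, arguments, and host are hypothetical placeholders.

```python
import json

LIVY_URL = "http://localhost:8998"  # assumption: Livy on its default port


def batch_request(jar_path, class_name, args=(), livy_url=LIVY_URL):
    """Batch mode: the JAR runs as its own Spark application with its own context."""
    body = {"file": jar_path, "className": class_name, "args": list(args)}
    return f"{livy_url}/batches", body


def statement_request(session_id, code, livy_url=LIVY_URL):
    """Session mode: a code snippet runs inside an already existing session."""
    return f"{livy_url}/sessions/{session_id}/statements", {"code": code}


if __name__ == "__main__":
    # Hypothetical JAR, class, and arguments, purely for illustration.
    requests_to_show = [
        batch_request("hdfs:///jobs/my-job.jar", "com.example.MyJob", ["2019-01-01"]),
        statement_request(0, "sc.parallelize(1 to 10).sum()"),
    ]
    for url, body in requests_to_show:
        print("POST", url)
        print(json.dumps(body, indent=2))
```

The batch body can name a JAR and a main class but carries no session ID, while the statement body targets a session but only carries source code. That is the gap described in this post.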

Because I am calling Livy from an R application, I need to start these Spark applications (packaged as JARs) via a POST request to Livy's /sessions URL, since I need the session context to share the RDD.

I have read about a /sessions/<id>/submit-job URL, but I don't know how to use it, as it is not documented anywhere.

Can someone help?
