11-22-2017 03:26 AM
Sharing objects between different spark-submit jobs is not currently possible. It would help a great deal, though, if you could describe your use case in as much detail as possible, including the problem you are trying to solve by sharing DataFrames.
My understanding is that if the data changes infrequently and caching is a must-have, you can use HDFS caching. If the data changes often (that is, records are constantly being updated) and has to be shared among many different applications, use Kudu. Kudu already has basic caching capabilities: frequently read subsets of data are cached automatically.
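For the HDFS caching route, here is a rough sketch of what the setup could look like, written as a small Python helper that builds the `hdfs cacheadmin` commands. The pool name `shared_pool` and the path `/data/shared/lookup` are made up for illustration, and the helper functions are my own, not part of any Hadoop API. By default it only prints the commands, so you can check them before running anything against a real cluster.

```python
import subprocess


def cache_pool_cmd(pool):
    """Build the command that creates an HDFS cache pool."""
    return ["hdfs", "cacheadmin", "-addPool", pool]


def cache_directive_cmd(path, pool):
    """Build the command that asks HDFS to cache a path in a given pool."""
    return ["hdfs", "cacheadmin", "-addDirective", "-path", path, "-pool", pool]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if dry_run:
        return None
    # Requires the hdfs CLI on PATH and sufficient HDFS privileges.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical pool name and path, chosen only for this example.
    run(cache_pool_cmd("shared_pool"))
    run(cache_directive_cmd("/data/shared/lookup", "shared_pool"))
```

Once a directive is in place, every Spark application that reads that path goes through the same cached blocks on the DataNodes, so nothing has to be handed from one job to another. Whether this actually helps depends on your data size and how much memory the DataNodes have set aside for caching.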
There was an earlier thread a while back along the same lines. Some options you could explore (though unsupported) are Spark-JobServer and Tachyon (since renamed Alluxio). However, I have not used them and can't comment beyond the references.