05-28-2015 10:18 AM
Is there any way to run multiple instances of the pyspark console using the --master yarn-client option?
The second console only starts working after I kill the first one.
My actual goal is to configure IPython Notebook as described here (http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/), but when I open another notebook and it creates another kernel, this launches another "pyspark console".
I think if I figure out how to solve the first problem, the second (actual problem) will be solved too.
06-02-2015 07:06 PM
You can start multiple pyspark shells on one host under the same user name. The shell, just as with the Scala shell, will find an unused port and allow you to do what is needed.
There is no limitation on the pyspark side you need to work around.
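To illustrate what "find an unused port" means, here is a minimal sketch of that kind of probing in plain Python. It is not Spark's actual code. The function name find_free_port is made up for this example, and the defaults (start at 4040, 16 retries) are assumptions chosen to mirror Spark's usual UI port and spark.port.maxRetries default.

```python
import socket


def find_free_port(start=4040, max_retries=16, host="127.0.0.1"):
    """Return the first port at or above `start` that can be bound.

    Hypothetical helper for illustration only; the defaults are assumptions.
    Tries `start`, then `start + 1`, and so on, up to `max_retries` extra ports.
    """
    for port in range(start, start + max_retries + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is taken (for example by an earlier shell); try the next one.
                continue
            return port
    raise RuntimeError("no free port found in the probed range")


if __name__ == "__main__":
    print(find_free_port())
```

If a first shell already holds the starting port, a second call simply ends up on a later one, which is why two shells on the same host do not collide on ports.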
I am not sure how the notebook needs to be configured to allow multiple instances to run at once.
09-08-2017 10:10 AM
You need to copy hive-site.xml to SPARK_HOME to make sure you are not using Derby. Even then, it still blocks and does not allow multiple sessions, so some problem remains.