We are having problems running PySpark scripts in parallel from Jupyter. We currently have a team of 3 people who need to write code in Jupyter at the same time, each working on a different script.
The first person to connect creates a Spark context successfully through a Livy session. The people who connect afterwards can also create contexts, but the server becomes slow and the following message appears:
When we check YARN, only 1 Livy session is shown as running; the later sessions do not appear.
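For anyone trying to reproduce the check, here is a minimal sketch of how the sessions known to Livy could be listed and compared with what YARN shows. It assumes Livy's REST API is reachable over plain HTTP without authentication; the host name `livy-host` is a placeholder and 8998 is only Livy's default port, so both may differ on our cluster.

```python
import json
import urllib.request

# Placeholder address: replace with the real Livy host and port.
LIVY_URL = "http://livy-host:8998"


def fetch_sessions(base_url=LIVY_URL):
    """Return the parsed JSON body of Livy's GET /sessions endpoint."""
    with urllib.request.urlopen(base_url + "/sessions") as response:
        return json.loads(response.read().decode("utf-8"))


def summarise_sessions(payload):
    """Reduce a GET /sessions payload to (id, state, appId, owner) tuples.

    A session whose appId is None has not been given a YARN application,
    which would explain why it does not show up in the YARN UI.
    """
    return [
        (s.get("id"), s.get("state"), s.get("appId"), s.get("owner"))
        for s in payload.get("sessions", [])
    ]


def sessions_without_yarn_app(payload):
    """Return the ids of sessions that have no YARN application id yet."""
    return [sid for sid, _, app_id, _ in summarise_sessions(payload) if app_id is None]
```

Calling `sessions_without_yarn_app(fetch_sessions())` would then list the sessions that Livy has accepted but that YARN has not started, which seems to be the situation described above.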
Our cluster has 128 GB of RAM, which should be enough for at least 2 parallel sessions, yet that is not currently possible. At the moment we all use the same user account to access Ambari.
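To make the memory question concrete, here is a rough sketch of the arithmetic behind "how many sessions fit". All the numbers in the example are hypothetical: we have not stated our actual driver and executor settings, and the memory YARN can hand out is normally less than the physical 128 GB. The overhead rule used (the larger of 10% of the heap or 384 MB per container) is Spark's default; rounding to YARN's minimum allocation size is ignored.

```python
def session_memory_gb(driver_gb, executor_gb, num_executors,
                      overhead_fraction=0.10, min_overhead_gb=0.384):
    """Estimate the YARN memory one Spark session requests, in GB.

    Each container asks for its heap plus an off-heap overhead, which by
    default is max(10% of the heap, 384 MB).
    """
    def with_overhead(heap_gb):
        return heap_gb + max(heap_gb * overhead_fraction, min_overhead_gb)

    return with_overhead(driver_gb) + num_executors * with_overhead(executor_gb)


def max_parallel_sessions(yarn_memory_gb, per_session_gb):
    """Number of whole sessions that fit into the memory YARN can allocate."""
    return int(yarn_memory_gb // per_session_gb)


# Hypothetical settings: 4 GB driver, 4 executors of 8 GB each,
# and 100 GB of the 128 GB actually available to YARN.
per_session = session_memory_gb(driver_gb=4, executor_gb=8, num_executors=4)
print(round(per_session, 1), max_parallel_sessions(100, per_session))
```

With these assumed settings a single session would request roughly 40 GB, so only 2 sessions would fit; a session configured with more or larger executors could occupy most of the cluster on its own.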
1) How can we let 3 people work in Jupyter at the same time?
2) Is 128 GB of RAM enough for this workload?
3) If each person had their own user account, could this work be done in parallel?