Support Questions
Find answers, ask questions, and share your expertise

How to set up PYSPARK_PYTHON from JupyterLab


I have a use case where we run some PySpark code from a notebook (Python kernel connecting to PySpark in yarn-client mode) and we need the executors to use a specific conda environment. For some reason, the PYSPARK_PYTHON setting seems to be ignored when running in Jupyter (a corporate JupyterLab instance). If the notebook is exported as a .py file and run via Livy, the configuration works fine.

I have already tried many things (none worked):

  • setting the following at the beginning of the notebook: os.environ['PYSPARK_PYTHON'] = ""
  • adding it at Spark session creation: .config("spark.yarn.appMasterEnv.PYSPARK_PYTHON", test_path)
  • adding it at Spark session creation: .config("spark.pyspark.python", test_path)
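For reference, here is a minimal sketch combining the attempts above. The interpreter path is hypothetical (adjust to your cluster); the key detail is that PYSPARK_PYTHON must be exported before any SparkContext exists, because once the JVM gateway is up, changes to os.environ are no longer picked up. The SparkSession construction itself is left commented out since it requires a live YARN cluster:

```python
import os

# Hypothetical path to the conda environment's interpreter on the worker nodes
conda_python = "/opt/conda/envs/myenv/bin/python"

# Must be set BEFORE any SparkContext is created; setting it afterwards
# in an already-running kernel has no effect.
os.environ["PYSPARK_PYTHON"] = conda_python

# Equivalent Spark confs that can be passed at session creation:
spark_confs = {
    # executor-side environment variable (relevant in yarn-client mode)
    "spark.executorEnv.PYSPARK_PYTHON": conda_python,
    # application-master environment (relevant in yarn-cluster mode)
    "spark.yarn.appMasterEnv.PYSPARK_PYTHON": conda_python,
    # generic conf (Spark >= 2.1); documented to take precedence
    # over the PYSPARK_PYTHON environment variable
    "spark.pyspark.python": conda_python,
}

# Requires a live YARN cluster, so shown here as illustration only:
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.master("yarn")
# for key, value in spark_confs.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```

Note that if a SparkSession already exists in the kernel (as is common behind a corporate JupyterLab setup), .getOrCreate() returns the existing session and silently ignores the new confs, which would match the symptoms described above.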

I saw some other people setting it via a CLI environment variable:
export PYSPARK_PYTHON= the path
and then launching Jupyter Notebook, but I cannot do that because I use a corporate JupyterLab instance. I tried adding it there as well, but it did not work. Reference to the SO question:

Does anybody know how I should specify PYSPARK_PYTHON from JupyterLab so that it uses a specific conda environment?