<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Cannot get pyspark to work (Creating Spark Context) with FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit' in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/351214#M236195</link>
<description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am using the Cloudera Hortonworks sandbox Docker image, and have followed this tutorial to run Jupyter notebooks: &lt;A href="https://community.cloudera.com/t5/Support-Questions/Installing-Jupyter-on-sandbox/td-p/201683" target="_self"&gt;https://community.cloudera.com/t5/Support-Questions/Installing-Jupyter-on-sandbox/td-p/201683&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This works. The notebook is started using the Python kernel. The error occurs when attempting to create the SparkContext:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit'&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;FileNotFoundError                         Traceback (most recent call last)
&amp;lt;ipython-input-4-fbb9eeb69493&amp;gt; in &amp;lt;module&amp;gt;
----&amp;gt; 1 spark = SparkSession.builder.master("local").appName("myApp").getOrCreate()

/usr/local/lib/python3.6/site-packages/pyspark/sql/session.py in getOrCreate(self)
    226                             sparkConf.set(key, value)
    227                         # This SparkContext may be an existing one.
--&amp;gt; 228                         sc = SparkContext.getOrCreate(sparkConf)
    229                     # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    230                     # by all sessions.

/usr/local/lib/python3.6/site-packages/pyspark/context.py in getOrCreate(cls, conf)
    390         with SparkContext._lock:
    391             if SparkContext._active_spark_context is None:
--&amp;gt; 392                 SparkContext(conf=conf or SparkConf())
    393             return SparkContext._active_spark_context
    394 

/usr/local/lib/python3.6/site-packages/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    142                 " is not allowed as it is a security risk.")
    143 
--&amp;gt; 144         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145         try:
    146             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/usr/local/lib/python3.6/site-packages/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    337         with SparkContext._lock:
    338             if not SparkContext._gateway:
--&amp;gt; 339                 SparkContext._gateway = gateway or launch_gateway(conf)
    340                 SparkContext._jvm = SparkContext._gateway.jvm
    341 

/usr/local/lib/python3.6/site-packages/pyspark/java_gateway.py in launch_gateway(conf, popen_kwargs)
     96                     signal.signal(signal.SIGINT, signal.SIG_IGN)
     97                 popen_kwargs['preexec_fn'] = preexec_func
---&amp;gt; 98                 proc = Popen(command, **popen_kwargs)
     99             else:
    100                 # preexec_fn not supported on Windows

/usr/lib64/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    727                                 c2pread, c2pwrite,
    728                                 errread, errwrite,
--&amp;gt; 729                                 restore_signals, start_new_session)
    730         except:
    731             # Cleanup if the child failed starting.

/usr/lib64/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1362                         if errno_num == errno.ENOENT:
   1363                             err_msg += ': ' + repr(err_filename)
-&amp;gt; 1364                     raise child_exception_type(errno_num, err_msg, err_filename)
   1365                 raise child_exception_type(err_msg)
   1366 

FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit': '/usr/hdp/current/spark-client/./bin/spark-submit'&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suspect the problem is connected to the environment variables, but as a novice I am not sure.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Global environment (from printenv):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;HOSTNAME=sandbox-hdp.hortonworks.com
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
SHLVL=1
HOME=/root
container=docker
_=/usr/bin/printenv&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;start_jupyter.sh&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;export SPARK_HOME=/usr/hdp/current/spark-client
export HADOOP_HOME=/usr/hdp/current/hadoop-client
export HADOOP_CONF_DIR=/usr/hdp/current/hadoop-client/conf
export PYTHONPATH="/usr/hdp/current/spark-client/python:/usr/hdp/current/spark-client/python/lib/py4j-0.9-src.zip"
export PYTHONSTARTUP=/usr/hdp/current/spark-client/python/pyspark/shell.py
export PYSPARK_SUBMIT_ARGS="--master yarn-client pyspark-shell"&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone point me in the right direction so that I can create the SparkContext?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks&lt;/P&gt;</description>
    <pubDate>Wed, 31 Aug 2022 12:44:38 GMT</pubDate>
    <dc:creator>Boron</dc:creator>
    <dc:date>2022-08-31T12:44:38Z</dc:date>
    <item>
      <title>Cannot get pyspark to work (Creating Spark Context) with FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit'</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/351214#M236195</link>
<description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am using the Cloudera Hortonworks sandbox Docker image, and have followed this tutorial to run Jupyter notebooks: &lt;A href="https://community.cloudera.com/t5/Support-Questions/Installing-Jupyter-on-sandbox/td-p/201683" target="_self"&gt;https://community.cloudera.com/t5/Support-Questions/Installing-Jupyter-on-sandbox/td-p/201683&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This works. The notebook is started using the Python kernel. The error occurs when attempting to create the SparkContext:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit'&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;FileNotFoundError                         Traceback (most recent call last)
&amp;lt;ipython-input-4-fbb9eeb69493&amp;gt; in &amp;lt;module&amp;gt;
----&amp;gt; 1 spark = SparkSession.builder.master("local").appName("myApp").getOrCreate()

/usr/local/lib/python3.6/site-packages/pyspark/sql/session.py in getOrCreate(self)
    226                             sparkConf.set(key, value)
    227                         # This SparkContext may be an existing one.
--&amp;gt; 228                         sc = SparkContext.getOrCreate(sparkConf)
    229                     # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    230                     # by all sessions.

/usr/local/lib/python3.6/site-packages/pyspark/context.py in getOrCreate(cls, conf)
    390         with SparkContext._lock:
    391             if SparkContext._active_spark_context is None:
--&amp;gt; 392                 SparkContext(conf=conf or SparkConf())
    393             return SparkContext._active_spark_context
    394 

/usr/local/lib/python3.6/site-packages/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    142                 " is not allowed as it is a security risk.")
    143 
--&amp;gt; 144         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145         try:
    146             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/usr/local/lib/python3.6/site-packages/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    337         with SparkContext._lock:
    338             if not SparkContext._gateway:
--&amp;gt; 339                 SparkContext._gateway = gateway or launch_gateway(conf)
    340                 SparkContext._jvm = SparkContext._gateway.jvm
    341 

/usr/local/lib/python3.6/site-packages/pyspark/java_gateway.py in launch_gateway(conf, popen_kwargs)
     96                     signal.signal(signal.SIGINT, signal.SIG_IGN)
     97                 popen_kwargs['preexec_fn'] = preexec_func
---&amp;gt; 98                 proc = Popen(command, **popen_kwargs)
     99             else:
    100                 # preexec_fn not supported on Windows

/usr/lib64/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    727                                 c2pread, c2pwrite,
    728                                 errread, errwrite,
--&amp;gt; 729                                 restore_signals, start_new_session)
    730         except:
    731             # Cleanup if the child failed starting.

/usr/lib64/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1362                         if errno_num == errno.ENOENT:
   1363                             err_msg += ': ' + repr(err_filename)
-&amp;gt; 1364                     raise child_exception_type(errno_num, err_msg, err_filename)
   1365                 raise child_exception_type(err_msg)
   1366 

FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit': '/usr/hdp/current/spark-client/./bin/spark-submit'&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suspect the problem is connected to the environment variables, but as a novice I am not sure.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Global environment (from printenv):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;HOSTNAME=sandbox-hdp.hortonworks.com
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
SHLVL=1
HOME=/root
container=docker
_=/usr/bin/printenv&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;start_jupyter.sh&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;export SPARK_HOME=/usr/hdp/current/spark-client
export HADOOP_HOME=/usr/hdp/current/hadoop-client
export HADOOP_CONF_DIR=/usr/hdp/current/hadoop-client/conf
export PYTHONPATH="/usr/hdp/current/spark-client/python:/usr/hdp/current/spark-client/python/lib/py4j-0.9-src.zip"
export PYTHONSTARTUP=/usr/hdp/current/spark-client/python/pyspark/shell.py
export PYSPARK_SUBMIT_ARGS="--master yarn-client pyspark-shell"&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone point me in the right direction so that I can create the SparkContext?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks&lt;/P&gt;</description>
      <pubDate>Wed, 31 Aug 2022 12:44:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/351214#M236195</guid>
      <dc:creator>Boron</dc:creator>
      <dc:date>2022-08-31T12:44:38Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot get pyspark to work (Creating Spark Context) with FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit'</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/352976#M236621</link>
<description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100158"&gt;@Boron&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could you please set the SPARK_HOME environment variable as shown below before creating the Spark session?&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;import os
os.environ['SPARK_HOME'] = '/usr/hdp/current/spark-client'&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Reference:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;A href="https://stackoverflow.com/questions/55569985/pyspark-could-not-find-valid-spark-home" target="_blank"&gt;https://stackoverflow.com/questions/55569985/pyspark-could-not-find-valid-spark-home&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://stackoverflow.com/questions/40087188/cant-find-spark-submit-when-typing-spark-shell" target="_blank"&gt;https://stackoverflow.com/questions/40087188/cant-find-spark-submit-when-typing-spark-shell&lt;/A&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Thu, 22 Sep 2022 05:22:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/352976#M236621</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2022-09-22T05:22:11Z</dc:date>
    </item>
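The reply above can be sketched end to end in the notebook itself. This is a minimal sketch, assuming the paths quoted in the thread; the SparkSession call is left commented out because it only succeeds against a working Spark install:

```python
# Sketch of the suggestion above: set SPARK_HOME before PySpark tries to
# launch spark-submit. The path is the one used in this thread; adjust it
# to wherever your Spark client actually lives.
import os

os.environ['SPARK_HOME'] = '/usr/hdp/current/spark-client'

# Only once SPARK_HOME points at a real Spark install should the session
# be created (commented out here, since it needs a working cluster):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.master("local").appName("myApp").getOrCreate()
```

Setting the variable from Python works because PySpark reads SPARK_HOME at gateway-launch time, so it must be set before the first SparkSession/SparkContext is created in the kernel.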
    <item>
      <title>Re: Cannot get pyspark to work (Creating Spark Context) with FileNotFoundError: [Errno 2] No such file or directory: '/usr/hdp/current/spark-client/./bin/spark-submit'</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/352999#M236624</link>
<description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100158"&gt;@Boron&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I believe you are using HDP 3.x. Note that Spark 1.x is not available in HDP 3; you need to use Spark 2.x. Set SPARK_HOME to the Spark 2 client:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;export SPARK_HOME=/usr/hdp/current/spark2-client&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 22 Sep 2022 06:38:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-get-pyspark-to-work-Creating-Spark-Context-with/m-p/352999#M236624</guid>
      <dc:creator>Deepan_N</dc:creator>
      <dc:date>2022-09-22T06:38:06Z</dc:date>
    </item>
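The two replies name different client directories for SPARK_HOME. A small hypothetical helper can pick whichever one actually exists on the sandbox, preferring the Spark 2 client on HDP 3.x; `pick_spark_home` and its injected `isdir` predicate are illustrative, not part of any Spark API:

```python
import os

# Hypothetical helper: prefer the Spark 2 client (HDP 3.x) and fall back
# to the Spark 1 path from the original post. The isdir predicate is
# injected so the selection logic can be exercised off-cluster.
def pick_spark_home(isdir=os.path.isdir):
    candidates = [
        '/usr/hdp/current/spark2-client',  # HDP 3.x (Spark 2.x)
        '/usr/hdp/current/spark-client',   # older HDP (Spark 1.x)
    ]
    for path in candidates:
        if isdir(path):
            return path
    # Neither directory found: default to the Spark 2 path, the expected
    # location on an HDP 3 sandbox.
    return candidates[0]

# In the notebook, before creating the session:
# os.environ['SPARK_HOME'] = pick_spark_home()
```

If the chosen directory is the Spark 2 client, the py4j zip on PYTHONPATH in start_jupyter.sh would also need to match that install rather than the py4j-0.9 build from Spark 1.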
  </channel>
</rss>

