<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question pyspark crashes when running locally but works on a cluster in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/pyspark-crashes-when-running-locally-but-works-on-a-cluster/m-p/51720#M33386</link>
    <description>&lt;P&gt;&lt;SPAN&gt;If one runs pyspark without arguments on the gateway node on which CM is installed, one gets:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;=========================&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;$ pyspark&lt;BR /&gt;Python 2.7.5 (default, Nov 20 2015, 02:00:19) &amp;nbsp;&lt;BR /&gt;[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2&lt;BR /&gt;Type "help", "copyright", "credits" or "license" for more information.&lt;BR /&gt;Error: Cluster deploy mode is not applicable to Spark shells.&lt;BR /&gt;Run with --help for usage help or --verbose for debug output&lt;BR /&gt;Traceback (most recent call last):&lt;BR /&gt;&amp;nbsp;File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/shell.py", line 43, in &amp;lt;module&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;sc = SparkContext(pyFiles=add_files)&lt;BR /&gt;&amp;nbsp;File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/context.py", line 112, in __init__&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;SparkContext._ensure_initialized(self, gateway=gateway)&lt;BR /&gt;&amp;nbsp;File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/context.py", line 245, in _ensure_initialized&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;SparkContext._gateway = gateway or launch_gateway()&lt;BR /&gt;&amp;nbsp;File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;raise Exception("Java gateway process exited before sending the driver its port number")&lt;BR /&gt;Exception: Java gateway process exited before sending the driver its port number&lt;BR /&gt;=========================&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, it works on the 
cluster:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;=========================&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;[ivy2@md01 ~]$ pyspark --master=yarn&lt;BR /&gt;Python 2.7.5 (default, Nov 20 2015, 02:00:19) &amp;nbsp;&lt;BR /&gt;[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2&lt;BR /&gt;Type "help", "copyright", "credits" or "license" for more information.&lt;BR /&gt;Setting default log level to "WARN".&lt;BR /&gt;To adjust logging level use sc.setLogLevel(newLevel).&lt;BR /&gt;17/03/02 14:54:04 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.&lt;BR /&gt;Welcome to&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;____ &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;__&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/ __/__ &amp;nbsp;___ _____/ /__&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;_\ \/ _ \/ _ `/ __/ &amp;nbsp;'_/&lt;BR /&gt;&amp;nbsp;&amp;nbsp;/__ / .__/\_,_/_/ /_/\_\ &amp;nbsp;&amp;nbsp;version 1.6.0&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/_/&lt;BR /&gt;&lt;BR /&gt;Using Python version 2.7.5 (default, Nov 20 2015 02:00:19)&lt;BR /&gt;SparkContext available as sc, HiveContext available as sqlContext.&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;=========================&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Any ideas why?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Is this warning OK: "WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041. "?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 11:11:04 GMT</pubDate>
    <dc:creator>IgorYakushin</dc:creator>
    <dc:date>2022-09-16T11:11:04Z</dc:date>
  </channel>
</rss>

