Spark on CDH6.2 - Issues

Hello Spark Users,

I have installed CDH 6.2 with an automated deployment and configured the default Spark service in our cluster. Can you please let me know how I should submit simple Python jobs on Spark?

We have a 12-node cluster. I configured the Spark History Server on one primary node and configured the remaining 11 nodes as Spark Gateway nodes. I see the message below when accessing the History Server UI.

 

Last updated: 2019-05-06 16:08:48

Client local time zone: America/Los_Angeles

No completed applications found!

Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory listed above and whether you have the permissions to access it.
It is also possible that your application did not run to completion or did not stop the SparkContext.
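
For anyone checking the same message, here is a minimal sketch of a consistency check between the directory applications write event logs to (`spark.eventLog.dir`, with `spark.eventLog.enabled` set to true) and the directory the History Server reads (`spark.history.fs.logDirectory`). The helper names `parse_spark_conf` and `check_event_log_dirs` and the sample paths are made up for illustration, and the comparison is a plain string match, so an `hdfs://...` URI and a bare path pointing at the same place would still be reported as different.

```python
import re

# Matches "key value", "key=value" or "key = value" lines.
_LINE = re.compile(r"^([^\s=]+)\s*[=\s]\s*(.*)$")


def parse_spark_conf(text):
    """Parse spark-defaults.conf style text into a dict (hypothetical helper)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _LINE.match(line)
        if match:
            conf[match.group(1)] = match.group(2).strip()
    return conf


def check_event_log_dirs(client_conf, history_log_dir):
    """Return a list of likely problems (empty list means nothing obvious).

    client_conf: dict parsed from the client's spark-defaults.conf
    history_log_dir: value of spark.history.fs.logDirectory as shown in the
        History Server UI
    """
    problems = []
    enabled = client_conf.get("spark.eventLog.enabled", "false").lower() == "true"
    if not enabled:
        problems.append("spark.eventLog.enabled is not true; no event logs are written")
    write_dir = client_conf.get("spark.eventLog.dir")
    if write_dir is None:
        problems.append("spark.eventLog.dir is not set in the client configuration")
    elif write_dir.rstrip("/") != history_log_dir.rstrip("/"):
        # Plain string comparison only; equivalent URIs are not normalized.
        problems.append(
            "event logs go to %s but the History Server reads %s"
            % (write_dir, history_log_dir)
        )
    return problems


if __name__ == "__main__":
    # Sample configuration text; the paths are illustrative, not taken from the cluster.
    sample = """
    # client settings
    spark.eventLog.enabled=true
    spark.eventLog.dir hdfs:///user/spark/applicationHistory
    """
    conf = parse_spark_conf(sample)
    for problem in check_event_log_dirs(conf, "hdfs:///user/spark/applicationHistory"):
        print(problem)
```

If the two directories agree and the page is still empty, the other causes named in the message remain: the application never finished, it did not stop its SparkContext, or the History Server user cannot read the directory.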

 

I configured Spark to run on the default YARN (MR2) service. Can you please let me know whether the default Spark works with CDH 6.2, and what I am missing here? Launching pyspark hangs after the log-level messages, and this is what I get when I interrupt it with Ctrl+C:

 

# pyspark
Python 2.7.5 (default, Apr  9 2019, 14:30:50)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
^CTraceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/shell.py", line 41, in <module>
    spark = SparkSession._create_shell_session()
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/sql/session.py", line 584, in _create_shell_session
    return SparkSession.builder\
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/sql/session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 349, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 118, in __init__
    conf, jsc, profiler_cls)
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 180, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 288, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1523, in __call__
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
  File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1152, in send_command
  File "/usr/lib64/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)


Thanks,
Chittu