Member since: 05-06-2019
Posts: 7
Kudos Received: 0
Solutions: 0
12-04-2019
10:17 AM
Do I need to put the NameNode in safe mode to execute this command, or can I run it on a live cluster? hadoop fs -setrep -w 3 -R /
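For what it's worth, a minimal sketch of the command in question: safe mode should not be needed, since in safe mode the HDFS namespace is read-only and replication changes would be rejected, so -setrep is meant to run against a live cluster. Note the plain ASCII hyphens in the flags.

```python
# Sketch: the -setrep command as it would be invoked; safe mode is not
# required (in safe mode the namespace is read-only, so replication
# changes would be refused).
import shlex

cmd = shlex.split("hadoop fs -setrep -w 3 -R /")
print(cmd)
# import subprocess; subprocess.run(cmd, check=True)  # on a node with the hadoop client
```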
12-02-2019
03:21 PM
Hi All, I need to write a Spark job's output files to an NFS mount point from the spark2 shell. Can you please let me know if there is any way to do it by giving an absolute path in the spark2 shell? Thanks, CS
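Not an authoritative answer, but one common approach: Spark can write to any path visible as a local filesystem by using the file:// scheme, provided the NFS share is mounted at the same path on the driver (and on every executor node, for cluster writes). A small sketch with a hypothetical mount point /mnt/nfs_share:

```python
# Sketch: build a file:// URI for a hypothetical NFS mount point. The share
# must be mounted at this same path on the driver and on all executor nodes.
from pathlib import Path

nfs_dir = Path("/mnt/nfs_share/spark_output")  # hypothetical mount point
output_uri = nfs_dir.as_uri()                  # "file:///mnt/nfs_share/spark_output"
print(output_uri)
# In spark2-shell / pyspark you would then write, for example:
#   df.write.mode("overwrite").csv(output_uri)
```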
Tags: nfs, Spark, spark-shell
12-02-2019
11:47 AM
Hi,
I want to write my Spark jobs' output files to an NFS share. Is there any way I can send the output files to an NFS mount point from the spark2 shell?
Many thanks for any pointers.
Regards, CS
Tags: Spark
08-06-2019
04:25 PM
I built Spark2 on CDH 5.16 and can submit Scala jobs with no issues. Now, when I launch pyspark2 and try to run a simple job, it throws the error below. Can you please suggest what is wrong? Also, what are the alternatives for submitting Python jobs to Spark apart from a Jupyter notebook? Please advise.
[I 23:08:33.864 NotebookApp] Adapting to protocol v5.1 for kernel f8d7200b-6718-49f6-86e9-c051fb6d84a6
[Stage 0:> (0 + 0) / 2]Exception in thread "dispatcher-event-loop-0" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
19/08/06 23:10:41 WARN cluster.YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
[Stage 0:> (0 + 0) / 2]19/08/06 23:10:47 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 2 for reason Container marked as failed: container_1565048178604_0033_01_000003 on host: ukvmlx-rdk-22.rms.com. Exit status: 1. Diagnostics: Exception from container-launch.
Thanks, CS
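One possible mitigation (an assumption on my part — the trace suggests the driver ran out of heap while serializing task data): raise the driver and executor memory before launching the shell. The sizes below are hypothetical and should be tuned to the cluster.

```python
# Sketch: raise driver/executor memory for the pyspark shell via the
# PYSPARK_SUBMIT_ARGS environment variable. The sizes here are hypothetical.
import os

os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--driver-memory 4g --executor-memory 4g --num-executors 2 pyspark-shell"
)
# Launching pyspark2 from this environment picks up the settings above.
```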
06-17-2019
04:14 PM
I am unable to launch Spark on YARN; the ResourceManager is not starting, and I am seeing the errors below. Can you please let me know what I am missing?
Spark error:
client.RMProxy: Connecting to ResourceManager at <IP>:8032
19/06/17 22:48:57 INFO ipc.Client: Retrying connect to server: <IP>:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/06/17 22:48:58 INFO ipc.Client: Ret
Resource Manager Error:
Error starting ResourceManager
java.lang.NullPointerException
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1039)
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1036)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1101)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1093)
    at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:1036)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:347)
    at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.serviceInit(ActiveStandbyElectorBasedElectorService.java:110)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:333)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:143)
05-09-2019
11:29 AM
Hi All,
I have installed CDH 6.2 with Spark up and running. Now I need to install Jupyter Notebook on my cluster, but it looks like Cloudera does not support Jupyter Notebook on its stack. Can you please let me know if there is any alternative, or another way to get it installed on the CDH 6.2 stack?
Many thanks for any pointers on this.
Thanks, Chittu
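One workaround worth trying (a sketch, not something Cloudera supports): pip-install Jupyter on a Spark gateway node and let it act as the pyspark driver via pyspark's standard driver-Python environment variables. The port below is hypothetical.

```python
# Sketch: use a pip-installed Jupyter as the pyspark driver. Assumes
# `pip install jupyter` has already been run on a Spark gateway node.
import os

os.environ["PYSPARK_DRIVER_PYTHON"] = "jupyter"
os.environ["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook --no-browser --port=8888"
# Running `pyspark` from this environment starts the shell inside a
# Jupyter notebook server instead of the plain REPL.
```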
05-06-2019
04:33 PM
Hello Spark Users,
I have installed CDH 6.2 with automated deployment and configured the default Spark in our cluster. Can you please let me know how I should submit simple Python jobs to Spark?
We have a 12-node cluster. I configured the Spark History Server on one primary node and configured the remaining 11 nodes as Spark gateway nodes. I am seeing the message below when accessing the History Server UI.
Last updated: 2019-05-06 16:08:48
Client local time zone: America/Los_Angeles
No completed applications found!
Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory listed above and whether you have the permissions to access it. It is also possible that your application did not run to completion or did not stop the SparkContext.
While launching pyspark I get the output below (I configured the default YARN (MR2) to run Spark).
Can you please let me know whether the default Spark works with CDH 6.2, and what I am missing here? Please help.
# pyspark
Python 2.7.5 (default, Apr 9 2019, 14:30:50)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
^CTraceback (most recent call last):
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/shell.py", line 41, in <module>
spark = SparkSession._create_shell_session()
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/sql/session.py", line 584, in _create_shell_session
return SparkSession.builder\
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/sql/session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 349, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 180, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/pyspark/context.py", line 288, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1523, in __call__
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
File "/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1152, in send_command
File "/usr/lib64/python2.7/socket.py", line 447, in readline
data = self._sock.recv(self._rbufsize)
Thanks, Chittu
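As a starting point for the "how do I submit a simple Python job" part of the question, a minimal smoke-test script can be written out and handed to spark-submit. The path and app name below are hypothetical; the spark-submit invocation is shown as a comment since it needs a gateway node.

```python
# Sketch: write a minimal PySpark smoke-test job to disk; submit it separately.
job = """\
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("smoke-test").getOrCreate()
print(spark.sparkContext.parallelize(range(100)).sum())
spark.stop()
"""
with open("/tmp/smoke_test.py", "w") as f:
    f.write(job)
# Then, on a gateway node:
#   spark-submit --master yarn --deploy-mode client /tmp/smoke_test.py
```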