
Spark using all resources on Cluster

Contributor

Hello,

I set a small script running on Spark (installed alongside the HDP suite, through Ambari). The script ran for more than 5 hours and eventually failed with "IOError: [Errno 28] No space left on device".

Whenever I start the pyspark shell or run a script with spark-submit, some warnings are shown that I suspect may point to the cause of the problem: it looks as if Spark may only be using the resources of the host where I run the client, and ignoring the rest of the cluster.

16/09/23 23:26:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/23 23:26:52 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.

Thanks in advance,

1 ACCEPTED SOLUTION

Super Guru

@Luis Valdeavellano

Those warnings have nothing to do with your issue, but it is good to fix them anyway.

1. If you don't want your job to use all the resources of the cluster, define a separate YARN queue for Spark jobs and submit the job to that queue (I assume you already submit your Spark jobs via YARN). Your job will still max out that queue, but the resources not assigned to it remain available to others. You still have your problem, but other users can keep running their jobs. See the sketch after this list for what such a submit command could look like.

2. Look at your job and determine why it is using so many resources; redesign it, tune it, break it into smaller pieces, etc. If the job is already well tuned, then your cluster simply does not have enough resources. Check resource usage during the execution of the job to find the bottleneck (RAM, CPU, etc.).
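For illustration, a submit command along these lines would target a dedicated queue and put an explicit cap on what one job can take. The queue name "spark", the script name and the executor numbers are only placeholders; use whatever your Capacity Scheduler configuration and workload actually call for:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue spark \
  --num-executors 4 \
  --executor-memory 4g \
  --executor-cores 2 \
  your_script.py

The executor flags also help with point 2: they make the job's footprint explicit, and you can watch the application in the YARN ResourceManager UI while it runs to see which resource it exhausts first.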


2 REPLIES

Expert Contributor

How are you launching the spark job?

FYI:

The warnings are just letting you know that you haven't set up the environment variables. You can fix that by checking your environment settings and correcting them.

I'm pretty sure something like the line below would remove the warnings.

export LD_LIBRARY_PATH=/usr/local/hadoop/lib/native/:$LD_LIBRARY_PATH 
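On an HDP install managed by Ambari the native libraries usually live under the HDP stack directory rather than /usr/local/hadoop, so the path probably needs adjusting; the line below is only a guess based on the default HDP layout:

export LD_LIBRARY_PATH=/usr/hdp/current/hadoop-client/lib/native:$LD_LIBRARY_PATH

To make it stick for all jobs, add the export to the hadoop-env / spark-env configuration templates through Ambari rather than to your personal shell profile.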
