
yarn spark container take a lot of time to create


New Contributor

When I run short jobs, the containers take more time to load than the actual job takes to run.

For example, it sometimes takes over 60 seconds to start my process, because a new container is generated for each core in the computer.


Is it possible to configure the NodeManager not to kill the container, and to reuse it when the same CPU/RAM is requested?

3 REPLIES

Re: yarn spark container take a lot of time to create

New Contributor

I found out that the container is reused only while the job is still active, and in my case it is reused between 3 and 10 times.

Can I force it to stay until its resources (CPU/RAM) are needed for a different CPU/RAM requirement by other jobs?

Re: yarn spark container take a lot of time to create

Super Mentor

@Ilia K

This indicates that your cluster might not have enough resources, or that you might be running some unwanted services on your cluster. Either increase the resources of your cluster nodes (such as RAM), or remove unwanted services from the cluster so that the containers can start a bit faster.

Re: yarn spark container take a lot of time to create

New Contributor

The server node has 32 GB RAM, and it only accepts spark-submit jobs (it does not act as a client/worker).

Each worker node is one of two server types:

16 cores with 64 GB, or 48 cores with 196 GB.

The worker nodes have only Metrics Monitor / NodeManager installed.

All the configuration is at its defaults.


When running a large job I don't mind the minute of hold-up, but a batch of short jobs should finish in under 1 minute. For example, 500 jobs (each taking 30 seconds on one core) should finish in under 1 minute when there is enough CPU/RAM to allocate.
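As a rough sanity check on these numbers, here is a minimal Python sketch of the arithmetic. The function name `min_cores_needed` and the `startup_delay_seconds` parameter are hypothetical and not part of YARN or Spark. It assumes each job uses exactly one core, jobs run back to back on a core, and the container startup delay is paid once per core.

```python
import math


def min_cores_needed(num_jobs, seconds_per_job, deadline_seconds,
                     startup_delay_seconds=0.0):
    """Smallest number of cores that lets all single-core jobs finish in time.

    Assumption: the startup delay is paid once per core, after which the
    core runs jobs back to back until the deadline.
    """
    usable_seconds = deadline_seconds - startup_delay_seconds
    if usable_seconds < seconds_per_job:
        # Not even one job fits in the time left after startup.
        return None
    jobs_per_core = int(usable_seconds // seconds_per_job)
    return math.ceil(num_jobs / jobs_per_core)


# 500 jobs of 30 seconds each, 60-second deadline, no startup delay.
print(min_cores_needed(500, 30, 60))        # 250

# Same workload with a 30-second container startup delay.
print(min_cores_needed(500, 30, 60, 30))    # 500

# With a 45-second delay, no single job can finish inside the minute.
print(min_cores_needed(500, 30, 60, 45))    # None
```

Under these assumptions, the one-minute target needs at least 250 free cores with no startup delay, and a 30-second delay doubles that to 500.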

I think the problem is the delay before the actual job starts: I can see the process start (by running top in a shell on the worker) 30-60 seconds after the submit is received. During that time I see some Java tasks, mainly related to the creation of the container.