Exit code -1: Invalid mount on running spark with docker and yarn

Hello everyone.

 

I am trying to train an ML model on a cluster using a Docker image and spark-submit on YARN.

I followed the same process earlier on a training cluster I built, and it worked there.

This time, however, YARN reports that one of the mounts is invalid.

I tried both with and without Kerberos. Both runs failed the same way, and neither showed any connection-related errors, so the connection does not appear to be the problem.

 

 

This is what I tried:

spark-submit \
--master yarn \
--deploy-mode cluster \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=image:v1 \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="/etc/passwd:/etc/passwd:ro,/etc/hadoop:/etc/hadoop:ro,/opt/cloudera/parcels/:/opt/cloudera/parcels/:ro,/data01/yarn/nm/:/data01/yarn/nm/:ro,/data02/yarn/nm/:/data02/yarn/nm/:ro,/data03/yarn/nm/:/data03/yarn/nm/:ro,/etc/krb5.conf:/etc/krb5.conf:ro" \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=image:v1 \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="/etc/passwd:/etc/passwd:ro,/etc/hadoop:/etc/hadoop:ro,/opt/cloudera/parcels/:/opt/cloudera/parcels/:ro,/data01/yarn/nm/:/data01/yarn/nm/:ro,/data02/yarn/nm/:/data02/yarn/nm/:ro,/data03/yarn/nm/:/data03/yarn/nm/:ro,/etc/krb5.conf:/etc/krb5.conf:ro" \
modeling.py

 

 

And these are the results I got:

 

2022-08-10 17:43:17,694 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1658823376901_2680 failed 2 times due to AM Container for appattempt_1658823376901_2680_000002 exited with exitCode: -1
Failing this attempt.Diagnostics: [2022-08-10 17:43:17.686]Exception from container-launch.
Container id: container_e43_1658823376901_2680_02_000001
Exit code: -1
Exception message: Invalid mount : /data03/yarn/nm
Shell error output: <unknown>
Shell output: <unknown>

[2022-08-10 17:43:17.687]Container exited with a non-zero exit code -1.
[2022-08-10 17:43:17.687]Container exited with a non-zero exit code -1.
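
One detail in the diagnostics may be worth a closer look: the rejected string is a bare path, whereas every entry passed on the command line has the source:destination:mode form. Below is a minimal Python sketch for pulling the rejected entry out of the diagnostics and checking its shape. It assumes the message format shown above, and the helper names are hypothetical, not part of YARN or Spark.

```python
import re

# Excerpt of the diagnostics quoted above.
DIAGNOSTICS = """\
Exit code: -1
Exception message: Invalid mount : /data03/yarn/nm
Shell error output: <unknown>
"""


def rejected_mounts(diagnostics):
    """Return every mount string reported after 'Invalid mount :'."""
    return re.findall(r"Invalid mount\s*:\s*(\S+)", diagnostics)


def looks_like_pair(mount):
    """True when the entry is source:destination, optionally followed by :mode."""
    parts = mount.split(":")
    return len(parts) in (2, 3) and all(parts[:2])


for mount in rejected_mounts(DIAGNOSTICS):
    shape = "source:destination entry" if looks_like_pair(mount) else "bare path, no destination"
    print(f"{mount}: {shape}")
```

If the rejected entry turns out to be a bare path, it probably did not come from the mounts given to spark-submit, which suggests looking at the NodeManager-side mount settings instead.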

 

 

P.S. I followed all the relevant instructions and documentation and applied all the necessary configuration. As I said above, I have already run this successfully on another cluster.

 

Any help would be greatly appreciated.

 

1 ACCEPTED SOLUTION

Apparently, having multiple directories for YARN and the YARN logs leads to a misconfiguration when the yarn-site.xml file is written.

The solution is to go to Cloudera Manager -> YARN -> Configuration, then search for yarn_service_config_safety_valve.

Add a new entry by pressing the plus sign on the right:

 

Name: yarn.nodemanager.runtime.linux.docker.default-rw-mounts

 

Value: /data01/yarn/nm:/data01/yarn/nm,/data02/yarn/nm:/data02/yarn/nm,/data03/yarn/nm:/data03/yarn/nm,/data04/yarn/nm:/data04/yarn/nm,/data01/yarn/container-logs:/data01/yarn/container-logs,/data02/yarn/container-logs:/data02/yarn/container-logs,/data03/yarn/container-logs:/data03/yarn/container-logs,/data04/yarn/container-logs:/data04/yarn/container-logs

 

Of course, you have to adjust the directories to match your own configuration.
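
With several data disks, the value is easy to mistype by hand. Below is a minimal Python sketch that generates it, assuming the same /dataNN/yarn/nm and /dataNN/yarn/container-logs layout as above. The function name and its parameters are hypothetical, so adjust them to your own directories.

```python
def build_rw_mounts(roots, subdirs):
    """Build a comma-separated list of source:destination pairs.

    Each directory is mounted at the same path inside the container,
    mirroring the value shown above.
    """
    pairs = []
    for subdir in subdirs:
        for root in roots:
            path = f"{root}/yarn/{subdir}"
            pairs.append(f"{path}:{path}")
    return ",".join(pairs)


# Assumed layout: four data disks, /data01 through /data04.
roots = [f"/data{i:02d}" for i in range(1, 5)]
value = build_rw_mounts(roots, ["nm", "container-logs"])
print(value)
```

The printed string can be pasted into the Value field. Every entry contains exactly one colon, so each mount has both a source and a destination.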

 
