
Yarn LinuxContainerExecutor with Docker fails to read environment variables


Hello, I set up the LinuxContainerExecutor in Hadoop 3.1.4 to be able to launch jobs using Docker as the container runtime.

 

However, I'm facing some problems.

 

First, if I set the parameters as in the documentation:

-Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_TYPE=docker
-Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS
-Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID
-Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_TYPE=docker
-Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS
-Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID
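For context, a full submission with the per-variable form looks roughly like this (the image name, mount paths, jar path, and the pi example are placeholders, not my actual values):

```shell
# Sketch of a full job submission using the documented per-variable env form.
# IMAGE_ID and MOUNTS below are made-up placeholder values.
IMAGE_ID="library/centos:latest"
MOUNTS="/etc/passwd:/etc/passwd:ro,/etc/group:/etc/group:ro"

hadoop jar "$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.4.jar" \
  pi \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE="$IMAGE_ID" \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="$MOUNTS" \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE="$IMAGE_ID" \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="$MOUNTS" \
  1 40000
```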

 

These are all ignored: the Hadoop job starts without using Docker, and the environment variables do not appear in the logs.

 

If I instead set the variables as a comma-separated list in the env value:

-Dmapreduce.map.env=YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID

-Dmapreduce.reduce.env=YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID

 

Then the parameters are accepted and appear in the logs. However, for YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS only the first directory is mounted and the rest are ignored. This makes most of my jobs fail, because I need to mount several directories from different locations, and without them the Docker container doesn't run with the proper configuration.
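I suspect the truncation happens because the env value is itself split on commas, while the MOUNTS value also contains commas (Docker mounts are `src:dst:mode` entries separated by commas). Everything after the first mount then looks like a separate variable and is dropped. A quick shell sketch of that splitting behavior (the mount paths are made-up examples):

```shell
# Made-up example: two mounts in the usual src:dst:mode,src:dst:mode format.
MOUNTS="/etc/passwd:/etc/passwd:ro,/data/input:/data/input:rw"
ENV_VALUE="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS"

# Split the env value on commas, the way a comma-delimited list is parsed.
IFS=',' read -ra PARTS <<< "$ENV_VALUE"

# DOCKER_MOUNTS keeps only the first mount; the second mount becomes a
# stray third entry with no VAR= prefix and is effectively lost.
printf '%s\n' "${PARTS[@]}"
```

If this is indeed the cause, the per-variable form (`-Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS`) should keep the commas intact, but as described above that form is being ignored on my cluster.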

 

Has anybody faced similar issues, or is there a known solution?

 

Thank you in advance