Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Docker on YARN Incorrect Program Arguments

avatar
Rising Star

When running a Dockerized YARN service, YARN is not providing the correct input arguments.

The service is defined as follows. The entry point in the docker file is ["java", "-jar", "myapp.jar"]. For debugging, it outputs the incoming arguments and exits.

{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components" :
    [
      {
        "name": "myappcontainers",
        "number_of_containers": 1,
        "artifact": {
          "id": "myapp:1.0-SNAPSHOT",
          "type": "DOCKER"
        },
        "launch_command": "input1 input2",
        "resource": {
          "cpus": 1,
          "memory": "256"
        }
      }
    ]
}

Here is the output from YARN:

Launching docker container...

Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0006_01_000026 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/admin/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026:/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0006_01_000026 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-3.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/docker.container_e06_1541194419811_0006_01_0000264842430064377299975.env myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt
Received input: input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt

The program itself is given the redirection commands. Is there a way to disable this behavior?

The only two workarounds I have identified are:

  1. Change the ENTRYPOINT in the dockerfile to be ["sh", "-c"] and the launch_command to "java -jar myjar.jar"
  2. Change the program to use or ignore the "1>" and "2>" inputs

Both of these solutions require repackaging in a way that does not conform to Docker best practice.

1 ACCEPTED SOLUTION

avatar
Expert Contributor

@Sam Hjelmfelt Running Docker containers which have ENTRYPOINT in YARN Services requires additional configuration to be specified in the service spec. The env variable YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE needs to be set to true. Additionally the launch command parameters are separated with commas instead of space. Try running with the below spec.

{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components": [
    {
      "name": "myappcontainers",
      "number_of_containers": 1,
      "artifact": {
        "id": "myapp:1.0-SNAPSHOT",
        "type": "DOCKER"
      },
      "launch_command": "input1,input2",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      }
    }
  ]
}

For further reference, refer to the documentation here

View solution in original post

6 REPLIES 6

avatar
Expert Contributor

@Sam Hjelmfelt Running Docker containers which have ENTRYPOINT in YARN Services requires additional configuration to be specified in the service spec. The env variable YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE needs to be set to true. Additionally the launch command parameters are separated with commas instead of space. Try running with the below spec.

{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components": [
    {
      "name": "myappcontainers",
      "number_of_containers": 1,
      "artifact": {
        "id": "myapp:1.0-SNAPSHOT",
        "type": "DOCKER"
      },
      "launch_command": "input1,input2",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      }
    }
  ]
}

For further reference, refer to the documentation here

avatar
Rising Star

@Tarun Parimi Thanks for the tip. I set "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE" in the yarn-env.sh, because I did not know I could set it on the service itself. This is a better solution, but does not address the main issue.

Unfortunately, comma separating the input parameters did not help either. Here is the output:

Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0015_01_000002 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/shjelmfelt/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002:/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0015_01_000002 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-0.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/docker.container_e06_1541194419811_0015_01_0000022942289027318724111.env 172.26.224.119:5000/myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt 
Received input: input1,input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt

avatar
Expert Contributor

@Sam Hjelmfelt I don't think setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE in yarn-env.sh, will set the same in the ContainerLaunchContext. Have you tried setting it in the service spec itself to see if it helps ?

avatar
Rising Star

Comma separating the launch_command fields and setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in the service definition rather than yarn-site.env allowed me to use my custom entrypoint as expected in HDP 3.0.1.

Strangely, exporting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in the yarn-site.env worked fine in my Apache Hadoop 3.1.1 environment, but not in my Ambari-installed HDP 3.0.1 environment. The stdout and stderr redirects are not included in the docker run command in the Apache release. Must be some other setting involved, but I am past my issue, so I will leave it here.

Thanks, @Tarun Parimi!

avatar
New Contributor

How to pass args to your app inside the docker when you run it with spark-submit like ../bin/spark-submit --master yarn --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker --conf spark.executorEnv. YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test .. ?

avatar
New Contributor

@Tarun Parimi Thanks for the tip. It helped me too.