Created 11-02-2018 11:11 PM
When running a Dockerized YARN service, YARN is not providing the correct input arguments.
The service is defined as follows. The entry point in the docker file is ["java", "-jar", "myapp.jar"]. For debugging, it outputs the incoming arguments and exits.
{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components" :
    [
      {
        "name": "myappcontainers",
        "number_of_containers": 1,
        "artifact": {
          "id": "myapp:1.0-SNAPSHOT",
          "type": "DOCKER"
        },
        "launch_command": "input1 input2",
        "resource": {
          "cpus": 1,
          "memory": "256"
        }
      }
    ]
}Here is the output from YARN:
Launching docker container... Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0006_01_000026 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/admin/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026:/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0006_01_000026 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-3.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/docker.container_e06_1541194419811_0006_01_0000264842430064377299975.env myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt
Received input: input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt
The program itself is given the redirection commands. Is there a way to disable this behavior?
The only two workarounds I have identified are:
Both of these solutions require repackaging in a way that does not conform to Docker best practice.
Created 11-03-2018 05:00 AM
@Sam Hjelmfelt Running Docker containers which have ENTRYPOINT in YARN Services requires additional configuration to be specified in the service spec. The env variable YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE needs to be set to true. Additionally the launch command parameters are separated with commas instead of space. Try running with the below spec.
{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components": [
    {
      "name": "myappcontainers",
      "number_of_containers": 1,
      "artifact": {
        "id": "myapp:1.0-SNAPSHOT",
        "type": "DOCKER"
      },
      "launch_command": "input1,input2",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      }
    }
  ]
}For further reference, refer to the documentation here
Created 11-03-2018 05:00 AM
@Sam Hjelmfelt Running Docker containers which have ENTRYPOINT in YARN Services requires additional configuration to be specified in the service spec. The env variable YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE needs to be set to true. Additionally the launch command parameters are separated with commas instead of space. Try running with the below spec.
{
  "name": "myapp",
  "version": "1.0.0",
  "description": "myapp",
  "components": [
    {
      "name": "myappcontainers",
      "number_of_containers": 1,
      "artifact": {
        "id": "myapp:1.0-SNAPSHOT",
        "type": "DOCKER"
      },
      "launch_command": "input1,input2",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      }
    }
  ]
}For further reference, refer to the documentation here
Created 11-05-2018 08:29 PM
@Tarun Parimi Thanks for the tip. I set "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE" in the yarn-env.sh, because I did not know I could set it on the service itself. This is a better solution, but does not address the main issue.
Unfortunately, comma separating the input parameters did not help either. Here is the output:
Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0015_01_000002 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/shjelmfelt/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002:/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0015_01_000002 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-0.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/docker.container_e06_1541194419811_0015_01_0000022942289027318724111.env 172.26.224.119:5000/myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt Received input: input1,input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt
Created 11-06-2018 06:27 AM
@Sam Hjelmfelt I don't think setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE in yarn-env.sh, will set the same in the ContainerLaunchContext. Have you tried setting it in the service spec itself to see if it helps ?
Created 11-08-2018 08:32 PM
Comma separating the launch_command fields and setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in the service definition rather than yarn-site.env allowed me to use my custom entrypoint as expected in HDP 3.0.1.
Strangely, exporting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in the yarn-site.env worked fine in my Apache Hadoop 3.1.1 environment, but not in my Ambari-installed HDP 3.0.1 environment. The stdout and stderr redirects are not included in the docker run command in the Apache release. Must be some other setting involved, but I am past my issue, so I will leave it here.
Thanks, @Tarun Parimi!
Created 06-13-2019 06:26 PM
How to pass args to your app inside the docker when you run it with spark-submit like ../bin/spark-submit --master yarn --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker --conf spark.executorEnv. YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test .. ?
Created 06-13-2019 06:26 PM
@Tarun Parimi Thanks for the tip. It helped me too.