Created 11-02-2018 11:11 PM
When running a Dockerized YARN service, YARN is not providing the correct input arguments.
The entry point in the Dockerfile is ["java", "-jar", "myapp.jar"]. For debugging, the application prints its incoming arguments and exits. The service is defined as follows:
{ "name": "myapp", "version": "1.0.0", "description": "myapp", "components" : [ { "name": "myappcontainers", "number_of_containers": 1, "artifact": { "id": "myapp:1.0-SNAPSHOT", "type": "DOCKER" }, "launch_command": "input1 input2", "resource": { "cpus": 1, "memory": "256" } } ] }
Here is the output from YARN:
Launching docker container... Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0006_01_000026 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/admin/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026:/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0006 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0006_01_000026 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-3.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/docker.container_e06_1541194419811_0006_01_0000264842430064377299975.env myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt
Received input: input1 input2 1>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0006/container_e06_1541194419811_0006_01_000026/stderr.txt
Because the image uses an exec-form ENTRYPOINT, there is no shell inside the container to interpret the appended redirections, so the program itself receives them as arguments. Is there a way to disable this behavior?
The only two workarounds I have identified both require repackaging the image in a way that does not conform to Docker best practices.
Created 11-03-2018 05:00 AM
@Sam Hjelmfelt Running Docker containers that have an ENTRYPOINT as YARN Services requires additional configuration in the service spec. The environment variable YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE needs to be set to true. Additionally, the launch_command parameters are separated with commas instead of spaces. Try running with the spec below.
{ "name": "myapp", "version": "1.0.0", "description": "myapp", "components": [ { "name": "myappcontainers", "number_of_containers": 1, "artifact": { "id": "myapp:1.0-SNAPSHOT", "type": "DOCKER" }, "launch_command": "input1,input2", "resource": { "cpus": 1, "memory": "256" }, "configuration": { "env": { "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true" } } } ] }
For further details, refer to the YARN Services documentation.
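Assuming the spec above is saved as a JSON file (the file name below is arbitrary), the service can be launched with the YARN services CLI, for example:

yarn app -launch myapp myapp.json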
Created 11-05-2018 08:29 PM
@Tarun Parimi Thanks for the tip. I had set YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE in yarn-env.sh because I did not know I could set it on the service itself. Setting it in the service spec is a better solution, but it does not address the main issue.
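For reference, setting it in yarn-env.sh amounted to a single export line like the following (the exact file location varies by install):

export YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true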
Unfortunately, comma separating the input parameters did not help either. Here is the output:
Docker run command: /usr/bin/docker run --name=container_e06_1541194419811_0015_01_000002 --user=1015:1015 --net=yarnnetwork -v /hadoop/yarn/local/filecache:/hadoop/yarn/local/filecache:ro -v /hadoop/yarn/local/usercache/shjelmfelt/filecache:/hadoop/yarn/local/usercache/admin/filecache:ro -v /hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002:/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002 -v /hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015:/hadoop/yarn/local/usercache/admin/appcache/application_1541194419811_0015 --cgroup-parent=/hadoop-yarn/container_e06_1541194419811_0015_01_000002 --cap-drop=ALL --cap-add=SYS_CHROOT --cap-add=MKNOD --cap-add=SETFCAP --cap-add=SETPCAP --cap-add=DAC_READ_SEARCH --cap-add=FSETID --cap-add=SYS_PTRACE --cap-add=CHOWN --cap-add=SYS_ADMIN --cap-add=AUDIT_WRITE --cap-add=SETGID --cap-add=NET_RAW --cap-add=FOWNER --cap-add=SETUID --cap-add=DAC_OVERRIDE --cap-add=KILL --cap-add=NET_BIND_SERVICE --hostname=myappcontainers-0.myapp.admin.EXAMPLE.COM --group-add 1015 --env-file /hadoop/yarn/local/nmPrivate/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/docker.container_e06_1541194419811_0015_01_0000022942289027318724111.env 172.26.224.119:5000/myapp:1.0-SNAPSHOT input1 input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt
Received input: input1,input2 1>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stdout.txt 2>/hadoop/yarn/log/application_1541194419811_0015/container_e06_1541194419811_0015_01_000002/stderr.txt
Created 11-06-2018 06:27 AM
@Sam Hjelmfelt I don't think setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE in yarn-env.sh will propagate it to the ContainerLaunchContext. Have you tried setting it in the service spec itself to see if it helps?
Created 11-08-2018 08:32 PM
Comma-separating the launch_command fields and setting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in the service definition rather than in yarn-env.sh allowed me to use my custom entrypoint as expected in HDP 3.0.1.
Strangely, exporting YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in yarn-env.sh worked fine in my Apache Hadoop 3.1.1 environment, but not in my Ambari-installed HDP 3.0.1 environment; in the Apache release, the stdout and stderr redirects are not included in the docker run command. There must be some other setting involved, but I am past my issue, so I will leave it here.
Thanks, @Tarun Parimi!
Created 06-13-2019 06:26 PM
How do you pass args to your app inside the Docker container when you run it with spark-submit, e.g. ../bin/spark-submit --master yarn --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test ..?
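A rough sketch of how this is typically done (the class, jar, image name, and arguments below are placeholders, not from this thread): with spark-submit, application arguments are appended after the application jar, and the Docker runtime settings are applied to both the application master and the executors.

../bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
  --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test \
  myapp.jar input1 input2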
Created 06-13-2019 06:26 PM
@Tarun Parimi Thanks for the tip. It helped me too.