Spark job execution on YARN failed
Labels: Apache Spark, Apache YARN
Created ‎04-28-2020 03:06 PM
Hi community,
I am running a Spark job on YARN, but after 187 hours of execution it fails with the following error:
Application application_1584698544596_3421 failed 2 times due to AM Container for appattempt_1584698544596_3421_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://server1.corp:8088/proxy/application_1584698544596_3421/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e103_1584698544596_3421_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
at org.apache.hadoop.util.Shell.run(Shell.java:507)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.__launchContainer__(LinuxContainerExecutor.java:399)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shell output: main : command provided 1
main : run as user is development
main : requested yarn user is development
Writing to tmp file /hdfs5/yarn/nm/nmPrivate/application_1584698544596_3421/container_e103_1584698544596_3421_02_000001/container_e103_1584698544596_3421_02_000001.pid.tmp
Writing to cgroup task files...
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
I cannot view the logs either, because the request returns this error:
Logs not available for container_e103_1584698544596_3421_01_000001. Aggregation may not be complete, Check back later or try the nodemanager at server40.corp:8041
Or see application log at http://server40.corp:8041/node/application/application_1584698544596_3421
I would appreciate your help, because I don't know what could be causing this.
Cloudera Employee
Created ‎07-29-2020 03:04 AM
Hi Wilson,
Exit code 1 means that the container localizer failed to initialize. Could you please upload the application logs if they are available now? Otherwise, please share the ResourceManager and NodeManager logs so that we can understand what happened during container creation.
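If log aggregation has finished, the application logs can usually be fetched with the `yarn logs` CLI. Below is a minimal Python sketch that pulls the application ID out of the diagnostics text and assembles that command. The helper name `build_yarn_logs_command` is hypothetical, and it assumes the `yarn` client is on the PATH of a cluster node and that the job ran as the `development` user shown in the shell output.

```python
import re

# YARN application IDs look like application_<clusterTimestamp>_<sequence>.
APP_ID_RE = re.compile(r"application_\d+_\d+")


def build_yarn_logs_command(diagnostics, app_owner=None):
    """Build a `yarn logs` command line from YARN diagnostics text.

    Hypothetical helper for illustration; it only assembles the argument
    list and does not contact the cluster.
    """
    match = APP_ID_RE.search(diagnostics)
    if match is None:
        raise ValueError("no YARN application ID found in diagnostics text")
    cmd = ["yarn", "logs", "-applicationId", match.group(0)]
    if app_owner:
        # -appOwner is needed when the logs belong to a different user
        # than the one running the command.
        cmd += ["-appOwner", app_owner]
    return cmd


diagnostics = (
    "Application application_1584698544596_3421 failed 2 times due to "
    "AM Container for appattempt_1584698544596_3421_000002 exited with "
    "exitCode: 1"
)
command = build_yarn_logs_command(diagnostics, app_owner="development")
print(" ".join(command))
```

The resulting command can be run on a gateway node and its output redirected to a file for upload. If it still reports that aggregation is not complete, the NodeManager's local log directory on server40.corp is the next place to look.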
Thanks
AKR
