We are experiencing an intermittent issue with our Spark load jobs. We use a Python script to launch multiple spark-submit jobs that load data from source files into HDFS. We noticed that these spark-submit jobs fail intermittently.
- Labels:
  - Apache Spark
  - Apache YARN
Created ‎02-15-2017 08:04 AM
For example, the Python script launches a number of spark-submit jobs, and some of them fail with the exception below. If we re-run the framework to re-launch the failed jobs, some may fail again while others succeed. If we keep re-running the failed jobs, eventually all of them succeed, so the issue is intermittent. A minimal sketch of the kind of launcher we use is shown below.
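For illustration only, a launcher along these lines might look like the following sketch; the class name, jar path, and config file names are placeholders (guessed from the package name in the log), not our actual code:

```python
# Hypothetical sketch of the launcher described above; the real framework,
# class name, jar path, and config files are not shown in this post.
import subprocess

CONFIG_FILES = ["cnfg_a.json", "cnfg_b.json"]  # placeholder source-file configs

def launch_job(cnfg_path):
    """Run one spark-submit job for a config file and return its exit code."""
    cmd = [
        "spark-submit",
        "--master", "yarn-cluster",
        "--class", "com.mgl.dh.silfspark.Loader",  # guessed from the package in the log
        "/path/to/loader.jar",                     # placeholder jar path
        cnfg_path,
    ]
    return subprocess.call(cmd)

# Collect the jobs that exited non-zero so they can be re-launched.
failed = [c for c in CONFIG_FILES if launch_job(c) != 0]
if failed:
    print("Jobs to re-launch: %s" % ", ".join(failed))
```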
Error message in the YARN application log:
17/02/13 22:38:55 ERROR yarn.ApplicationMaster: User class threw exception: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of com.mgl.dh.silfspark.configs.FileCnfg out of VALUE_STRING token
Error in the YARN NodeManager log:
2017-02-13 22:38:58,131 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(313)) - Exception from container-launch with container ID: container_1486736461720_22516_01_000084 and exit code: 15
ExitCodeException exitCode=15:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
        at org.apache.hadoop.util.Shell.run(Shell.java:456)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:297)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Environment:
HDFS 2.7.1
YARN 2.7.1
Spark 1.5.1
Created ‎02-20-2017 11:20 AM
OK, found the problem. Strange, but that's what fixed the issue: setting mapreduce.input.fileinputformat.split.minsize in the mapred config to 64 MB (bigger than any JSON file we have) resolved it. It seems the JSON files were being split across input splits, and that caused the deserialization failure.
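For reference, instead of (or in addition to) changing mapred-site.xml cluster-wide, the same value can be passed per job through Spark's spark.hadoop.* configuration passthrough, which copies the key into the job's Hadoop Configuration. A sketch, reusing the hypothetical launcher command from above (64 MB = 67108864 bytes; jar path and config file are placeholders):

```python
# Hypothetical per-job alternative to editing mapred-site.xml: any conf key
# prefixed with "spark.hadoop." is forwarded to the job's Hadoop Configuration.
cmd = [
    "spark-submit",
    "--master", "yarn-cluster",
    "--conf", "spark.hadoop.mapreduce.input.fileinputformat.split.minsize=67108864",  # 64 MB
    "/path/to/loader.jar",  # placeholder jar path
    "cnfg_a.json",          # placeholder config file
]
```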
