We are experiencing an intermittent issue with our Spark load jobs. We use a Python framework to launch multiple spark-submit jobs, which load data from source files into HDFS. We noticed that these spark-submit jobs fail intermittently.

amitrai2012 — Wed, 15 Feb 2017 16:04:16 GMT

For example, the Python framework launches a number of spark-submit jobs, and some of them fail with the exception shown in the logs. If we re-run the framework to re-launch the failed jobs, some of them may fail again while others succeed. If we keep re-running the failed jobs, eventually all of them succeed, so the issue is intermittent.
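The re-run pattern described above can be sketched roughly as follows. This is only an illustration, not the actual framework: the function names, the `runner` hook, the `max_rounds` limit, and the sequential launching are all assumptions made for the sketch.

```python
import subprocess
from typing import Callable, Dict, List


def run_spark_submit(cmd: List[str]) -> int:
    """Run one spark-submit command line and return its exit code."""
    return subprocess.call(cmd)


def launch_with_retries(
    jobs: Dict[str, List[str]],
    runner: Callable[[List[str]], int] = run_spark_submit,
    max_rounds: int = 3,
) -> List[str]:
    """Launch every job, then re-launch only the failed ones.

    `jobs` maps a (hypothetical) job name to its command line.
    Returns the sorted names of jobs still failing after `max_rounds` rounds.
    """
    pending = dict(jobs)
    for _ in range(max_rounds):
        if not pending:
            break
        failed = {}
        for name, cmd in pending.items():
            # A non-zero exit code is treated as a failed job.
            if runner(cmd) != 0:
                failed[name] = cmd
        pending = failed
    return sorted(pending)
```

Passing a different `runner` makes it possible to exercise the retry logic without a cluster. A loop like this hides an intermittent failure rather than explaining it, so the names of the jobs that needed a second round are worth logging.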

Error message in the YARN application log:

17/02/13 22:38:55 ERROR yarn.ApplicationMaster: User class threw exception: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of com.mgl.dh.silfspark.configs.FileCnfg out of VALUE_STRING token

Error in the YARN NodeManager log:

2017-02-13 22:38:58,131 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(313)) - Exception from container-launch with container ID: container_1486736461720_22516_01_000084 and exit code: 15
ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:297)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Environment:

HDFS 2.7.1

YARN 2.7.1

Spark 1.5.1

Re: We are experiencing an intermittent issue with our Spark load jobs. We use a Python framework to launch multiple spark-submit jobs, which load data from source files into HDFS. We noticed that these spark-submit jobs fail intermittently.

amitrai2012 — Mon, 20 Feb 2017 19:20:34 GMT

OK, found the problem. It is strange, but this is what fixed the issue: we set mapreduce.input.fileinputformat.split.minsize in the mapred config to 64 MB (bigger than any JSON file we have), and this resolved the issue. It seems the JSON file was getting split, which caused the problem.
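For anyone who cannot change the cluster-wide mapred config, a per-job variant might look like the sketch below. It relies on Spark copying `spark.hadoop.*` properties into the job's Hadoop configuration, and on the property value being given in bytes. The jar name, main class, and `--master yarn-cluster` setting are hypothetical placeholders, and whether a per-job setting has the same effect as the mapred config change depends on how the job reads its JSON files.

```python
from typing import List

# Hadoop property named in the fix above; its value is a size in bytes.
SPLIT_MINSIZE_KEY = "mapreduce.input.fileinputformat.split.minsize"


def build_spark_submit_cmd(
    app_jar: str, main_class: str, min_split_mb: int = 64
) -> List[str]:
    """Build a spark-submit command line that raises the minimum split size.

    `app_jar` and `main_class` are placeholders for the real job artifacts.
    """
    min_split_bytes = min_split_mb * 1024 * 1024
    return [
        "spark-submit",
        "--master", "yarn-cluster",  # assumption: Spark 1.5-style YARN cluster mode
        "--class", main_class,
        # The spark.hadoop. prefix forwards the property to the Hadoop configuration.
        "--conf", "spark.hadoop.%s=%d" % (SPLIT_MINSIZE_KEY, min_split_bytes),
        app_jar,
    ]
```

The resulting list can be handed directly to `subprocess.call`, for example `build_spark_submit_cmd("load-job.jar", "com.example.LoadJob")`, where both arguments are made-up names.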
