Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

We are experiencing an intermittent issue with our Spark load jobs. We use a Python script to launch multiple spark-submit jobs, which load data from source files into HDFS. We noticed that these spark-submit jobs fail intermittently.

New Member

For example, the Python script launches a number of spark-submit jobs, and some of them fail with the exception below. If we re-run the framework to relaunch the failed jobs, some of them may fail again while others succeed. If we keep re-running the failed jobs, eventually all of them succeed, so the issue is intermittent.
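To make the re-run behaviour described above concrete, here is a minimal Python 3 sketch of a launcher that re-runs only the failed jobs. It is not the actual framework: the function name `run_with_retries`, the `max_attempts` limit, and the injectable `runner` are all assumptions made for illustration.

```python
import subprocess


def run_with_retries(commands, max_attempts=3, runner=subprocess.call):
    """Run each command and re-run the ones that exit with a non-zero code.

    `commands` is a list of argument lists. `runner` is anything that takes
    one argument list and returns an exit code (subprocess.call by default).
    Returns the commands that still failed after `max_attempts` rounds.
    """
    pending = list(commands)
    for _ in range(max_attempts):
        # Keep only the commands that failed in this round.
        pending = [cmd for cmd in pending if runner(cmd) != 0]
        if not pending:
            break
    return pending


# Hypothetical usage; the class name and paths below are placeholders:
# jobs = [["spark-submit", "--master", "yarn-cluster",
#          "--class", "com.example.LoadJob", "/path/to/app.jar", "/path/to/cfg.json"]]
# still_failing = run_with_retries(jobs)
```

Retrying like this only hides the symptom; it does not explain why a given job fails on one attempt and succeeds on the next.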

Error message in the YARN application log:

17/02/13 22:38:55 ERROR yarn.ApplicationMaster: User class threw exception: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of com.mgl.dh.silfspark.configs.FileCnfg out of VALUE_STRING token

Error in the YARN NodeManager log:

2017-02-13 22:38:58,131 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(313)) - Exception from container-launch with container ID: container_1486736461720_22516_01_000084 and exit code: 15
ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:297)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Environment:

HDFS 2.7.1

YARN 2.7.1

Spark 1.5.1

1 ACCEPTED SOLUTION

New Member

OK, found the problem. Strange, but this is what fixed the issue: we set mapreduce.input.fileinputformat.split.minsize in the mapred config to 64 MB (bigger than any JSON file we have), and that resolved it. It seems the JSON file was getting split, which caused the problem.
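For anyone wondering why a minimum split size would matter for a small file, here is a rough Python 3 sketch of the split-size rule used by Hadoop's older `mapred` FileInputFormat: max(minSize, min(goalSize, blockSize)), with a 1.1 slop factor. That the job reads its JSON through this code path, with a hint of 2 splits, is an assumption; the thread does not show how the file is read. The helper names, the 10 MB file, and the 128 MB block size are made up for illustration. The last helper builds a per-job `--conf` argument using Spark's `spark.hadoop.` prefix, which is an alternative I am assuming behaves like the cluster-wide mapred setting described above.

```python
MB = 1024 * 1024


def compute_split_size(goal_size, min_size, block_size):
    # Shape of the old-API rule: max(minSize, min(goalSize, blockSize)).
    return max(max(1, min_size), min(goal_size, block_size))


def count_splits(file_size, num_splits_hint, min_size, block_size, slop=1.1):
    """Estimate how many splits a single file of `file_size` bytes produces."""
    goal_size = file_size // max(num_splits_hint, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)
    remaining, splits = file_size, 0
    # Cut full-size splits while more than `slop` split-sizes remain.
    while remaining / split_size > slop:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the tail becomes the last split
    return splits


def min_split_conf_args(min_bytes):
    # Per-job alternative (assumption): pass the Hadoop property via spark-submit.
    key = "spark.hadoop.mapreduce.input.fileinputformat.split.minsize"
    return ["--conf", "%s=%d" % (key, min_bytes)]


print(count_splits(10 * MB, 2, 1, 128 * MB))        # prints 2: the file is cut in half
print(count_splits(10 * MB, 2, 64 * MB, 128 * MB))  # prints 1: the file stays whole
print(min_split_conf_args(64 * MB))
```

Under these assumptions, a 10 MB JSON file read with a hint of 2 splits is cut in the middle, so neither half is a complete JSON document. Raising the minimum split size above the file size keeps it in one piece.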
