Spark Submit Issue

When I submit a Spark job, YARN reports the job as failed, but the detailed logs show that it actually succeeds. Why is the ApplicationMaster marked as failed after making its attempts? I have reduced the number of attempts to 1.

 

User: dev
Name: com.example
Application Type: SPARK
Application Tags:
State: FAILED
FinalStatus: FAILED
Started: Wed Sep 06 09:12:15 -0500 2017
Elapsed: 21sec
Tracking URL: History
Diagnostics:
Application application_1504705933896_0004 failed 1 times due to AM Container for appattempt_1504705933896_0004_000001 exited with exitCode: 0

 

Log Type: stderr
Log Upload Time: Wed Sep 06 09:12:38 -0500 2017
Log Length: 119115
Showing 4096 bytes of 119115 total.
heduler: ResultStage 3 (foreachPartition at HBaseContext.scala:216) finished in 0.142 s
17/09/06 09:12:36 INFO scheduler.DAGScheduler: Job 3 finished: foreachPartition at HBaseContext.scala:216, took 0.157439 s
17/09/06 09:12:36 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15e029e15d2ff0e
17/09/06 09:12:36 INFO zookeeper.ZooKeeper: Session: 0x15e029e15d2ff0e closed
17/09/06 09:12:36 INFO zookeeper.ClientCnxn: EventThread shut down

 

|COL1|COL2|COL3| 
+----------+-------------+-------------+--------------+--------------------------+-------------------------+------------+
| Data is printed here


17/09/06 09:12:18 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
17/09/06 09:12:18 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1504705933896_0004_000001
17/09/06 09:12:19 INFO spark.SecurityManager: Changing view acls to: yarn,dev
17/09/06 09:12:19 INFO spark.SecurityManager: Changing modify acls to: yarn,dev
17/09/06 09:12:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, dev); users with modify permissions: Set(yarn, dev)
17/09/06 09:12:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
17/09/06 09:12:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
17/09/06 09:12:36 INFO ingestion: mysamplespark job executed successfully
17/09/06 09:12:36 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
17/09/06 09:12:36 INFO spark.SparkContext: Invoking stop() from shutdown hook
17/09/06 09:12:36 INFO ui.SparkUI: Stopped Spark web UI at http://10.6.0.10:43467
17/09/06 09:12:37 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/06 09:12:37 INFO storage.MemoryStore: MemoryStore cleared
17/09/06 09:12:37 INFO storage.BlockManager: BlockManager stopped
17/09/06 09:12:37 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
17/09/06 09:12:37 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/06 09:12:37 INFO spark.SparkContext: Successfully stopped SparkContext
17/09/06 09:12:37 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
17/09/06 09:12:37 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/09/06 09:12:37 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1504705933896_0004
17/09/06 09:12:37 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/09/06 09:12:37 INFO util.ShutdownHookManager: Shutdown hook called
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data1/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-de7ee7f3-5e8f-49a2-b99f-37e1c0a0122c
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data0/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-b0ebe098-f97c-49c8-bc0e-317af738619c
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data8/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/container_1504705933896_0004_01_000001/tmp/spark-afaac501-7ff7-42e9-a175-6f3ab2da8465
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data6/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-0508bde0-58ff-42ce-8cca-b15e07551e05
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data4/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-7fd2da4a-1c79-426b-bbbe-16c9c7b1aeaf
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data9/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-c1b3d37f-e394-4049-9874-62a2504c4d6b
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data5/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-ce45d5fd-970b-41cb-9678-855b86254285
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data2/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-53786e4f-f263-4876-926a-ed7b98822930
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data3/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-c7ab8793-95d6-4c21-9eac-a7407375005f
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data7/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-4669faae-b4d1-4ae8-8751-194b476eb6dd
17/09/06 09:12:37 INFO util.ShutdownHookManager: Deleting directory /data8/yarn/nm/usercache/dev/appcache/application_1504705933896_0004/spark-c267166f-0c4b-483d-a7e1-6621afb3eb90
17/09/06 09:12:37 INFO Remoting: Remoting shut down
17/09/06 09:12:37 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

 

 

 


Hi @Shafiullah,

 

So your job completes, but you still see it reported as failed, right?

 

Do you see any suspicious messages in the full container logs?
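
Since the web UI shows only the last 4096 bytes of stderr, one way to check is to pull the full aggregated logs (for example with `yarn logs -applicationId application_1504705933896_0004 > app.log`, assuming log aggregation is enabled on the cluster) and scan them. Below is a minimal Python sketch of such a scan. The function names `suspicious_lines` and `final_app_status` and the file name `app.log` are illustrative, and the line pattern assumes the timestamp and level layout seen in the log excerpt above.

```python
import re

# Matches lines shaped like the excerpt above, e.g.
# "17/09/06 09:12:36 INFO yarn.ApplicationMaster: ..."
# (assumed layout: date, time, level, then the rest of the message).
LINE_RE = re.compile(
    r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (?P<level>[A-Z]+) (?P<rest>.*)$"
)


def suspicious_lines(log_text):
    """Return lines logged at WARN/ERROR/FATAL or mentioning an exception."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        level = match.group("level") if match else None
        if level in ("WARN", "ERROR", "FATAL") or "Exception" in line:
            hits.append(line)
    return hits


def final_app_status(log_text):
    """Return (status, exit_code) from the first 'Final app status' line, or None."""
    match = re.search(r"Final app status: (\w+), exitCode: (-?\d+)", log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Example usage (hypothetical file name):
#   with open("app.log") as f:
#       text = f.read()
#   print(final_app_status(text))
#   for line in suspicious_lines(text):
#       print(line)
```

If this turns up nothing beyond INFO lines, the NodeManager and ResourceManager logs for the same time window would be the next place to look, because the diagnostics above report the AM container itself as exiting with exitCode 0.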

 

 

Thanks,
Sathish (Satz)