
Problem

I have a problem accessing Spark History data for killed streaming jobs. The problem looks like the one mentioned here:

When I click History under Tracking UI for a FINISHED Spark job, I am redirected to http://dan3.dklocal:18080/history/application_1491987409680_0013/jobs/ (the Spark History URL). However, when I do the same for a KILLED job, I am redirected to http://dan3.dklocal:8088/cluster/app/application_1491987409680_0012 (the application log in the RM UI). The goal is for KILLED jobs to redirect to the Spark History URL as well.

Solution

Killing the Spark job with 'yarn application -kill' is not the right way to do it.

With 'yarn application -kill', the job is killed (the RM UI shows State: KILLED, FinalStatus: KILLED) and the '.inprogress' suffix is removed, but the History link goes to the RM log rather than the Spark History UI. With 'kill -SIGTERM <Spark Driver>', where <Spark Driver> is the driver's process ID, the job is also killed, but the RM UI shows State: FINISHED, FinalStatus: SUCCEEDED. The '.inprogress' suffix is removed, and the History link now goes to the Spark History UI.
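The SIGTERM approach above can be sketched in Python as follows. This is only an illustration: the helper names `terminate_gracefully` and `demo_on_standin` are hypothetical, and a short-lived child process stands in for the Spark driver, since how the real driver PID is located on the host is outside the scope of this sketch.

```python
import os
import signal
import subprocess
import sys


def terminate_gracefully(pid: int) -> None:
    """Send SIGTERM (not SIGKILL) so the target process can shut down cleanly."""
    os.kill(pid, signal.SIGTERM)


def demo_on_standin() -> int:
    """Start a stand-in process (NOT a real Spark driver), send it SIGTERM,
    and return its exit status as reported by subprocess."""
    # Assumption: a sleeping Python child is used purely as a placeholder process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    terminate_gracefully(child.pid)
    child.wait(timeout=10)
    # On POSIX, a negative return code -N means the child was ended by signal N.
    return child.returncode


if __name__ == "__main__":
    print("stand-in exit status:", demo_on_standin())
```

On a real cluster, `terminate_gracefully` would be called with the actual driver PID on the host where the driver runs. The stand-in simply dies on SIGTERM, whereas the behaviour described above for the Spark driver (State: FINISHED in the RM UI) comes from the driver's own shutdown handling.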

Comments

@Daniel Kozlowski: The kill solution will work in "client" mode. In cluster mode, the driver could be on any node of the cluster. Assuming we don't have SSH access to that node, how can one kill the driver?