Created on 04-24-2017 10:53 AM
Problem
I have a problem accessing Spark history data for killed streaming jobs. The problem looks like the one mentioned here:

When I click on History under Tracking UI for the FINISHED Spark job, I am redirected to the Spark History URL: http://dan3.dklocal:18080/history/application_1491987409680_0013/jobs/. However, when I do the same for the KILLED one, I am redirected to the application log on the RM UI: http://dan3.dklocal:8088/cluster/app/application_1491987409680_0012. What I need is to be redirected to the Spark History URL for KILLED jobs as well.
Solution
Killing the Spark job with 'yarn application -kill' is not the right way to do it.

When running 'yarn application -kill <applicationId>', the job is killed (the RM UI shows State: KILLED, FinalStatus: KILLED) and the '.inprogress' suffix is removed from the event log, but the History link points to the RM log rather than the Spark History Server. When instead running 'kill -SIGTERM <Spark driver PID>', the job is also stopped, but the RM UI shows State: FINISHED, FinalStatus: SUCCEEDED, the '.inprogress' suffix is removed, and the History link now goes to the Spark History UI. Sending SIGTERM lets the driver JVM run its shutdown hooks, so Spark can stop the context and close the event log cleanly.
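The two approaches above can be sketched as a small shell snippet. This is a minimal sketch, not a definitive procedure: it assumes client mode (the driver runs on the host where spark-submit was launched), and the application id and the `pgrep` pattern are placeholders you would adapt to your own job.

```shell
#!/bin/sh
# Hypothetical application id, for illustration only.
APP_ID=application_1491987409680_0012

# The abrupt way: YARN tears down the containers, the driver never runs
# its shutdown hooks, and the RM records FinalStatus: KILLED.
# Commented out here so the graceful path below is the one exercised.
# yarn application -kill "$APP_ID"

# The graceful way (client mode): find the driver JVM on this host and
# send it SIGTERM so Spark's shutdown hook stops the streaming context
# and finalizes the event log for the History Server.
DRIVER_PID=$(pgrep -f "SparkSubmit" | head -n 1)
if [ -n "$DRIVER_PID" ]; then
    kill -TERM "$DRIVER_PID"
else
    echo "no SparkSubmit driver process found on this host"
fi
```

The guard around `kill` simply avoids sending a signal when no driver process is found on the current host, which is also why this pattern does not translate directly to cluster mode.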
Created on 07-16-2018 06:22 AM
@Daniel Kozlowski :- The kill solution will work in "client" mode. In cluster mode, the driver could be on any node of the cluster. Assuming we don't have SSH access to that node, how can one kill the driver?