
Killed Spark Streaming job in YARN cluster mode is listed as incomplete in the History Server

Expert Contributor

Hi Guys,

I am trying to run a Spark Streaming job that consumes messages from a Kafka topic and does further processing, on HDP 2.4.2/Spark 1.6.1.

The application works as required, but when I kill the streaming job, it is still listed as "incomplete" in the Spark History Server UI. In the YARN UI, it is listed as finished. Also, when I check with $yarn application -status <appID>, it is listed as killed.

I followed one of the threads on the HCC forum and saw that there could be an issue with the YARN Timeline Server. I restarted all the YARN and MR components, restarted the Spark History Server, and resubmitted the application, but I still face the same problem.

Here are the steps.

1) kinit <userName>

2) Submit the Spark Streaming application

3) Check that the required calculations are happening using Spark History Server -> Executors -> driver logs (works fine)

4) Get the application ID and issue $yarn application -kill <appID> (the same user who submitted the app is killing it)

5) Wait for some time (say 2 mins; waiting for 120+ mins didn't help)

6) Check on the YARN RM (shows Finished/Killed)

7) Click on the Spark History Server.

8) Ideally, this killed application should be listed under the completed applications. But when we click on the "incomplete" list at the bottom, we see this application listed in the "incomplete application" table.

9) Going further, in the HDFS /spark-history directory, it is still shown as <AppID>.inprogress.

(This issue does not occur when we start the same streaming application in yarn-client mode, or with other non-streaming applications.)
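For anyone reproducing step 9, one way to spot the affected event logs is to filter the directory listing for the .inprogress suffix. This is only a sketch: list_inprogress is a hypothetical helper name, and the usage comment assumes the path is the last column of the hdfs dfs -ls output.

```shell
#!/bin/sh
# Hypothetical helper: print only the entries whose name still ends in
# ".inprogress". It reads one path per line on stdin, so it works with
# any listing, not just HDFS.
list_inprogress() {
    grep '\.inprogress$'
}

# Assumed usage on the cluster (/spark-history is the directory from step 9):
#   hdfs dfs -ls /spark-history | awk '{print $NF}' | list_inprogress
```

If the killed application's ID shows up in that output, the event log was never finalized, which matches what the History Server UI reports.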

I would be grateful if anyone could help me understand what is missing here.

Thank you,



Expert Contributor

Hi @Smart Solutions

Is the behavior the same even if you set the parameter below in the application context?


Expert Contributor

@Jitendra Yadav, it's a NetworkWordCount. However, I did pass the config argument

--conf "spark.streaming.stopGracefullyOnShutdown=true", both with and without quotes. It didn't help.

What do you think we are missing here?

Hi @Smart Solutions. Along with that property, let's try sending a SIGTERM signal to the Spark driver; it should take care of the rest and ensure that the application is stopped gracefully. You should see graceful-shutdown messages in the driver logs.

On the node where the driver is running:

ps -ef | grep spark | grep <Spark Driver Name>
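Once the PID is known, the signal step could be sketched roughly as below. term_and_wait is a hypothetical helper name (not part of Spark or YARN), and the 30-second default timeout is an arbitrary assumption; it sends SIGTERM rather than SIGKILL and then polls until the process is gone.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM to a PID and wait for it to exit.
# Usage: term_and_wait <pid> [timeout_seconds]
# Returns 0 if the process exited, 1 if the signal could not be sent,
# 2 if the process is still alive after the timeout.
term_and_wait() {
    pid=$1
    timeout=${2:-30}   # assumed default, adjust as needed

    # SIGTERM (not SIGKILL) so the process gets a chance to shut down cleanly
    kill -TERM "$pid" 2>/dev/null || return 1

    i=0
    # kill -0 only checks whether the process still exists
    while kill -0 "$pid" 2>/dev/null; do
        [ "$i" -ge "$timeout" ] && return 2
        sleep 1
        i=$((i + 1))
    done
    return 0
}
```

For example, term_and_wait <PID> 60 with the PID from the ps output above; a return code of 2 would mean the driver was still running 60 seconds after the SIGTERM.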


Expert Contributor

Hi @Jitendra Yadav,

I submitted the job and killed it using

then ran ps -ef | grep spark to get the PID and issued kill -SIGTERM PID.

That didn't help.