Spark application in incomplete section of spark-history even when completed.
Labels: Apache Spark
Created 10-30-2022 10:17 AM
Hello!
I'm a newbie in Spark and the whole cloud-data workflow, but I have a problem at my new job where I need to work with PySpark and Hadoop.
In my spark-history some applications have been "incomplete" for a week now. I've tried killing them, closing the sparkContext(), and killing the main .py process, but nothing has helped.
For example,
yarn application -status <id>
shows:
...
State: FINISHED
Final-State: SUCCEEDED
...
Log Aggregation Status: TIME_OUT
...
But in Spark-History I still see it in the incomplete section of my applications. If I open that application there, I can see 1 active job with 1 alive executor, but they have been doing nothing for the whole week. It looks like a logging bug, but as far as I know this problem affects only me; my coworkers don't have it.
This thread didn't help me, because I don't have access to start-history-server.sh.
I suppose this is because of
Log Aggregation Status: TIME_OUT
because my "completed" applications have
Log Aggregation Status: SUCCEEDED
What can I do to fix this? Right now I have 80+ incomplete applications in spark-history...
Sorry for my bad English 😞
Created on 10-30-2022 03:29 PM - edited 10-30-2022 03:29 PM
UPD: I've found a clear description of my problem with the same setup (YARN, Spark, etc.), but there is no solution there: https://stackoverflow.com/questions/52126052/what-is-active-jobs-in-spark-history-server-spark-ui-jo...
Created 10-30-2022 11:58 PM
Hello @r4ndompuff
Are you able to fetch logs for this application from the command line?
yarn logs -applicationId <app_id> -appOwner <user>
Possibly a huge number of stored applications is causing this issue. In general, a large /tmp/logs (yarn.nodemanager.remote-app-log-dir) HDFS directory causes YARN log aggregation to time out.
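A quick way to check is to look at the size and item count of that directory (this assumes the default /tmp/logs location; adjust the path if yarn.nodemanager.remote-app-log-dir points elsewhere on your cluster):
hdfs dfs -du -s -h /tmp/logs    # total size of the aggregated-log directory
hdfs dfs -count /tmp/logs       # directory, file, and byte counts under it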
Regarding killing the application, this is likely a code-level issue: you need to check that sc.stop() is being called at the correct place.
Thanks!
Created 10-31-2022 01:14 AM
Hello, @AsimShaikh!
Thank you very much for your answer!
No, that command doesn't work for me; I only get an error saying that my account doesn't have access to the server with the logs...
But I've found the root of my problem:
From Spark Monitoring and Instrumentation:
... 3. Applications which exited without registering themselves as completed will be listed as incomplete, even though they are no longer running. This can happen if an application crashes...
I really do restart the kernel in JH quite often, because our system is unstable right now (we are moving from one office to another).
Can I somehow mark the incomplete applications as complete myself, or do I need to write to somebody who has access to the Spark logs folder?
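From what I've read, Spark writes each application's event log with an .inprogress suffix and only renames it when the application registers itself as completed, so I guess whoever has access to the event-log directory could just delete the stale files. A rough sketch of what I mean (the /spark-logs path is only a placeholder for whatever spark.eventLog.dir is set to on our cluster):
hdfs dfs -ls /spark-logs | grep inprogress      # find event logs of "stuck" applications
hdfs dfs -rm /spark-logs/<app_id>.inprogress    # drop one stale log; the app should leave the incomplete list after the next history-server scan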
Created 10-31-2022 03:20 AM
You may need to explicitly stop the SparkContext sc by calling sc.stop().
It's a good idea to call sc.stop(), which lets the Spark master know that your application is finished consuming resources. If you don't call sc.stop(), the event log information that the history server uses will be incomplete, and your application will not show up as completed in the history server's UI.
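A minimal sketch of that pattern in PySpark (the app name and workload here are just illustrative):
from pyspark import SparkContext

sc = SparkContext(appName="example_job")
try:
    # ... the actual work goes here ...
    print(sc.parallelize(range(100)).sum())
finally:
    # Always executed, even if the job above raises, so the
    # ApplicationEnd event gets written and the history server
    # can move the app out of the "incomplete" section.
    sc.stop()
Putting sc.stop() in a finally block guards against exceptions in your job; it won't help, though, if the whole kernel or JVM dies before the block runs.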
Created 10-31-2022 05:03 AM
My office PC was rebooted many times, so I no longer have an open session with the initial SparkContext.
I've tried to create a new one and call sc.stop(), but that didn't help 😞
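I guess that makes sense: each SparkContext is a separate YARN application, so stopping a fresh one can't close the old application's event log. A quick way to see this (assuming a working PySpark session):
from pyspark import SparkContext

sc = SparkContext(appName="probe")
print(sc.applicationId)  # prints a brand-new application id, not the stuck one
sc.stop()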
Created 11-01-2022 05:03 AM
Do you have sample code that you can share?
