Spark shows all jobs completed, but IPython still waits
Labels: Apache Spark, Apache YARN
Created on 03-07-2017 12:59 PM - edited 09-16-2022 04:12 AM
Hello,
I am running IPython -> Livy to send jobs to my CDH 5.9.0 cluster running Spark. My job runs through a few operations reading files from HDFS into DataFrames and then doing some operations on those DataFrames. The code then reaches a cell with a join and stops progressing. If I leave it alone for long enough, the session is eventually killed.
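For reference, here is a minimal sketch of the shape of the code; the HDFS paths, DataFrame names, and join key are placeholders, not the real job (in a Livy PySpark session `sc` and `sqlContext` are already provided; they are created explicitly below only so the sketch is self-contained):

```python
# Minimal sketch only -- paths, names, and the join key are placeholders.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="livy-join-hang-sketch")
sqlContext = SQLContext(sc)

# Read a few files from HDFS into DataFrames.
orders = sqlContext.read.parquet("hdfs:///data/orders")        # placeholder path
customers = sqlContext.read.parquet("hdfs:///data/customers")  # placeholder path

# Earlier cells with simple operations complete fine.
active = orders.filter(orders["amount"] > 0)

# The cell with the join is where progress stops under IPython -> Livy,
# even though the same code finishes when run via the pyspark shell.
joined = active.join(customers, on="customer_id", how="inner")
print(joined.count())
```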
I am not sure how to debug this. YARN shows the job as still running. Spark shows all jobs completed and no active or pending jobs. All the Spark jobs say they succeeded, though some were skipped. If I go to the details for the last stage, all statuses say "Success." The logs for the executors all say Finished task ###. #### bytes sent to driver. The thread dump for the driver shows a lot of waiting threads. If I run the job via pyspark, not through IPython/Livy, it works fine. But there are no errors in the Livy log either.
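One thing I can still try is asking Livy's REST API directly what state it thinks the session and its statements are in, something like the sketch below (host, port, and session id are placeholders for my setup, and it assumes the standard /state, /statements, and /log endpoints):

```python
# Rough debugging helper -- Livy host/port and session id are placeholders.
import requests

LIVY = "http://livy-host:8998"  # placeholder Livy server URL
SESSION_ID = 0                  # placeholder session id

# Session state: "idle" between statements, "busy" while one runs,
# "dead" or "error" if Livy killed it.
print(requests.get("{0}/sessions/{1}/state".format(LIVY, SESSION_ID)).json())

# Per-statement states: a statement stuck in "running" while the Spark UI
# shows every job finished would suggest the hang is between Livy and the
# driver rather than inside Spark itself.
resp = requests.get("{0}/sessions/{1}/statements".format(LIVY, SESSION_ID)).json()
for stmt in resp.get("statements", []):
    print(stmt["id"], stmt["state"])

# Recent log lines as Livy sees them.
log = requests.get("{0}/sessions/{1}/log".format(LIVY, SESSION_ID),
                   params={"from": 0, "size": 100}).json()
for line in log.get("log", []):
    print(line)
```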
I'm not sure how to figure this out. Any thoughts?
Thanks!
Created 03-08-2017 02:55 PM
A bit more info... (this is also cross-posted to the Project Jupyter list)
