Pyspark stuck at Stage 0
Labels: Apache Spark
Created on ‎12-09-2015 10:47 AM - edited ‎09-16-2022 02:52 AM
Hi all,
I installed Cloudera 5.5 with Spark on YARN. I uploaded a small file, as shown below:
Then I ran pyspark as the hdfs user and tried a simple exercise, but it got stuck at Stage 0, as shown in the screenshot:
It never returned anything. Can someone point me to a way to troubleshoot and fix this?
Created ‎12-12-2015 10:45 AM
It's possible that YARN is unable to allocate a container for the executors because that configuration value is too low, in which case things could hang this way. You could raise it by another 1 GB, then restart the cluster and re-run the shell to see if that resolves the issue.
You can also check the Spark AM's log: visit your RM Web UI, click through to the RUNNING Spark application, and click the "logs" link for its Application Master. It may show what the application is stuck on, whether it has yet to spawn an executor or something else is wrong.
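If you prefer a quick check from a script before digging through the logs, the ResourceManager also exposes cluster metrics over its REST API at /ws/v1/cluster/metrics. The following is only a rough sketch for Python 3: the helper names `fetch_metrics` and `summarize_cluster` are made up for illustration, and the RM address in the usage comment (port 8088 is the usual default) is an assumption you would need to adjust for your cluster.

```python
import json
from urllib.request import urlopen


def fetch_metrics(rm_url):
    """Download and parse the ResourceManager's cluster metrics (hypothetical helper)."""
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_cluster(metrics):
    """Return a list of hints explaining why containers might not be allocated."""
    m = metrics["clusterMetrics"]
    hints = []
    if m.get("activeNodes", 0) == 0:
        hints.append("no active NodeManagers, so nothing can host a container")
    if m.get("appsPending", 0) > 0:
        hints.append("%d application(s) still waiting for resources" % m["appsPending"])
    if m.get("availableMB", 0) <= 0:
        hints.append("no memory available for new containers")
    return hints


# Example usage (assumed RM address, replace with your own):
#   for hint in summarize_cluster(fetch_metrics("http://rm-host:8088")):
#       print(hint)
```

An empty list does not prove the cluster is healthy; it only means none of these three obvious symptoms showed up, so the Application Master log is still worth reading.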
Created ‎12-14-2015 09:20 AM
I found out that my YARN deployment was broken because I didn't add more NodeManagers after I manually added the new hosts. Oops!
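For anyone hitting the same thing, one way to spot hosts that lack a running NodeManager is to compare the hosts you expect against the ResourceManager's /ws/v1/cluster/nodes REST endpoint. This is a minimal sketch in Python 3, not something taken from my setup: `missing_nodemanagers` is a hypothetical helper, and the host names and RM address in the usage comment are placeholders.

```python
import json
from urllib.request import urlopen


def missing_nodemanagers(nodes_payload, expected_hosts):
    """Return the expected hosts that have no NodeManager in RUNNING state.

    nodes_payload is the parsed JSON from /ws/v1/cluster/nodes.
    """
    # "nodes" can be null when no NodeManager has registered at all.
    nodes = (nodes_payload.get("nodes") or {}).get("node", [])
    running = {n["nodeHostName"] for n in nodes if n.get("state") == "RUNNING"}
    return sorted(h for h in expected_hosts if h not in running)


# Example usage (placeholder RM address and host names):
#   with urlopen("http://rm-host:8088/ws/v1/cluster/nodes", timeout=10) as resp:
#       payload = json.loads(resp.read().decode("utf-8"))
#   print(missing_nodemanagers(payload, ["host1", "host2", "host3"]))
```

Any host this returns either has no NodeManager role assigned or has one that is not currently running.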
