Spark strange behavior: I'm executing a query and it's working, but the info on the Spark master:8080 page doesn't update
- Labels: Apache Spark
Created ‎05-25-2016 09:54 AM
I'm executing a query on Spark and it is working; I'm getting the result. I did not configure any cluster, so Spark should be using its own cluster manager.
But on the Spark page master:8080 I see this:
Alive Workers: 2 Cores in use: 4 Total, 0 Used Memory in use: 6.0 GB Total, 0.0 B Used Applications: 0 Running, 0 Completed Drivers: 0 Running, 0 Completed Status: ALIVE
But while the query is executing I get the same output as I refresh the page:
Alive Workers: 2 Cores in use: 4 Total, 0 Used Memory in use: 6.0 GB Total, 0.0 B Used Applications: 0 Running, 0 Completed Drivers: 0 Running, 0 Completed Status: ALIVE
And after the query finishes it is the same again... Do you know why? It's very strange; it looks as if Spark is executing the query without using any hardware, which is not possible. Do you know why this info is not updating?
Created ‎05-25-2016 10:06 AM
How are you submitting the job? If you are not specifying --master "spark://masterip:7077" when running spark-shell, it will run in local mode.
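For illustration, a minimal sketch of the two ways of launching the shell (masterip is a placeholder for the actual master host, not a value from this thread):

# Without --master, spark-shell runs in local mode: everything executes inside the
# driver JVM, so nothing registers with the standalone master and master:8080 never changes.
spark-shell

# With --master pointing at the standalone master, the shell registers as an application
# and shows up under "Running Applications" in the master UI.
spark-shell --master spark://masterip:7077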
Created ‎05-25-2016 10:34 AM
Hi, I'm executing the job in the shell. To start the shell I use the command "spark-shell". So I need to use spark-shell --master?
Created ‎05-25-2016 10:48 AM
Yes, you need to specify the spark master URI.
spark-shell --master spark://masterhost:7077
Created ‎05-25-2016 11:14 AM
Thanks, but now I'm getting this error when I try to execute a query:
16/05/25 12:15:15 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@5547fcb1)
And this warning:
16/05/25 12:15:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Do you know why?
Created ‎05-25-2016 11:18 AM
This warning appears when the query starts executing at stage 0, and then the error appears.
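For context, that warning in standalone mode usually means the application is requesting more executor memory or cores than any registered worker can offer, so no executors are ever launched. A minimal sketch of launching with explicit, smaller requests (the values are illustrative, chosen to fit the 2 workers / 4 cores / 6.0 GB reported earlier in the thread):

# Request resources the workers can actually satisfy.
spark-shell --master spark://masterhost:7077 \
  --executor-memory 1g \
  --total-executor-cores 2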
Created ‎05-25-2016 11:19 AM
Can you please share a screenshot of the http://master:8080 UI and the command you ran, along with the full spark-shell logs?
Created ‎05-25-2016 11:22 AM
I decreased the memory in spark-env.sh and now it seems to be working, thanks!
Created ‎05-25-2016 11:23 AM
I only just saw your comment, but I think it's working fine now; it seems I was setting more memory than was actually available.
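For reference, the worker-side memory knobs live in conf/spark-env.sh; a rough sketch of the kind of setting involved (the exact values used in this thread are not given, so these are examples only):

# conf/spark-env.sh on each worker node; keep these at or below what the machine
# really has free, otherwise executors cannot be scheduled.
SPARK_WORKER_MEMORY=2g   # memory this worker can hand out to executors (example value)
SPARK_WORKER_CORES=2     # cores this worker can hand out (example value)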
Created ‎05-25-2016 01:45 PM
Actually, if you don't specify local mode (--master "local") then you will be running in standalone mode, described here:
- Standalone mode: By default, applications submitted to the standalone mode cluster will run in FIFO (first-in, first-out) order, and each application will try to use all available nodes. You can limit the number of nodes an application uses by setting the spark.cores.max configuration property in it, or change the default for applications that don't set this setting through spark.deploy.defaultCores. Finally, in addition to controlling cores, each application's spark.executor.memory setting controls its memory use (see the example below).
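As a quick example (the values are illustrative, not from this thread), both properties can be passed on the command line when starting the shell:

# Cap this application at 2 cores across the cluster and 1 GB per executor.
spark-shell --master spark://masterhost:7077 \
  --conf spark.cores.max=2 \
  --conf spark.executor.memory=1g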
Also, I think you have the wrong port for the monitoring web interface; try port 4040 instead of 8080, like this:
http://master:4040