
Spark strange behavior: I'm executing a query and it's working, but the info on the Spark master:8080 page doesn't update

Rising Star

I'm executing a query on Spark and it is working; I'm getting the result. I did not configure any cluster, so Spark should be using its own cluster manager.

But on the Spark page at master:8080 I see this:

Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

And while I'm executing the query I get the same output as I refresh the page:

Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

And after the query finishes it is the same again... Do you know why? It's very strange; it seems that Spark is executing the query without using any hardware, which is not possible. Why is this info not updating?

1 ACCEPTED SOLUTION

Super Guru

@John Cod

How are you submitting the job? If you are not specifying --master "spark://masterip:7077" when running spark-shell, it will run in local mode.
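For example (a sketch with a placeholder host; substitute your actual master hostname and application details):

# connect the interactive shell to the standalone master instead of local mode
spark-shell --master spark://masterip:7077

# the same flag applies when submitting a packaged application
spark-submit --master spark://masterip:7077 --class com.example.MyApp myapp.jar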


12 REPLIES


Rising Star

Hi, I'm executing the job in the shell. To start the shell I use the command "spark-shell". So I need to use spark-shell --master?

Super Guru

@John Cod

Yes, you need to specify the Spark master URI:

spark-shell --master spark://masterhost:7077
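Once connected, you can sanity-check which master the shell is using from inside the REPL (sc is the SparkContext that spark-shell creates for you):

scala> sc.master
res0: String = spark://masterhost:7077

If it still prints local[*], the shell did not pick up the flag. While the shell is attached, the master:8080 page should list it under Running Applications.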

Rising Star

Thanks, but now I'm getting this error when I try to execute a query:

16/05/25 12:15:15 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@5547fcb1)

And this warning:

16/05/25 12:15:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Do you know why?

Rising Star

The warning appears when the query starts executing in stage 0, and then the error appears.

Super Guru

@John Cod

Can you please share a screenshot of the http://master:8080 UI and the command you ran, along with the full spark-shell logs?

Rising Star

I decreased the memory in spark-env.sh and now it seems to be working, thanks!
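In case it helps anyone else hitting that warning: the idea is to keep what the application asks for below what the workers actually offer. The values below are illustrative only, not the ones from this cluster. In conf/spark-env.sh on each worker:

# total memory this worker offers to executors; must fit in physical RAM
SPARK_WORKER_MEMORY=3g

Or per application, without touching spark-env.sh:

spark-shell --master spark://masterip:7077 --executor-memory 2g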

Rising Star

I just saw your comment now, but I think it's working fine now; it seems that I was setting more memory than was available.


Actually, if you don't specify local mode (--master "local"), then you will be running in Standalone mode, described here:

  • Standalone mode: By default, applications submitted to the standalone mode cluster will run in FIFO (first-in-first-out) order, and each application will try to use all available nodes. You can limit the number of nodes an application uses by setting the spark.cores.max configuration property in it, or change the default for applications that don't set this setting through spark.deploy.defaultCores. Finally, in addition to controlling cores, each application's spark.executor.memory setting controls its memory use.
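As a concrete illustration of those properties (the numbers are made up for the example), you could cap a single application in conf/spark-defaults.conf:

spark.cores.max        2
spark.executor.memory  2g

or equivalently when launching the shell:

spark-shell --master spark://masterip:7077 --conf spark.cores.max=2 --executor-memory 2g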

Also, I think you have the wrong port for the monitoring web interface; try port 4040 instead of 8080, like this:

http://<driver-node>:4040