
Spark strange behavior: I'm executing a query and it's working, but the info on the Spark page master:8080 doesn't update

Rising Star

I'm executing a query on Spark and it is working; I'm getting the result. I did not configure any cluster, so Spark should be using its own cluster manager.

But on the Spark page master:8080 I get this:

Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

But while I'm executing the query, I get the same result when I refresh the page:

Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

And after the query finishes, it is the same again. Do you know why? It's very strange: it seems that Spark is executing the query without using any hardware, which is not possible. So why is this info not updating?

1 ACCEPTED SOLUTION

Super Guru

@John Cod

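How are you submitting the job? If you are not specifying --master "spark://masterip:7077" when running spark-shell, it will run in local mode. In local mode the query runs inside the shell's own JVM and never registers with the standalone master, so the page at master:8080 shows no running application.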
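To make the point above concrete, here is a rough Python sketch of how the master ends up being chosen for a given command line. The helper name `master_from_command` is made up for illustration and is not part of Spark. It assumes the usual order of an explicit --master flag first, then a `spark.master` line in spark-defaults.conf, then the `local[*]` fallback, and it ignores environment variables and any master set in application code.

```python
import shlex

# Fallback used when nothing else sets a master (assumed default).
DEFAULT_MASTER = "local[*]"


def master_from_command(command, defaults_conf_text=""):
    """Guess which master URL a spark-shell / spark-submit command would use.

    Hypothetical helper for illustration only, not part of Spark.
    """
    args = shlex.split(command)
    for i, arg in enumerate(args):
        # Form 1: --master <url>
        if arg == "--master" and i + 1 < len(args):
            return args[i + 1]
        # Form 2: --master=<url>
        if arg.startswith("--master="):
            return arg.split("=", 1)[1]

    # No flag given: look for a "spark.master <url>" line in the
    # (whitespace-separated) spark-defaults.conf contents.
    for line in defaults_conf_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] == "spark.master":
            return parts[1].strip()

    # Nothing configured anywhere: local mode.
    return DEFAULT_MASTER


print(master_from_command("spark-shell"))
print(master_from_command('spark-shell --master "spark://masterip:7077"'))
```

Inside a running spark-shell you can also just evaluate `sc.master` to see which master the shell is actually attached to.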


12 REPLIES

Rising Star

Hi, thanks for your answer, but I don't understand. I think the answer that I accepted fixed the issue, because after starting spark-shell with spark-shell --master spark://masterhost:7077, on port 8080 I get:

  • Cores in use: 4 Total, 4 Used
  • Memory in use: 4.0 GB Total, 2.0 GB Used
  • Applications: 1 Running, 0 Completed
  • Drivers: 0 Running, 0 Completed
  • Status: ALIVE

So it seems that it is already working when starting spark-shell that way, right? But you are suggesting that it should be spark-shell --master "local" spark://masterhost:7077?

Master Guru

Was there anything on the Spark history server or in the logs?


Spark supports the following cluster modes:

  1. Pseudo-cluster (local) mode: everything runs on one node; for debugging and developing Spark applications
  2. Standalone: Spark provides cluster manager facilities
  3. Spark on YARN : YARN provides Cluster manager facilities.
    1. yarn-client mode: the Spark driver runs outside YARN
    2. yarn-cluster mode: the Spark driver also runs inside YARN
  4. Spark on Mesos : Mesos provides Cluster manager facilities

We don't support Spark on Mesos.

For Spark on YARN, specify the mode by adding --master yarn-client or --master yarn-cluster to your spark-submit command on a per-job basis, or configure it in spark-defaults.conf for all jobs submitted from that node.

--master "spark://masterip:7077" indicates Spark standalone mode.
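As a quick way to read the list above, the following Python sketch maps a --master value to the mode it selects. The function name `describe_master` and the short labels it returns are invented for this example. It only covers the master URL forms mentioned in this thread plus `local[N]` and plain `yarn`, which newer Spark releases pair with --deploy-mode instead of yarn-client / yarn-cluster.

```python
def describe_master(master):
    """Map a --master value to the cluster mode it selects.

    Hypothetical helper for illustration; the labels are not Spark terms.
    """
    # "local", "local[4]", "local[*]": everything runs in one JVM.
    if master == "local" or master.startswith("local["):
        return "local"
    # spark://host:port: Spark's own standalone cluster manager.
    if master.startswith("spark://"):
        return "standalone"
    # YARN acts as the cluster manager.
    if master in ("yarn", "yarn-client", "yarn-cluster"):
        return "yarn"
    # mesos://host:port: Mesos acts as the cluster manager.
    if master.startswith("mesos://"):
        return "mesos"
    return "unknown"


for value in ["local[*]", "spark://masterip:7077", "yarn-client", "mesos://host:5050"]:
    print(value, "->", describe_master(value))
```

Only the "standalone" case registers the application with the standalone master, which is why only that case changes the numbers shown on master:8080.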