Spark-sql command line in cluster mode on Sandbox

Hi, I'm running Sandbox 2.3.1 and trying to execute a simple count on a Hive table. I start the shell with:

spark-sql --master yarn

and then executing in spark-sql's shell:

select count(*) from sample_08

I get this warning:

16/01/13 21:48:02 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

I've tried to follow the ideas in this StackOverflow question. I added the spark.dynamicAllocation.enabled parameter to the custom spark-defaults configuration via Ambari, but running the spark-sql command from bash then only gave me a message like this:

-- you need to specify number of executors

When I ran spark-sql --master yarn --num-executors 2, the response was:

-- you have dynamic Allocation enabled - you cannot put num executors.

Truly a Catch-22.

I've also tried running spark-sql in local mode, and there were no problems.

What do I need to change to finally run spark-sql?

1 ACCEPTED SOLUTION

Expert Contributor

This message appears whenever an application requests more resources from the cluster than the cluster can currently provide. Which resources? Spark looks for only two things: cores and RAM. Cores are the open executor slots that your cluster provides for execution. RAM is the amount of free memory required on any worker running your application. Note that for both of these resources the maximum is not your system's maximum; it is the maximum set by your Spark configuration.

1. Check the current state of your cluster (and its free resources) at SparkMasterIP:7080.

2. Make sure you have not started Spark Shell in two different terminals. The first Spark shell might consume all the available cores in the system, leaving the second shell waiting for resources. Until the first Spark shell is terminated and its resources are released, all other apps will display the above warning.
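Since the question runs with --master yarn, the same two checks can also be made from the command line. This is only a sketch, assuming the standard `yarn` CLI is on the PATH of the sandbox node; the `YARN_CHECKED` variable is a hypothetical helper that records whether the commands ran.

```shell
#!/bin/sh
# Assumption: the `yarn` CLI is installed and on PATH (true on a typical cluster node).
if command -v yarn >/dev/null 2>&1; then
  # Applications currently holding containers, e.g. a forgotten spark-shell
  yarn application -list -appStates RUNNING
  # NodeManagers registered with the ResourceManager
  yarn node -list
  YARN_CHECKED=yes
else
  echo "yarn CLI not found; run this on a cluster node" >&2
  YARN_CHECKED=no
fi
# To release resources held by an idle application, take its ID from the list:
#   yarn application -kill <application-id>
```

If the first command lists another Spark shell in the RUNNING state, that application is the likely holder of the resources the new job is waiting for.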

The short-term solution is to make sure you aren't requesting more resources from your cluster than exist, or to shut down any apps that are unnecessarily using resources. If you need to run multiple Spark apps simultaneously, you'll need to adjust the number of cores used by each app.
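One way to keep a request small is to state it explicitly when launching spark-sql. The sketch below only builds and prints the command. The sizes are illustrative assumptions, not values taken from the sandbox, and the variable names are hypothetical. Because --num-executors was rejected while dynamic allocation was enabled, the command turns dynamic allocation off for this one run, which should avoid that conflict.

```shell
#!/bin/sh
# Illustrative values only; size them to what the cluster actually has free.
NUM_EXECUTORS=1
EXECUTOR_CORES=1
EXECUTOR_MEMORY=512m
DRIVER_MEMORY=512m

# Disable dynamic allocation for this run so that --num-executors is accepted.
SPARK_SQL_CMD="spark-sql --master yarn \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors $NUM_EXECUTORS \
  --executor-cores $EXECUTOR_CORES \
  --executor-memory $EXECUTOR_MEMORY \
  --driver-memory $DRIVER_MEMORY"

# Print the command for review before running it.
echo "$SPARK_SQL_CMD"
# Once the numbers look right, launch it with:
#   eval "$SPARK_SQL_CMD"
```

If the warning persists even with a request this small, the cluster probably has no free containers at all, and stopping other applications is the next thing to try.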
