Loading spark shell takes like 6 mins from my configuration, why?

New Contributor

Hello,

Does anyone know what makes Spark take so long to load its shell? The spark-shell command takes about 6 minutes to bring up the shell for me, which I assume isn't normal. I'm running it on a Hadoop cluster made of four Raspberry Pi 4 boards (4 GB memory model). The spark-shell startup output follows:

Java HotSpot(TM) Server VM warning: You have loaded library /opt/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2020-07-10 09:05:07,903 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2020-07-10 09:05:47,663 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Spark context Web UI available at http://pi1:4040
Spark context available as 'sc' (master = yarn, app id = application_1594337770867_0003).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) Server VM, Java 1.8.0_251)
Type in expressions to have them evaluated.
Type :help for more information.

Any suggestions or help would be appreciated!

3 REPLIES

Cloudera Employee

Hi,

Is this a new cluster or configuration setup? How did you determine that spark-shell was taking 6 minutes to load?

Thanks

AKR

Rising Star
Hey,

Are there any parameters used in the spark-shell command?

A delay like this can have many causes, from connection time to resource availability. However, we cannot confirm anything from the driver logs alone.

To narrow this down, could you share the YARN logs for this application, collected with the command "yarn logs -applicationId application_1594337770867_0003"? That will give us more clarity on what was happening during the delay.
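
As a rough sketch of how the delay could be located, the snippet below measures the gap between two timestamped log lines and wraps the log-collection command in a function. It assumes a Linux shell with bash and GNU date, and the yarn CLI from Hadoop on the PATH. The names gap_seconds, collect_yarn_logs and the output file name are illustrative only, not part of Spark or Hadoop.

```shell
#!/usr/bin/env bash

# gap_seconds (hypothetical helper): seconds elapsed between two log
# timestamps of the form "YYYY-MM-DD HH:MM:SS,mmm". Milliseconds are dropped.
gap_seconds() {
  local start end
  start=$(date -u -d "${1%,*}" +%s)   # assumes GNU date (-d option)
  end=$(date -u -d "${2%,*}" +%s)
  echo $(( end - start ))
}

# collect_yarn_logs (hypothetical helper): save the aggregated YARN logs
# for one application ID to a file. Not called here; run it on the cluster.
collect_yarn_logs() {
  yarn logs -applicationId "$1" > "$1.log"
}

# Gap between the two timestamped WARN lines in the driver output of the
# original post (NativeCodeLoader, then yarn.Client).
gap_seconds "2020-07-10 09:05:07,903" "2020-07-10 09:05:47,663"
```

On the cluster, collect_yarn_logs application_1594337770867_0003 would write the logs to application_1594337770867_0003.log. Applying gap_seconds to consecutive timestamps in that file is one way to see which step accounts for most of the wait. The driver output above only shows two timestamps, so the remaining time has to be found in the YARN logs.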

Thanks

Rising Star

Hi @jake_allston, did you find the culprit? I've got a similar issue.

It took 5 minutes to load spark-shell. It's a new cluster. This does not happen on my other cluster.