spark-shell takes too long to load (about 5 minutes)

Rising Star

Hi,

When I run spark-shell, it takes about 5 minutes to reach the CLI prompt.

 

[Attached screenshot: spark loading.JPG]

There are no errors. I have tuned my YARN memory settings, but it still takes 5 minutes every time.

The cluster is a small setup of about 10 nodes.

I didn't encounter this on a smaller cluster (3 nodes) or on a bigger cluster (40 nodes).

 

Does anyone have an idea? Any help is really appreciated.


Master Guru

@muslihuddin There might be network latency between the nodes. You may want to enable debug logging for Spark and then check the logs. Try using iperf, ping, etc. to see whether there is low bandwidth or high latency between the edge node and the cluster.
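If ICMP ping is blocked or you want to include name resolution in the measurement, a rough alternative is to time TCP connections from the edge node to a cluster service. The following is only a minimal Python sketch, not a replacement for iperf or ping. The function names are made up for illustration, and the host and port you pass in are your own assumptions about the cluster.

```python
import socket
import time


def tcp_connect_latency(host, port, attempts=5, timeout=3.0):
    """Return a list of TCP connect times in milliseconds for host:port.

    Each sample includes name resolution when `host` is a hostname,
    because socket.create_connection resolves the name before connecting.
    """
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Open the connection and close it again right away.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Reduce a list of millisecond samples to min, average, and max."""
    return {
        "min_ms": min(samples),
        "avg_ms": sum(samples) / len(samples),
        "max_ms": max(samples),
    }
```

For example, `summarize(tcp_connect_latency("namenode.example.internal", 8020))` would report connect times to a hypothetical NameNode host and port; substitute the real values from your cluster. Connect times that are consistently in the hundreds of milliseconds or more would point towards the network or DNS rather than Spark itself.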

 


Cheers!
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Rising Star

Hi @GangWar, thanks for the suggestion. I tried pinging between the nodes and did not see any abnormal latency.

 

But it looks like the loading time was solved by changing my Java version.

It now loads pretty quickly.

 

By the way, the same thing happened when I ran a simple HDFS command such as

hdfs dfs -ls /

It took quite some time to list the directory.

 

But once the Java version was changed, it now looks fine. Do you think this is the real root cause?
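One way to test that suspicion is to time the same command under each Java installation and compare. Below is a minimal Python sketch of such a timing helper. The name `time_command` is made up for illustration, and the JAVA_HOME paths in the usage note are placeholders. It also assumes the hdfs launcher picks up JAVA_HOME from the environment, which a setting in hadoop-env.sh may override.

```python
import subprocess
import time


def time_command(cmd, runs=3, env=None):
    """Run `cmd` (a list of arguments) `runs` times.

    Returns the wall-clock duration of each run in seconds.
    `env` is an optional environment mapping passed to the child process;
    when it is None the child inherits the current environment.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            env=env,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

For example, calling `time_command(["hdfs", "dfs", "-ls", "/"], env={**os.environ, "JAVA_HOME": "/path/to/old/jdk"})` and then again with the new JDK path (with `import os` first) gives two sets of timings to compare. If the slow timings follow the old JDK across several runs, that supports the Java version being the cause, although it still would not explain why.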

Master Guru

@muslihuddin It could be, but I can't confirm it since we don't have any insight into what happened. Digging through the old logs may give some clue. I haven't seen this before.

