When I run spark-shell, it takes about 5 minutes to reach the CLI prompt.
There are no errors. I tuned my YARN memory settings, but it still takes 5 minutes every time.
The cluster is a small setup of about 10 nodes.
I didn't encounter this on a smaller cluster (3 nodes) or on a bigger cluster (40 nodes).
Does anyone have an idea? Any help is really appreciated.
@muslihuddin There might be network latency between nodes. You may want to enable DEBUG logging for Spark and then check the logs. Try using iperf, ping, etc. to see whether there is low bandwidth or high latency between the edge node and the cluster.
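As a rough complement to ping and iperf, the sketch below times TCP connections from the edge node to a few cluster services. It is only an illustration: the host names and ports in the example call are hypothetical placeholders, and `tcp_connect_ms` and `report` are helper names made up for this example, not part of Spark or Hadoop.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=3.0):
    """Return the time in milliseconds needed to open a TCP connection.

    Returns None if the name cannot be resolved or the connection fails.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return None


def report(targets):
    """Measure each (host, port) pair and return (host, port, ms_or_None) tuples."""
    return [(host, port, tcp_connect_ms(host, port)) for host, port in targets]


# Example call with hypothetical hosts and ports; replace with your own:
# for host, port, ms in report([("namenode.example.com", 8020),
#                               ("resourcemanager.example.com", 8032)]):
#     print(host, port, "failed" if ms is None else f"{ms:.1f} ms")
```

The measured time includes name resolution, so slow DNS lookups would show up here as well as slow links.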
Hi @GangWar, thanks for the suggestion. I tried pinging between the nodes and did not see any abnormal latency.
But it looks like the loading time was solved by changing my Java version.
It now loads pretty quickly.
By the way, the same thing happened when I ran a simple HDFS command such as
hdfs dfs -ls / . It took quite some time to list the directory.
But since I changed the Java version, it looks fine. Do you think this is the real root cause?
@muslihuddin It could be, but I can't confirm it since we don't have any insight into what was happening. Digging through the old logs may give some clues. I haven't seen this before.
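One way to check whether the Java version is really the cause is to time the same command under each JDK and compare. The sketch below is a minimal illustration under some assumptions: `time_command` is a made-up helper, the JDK paths in the example call are hypothetical, and it assumes the hdfs launcher picks up JAVA_HOME from the environment (a value set in hadoop-env.sh may override it).

```python
import os
import subprocess
import time


def time_command(cmd, java_home=None):
    """Run cmd and return (elapsed_seconds, exit_code).

    If java_home is given, JAVA_HOME is set to it for the child process only.
    """
    env = os.environ.copy()
    if java_home:
        env["JAVA_HOME"] = java_home
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.DEVNULL,  # discard the listing, we only want timing
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


# Example comparison with hypothetical JDK locations; replace with your own:
# for jdk in ["/usr/lib/jvm/jdk-old", "/usr/lib/jvm/jdk-new"]:
#     seconds, code = time_command(["hdfs", "dfs", "-ls", "/"], java_home=jdk)
#     print(jdk, f"{seconds:.1f} s", "exit", code)
```

Running each case several times helps separate a consistent difference from one-off noise such as a cold cache.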