I have set up an Apache Spark standalone cluster on two Ubuntu 14.04 VMs: one VM is the Master and the other is a Worker, and the two are connected with passwordless SSH as described here.

From the Master, I started both the master and the worker with the following command from the Spark home directory:

sbin/start-all.sh

Then I ran jps on both the Master and the Worker VMs. The Master and Worker appear to be running properly, and no error shows up in the Web UI. But when I try to run an application with:

spark-1.6.0/bin/spark-submit spark.py

it prints this WARN message in the console:

TaskSchedulerImpl : Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

When I create the worker on the same server where the master runs, the application executes successfully. The WARN message appears only when the Master and Worker are on separate machines. Both VMs are memory-optimized EC2 instances (r3.xlarge).

My test application is the following:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
conf = SparkConf()
sc = SparkContext(conf=conf)
SQLCtx = SQLContext(sc)
list_of_list = sc.textFile("ver1_sample.csv").map(lambda line: line.split(",")).collect()
print("type_of_list_of_list===========", type(list_of_list), list_of_list)

We have the following ports open on both servers:

8080 - Master Web UI port
7077 - port for connecting Master and Workers
8888 - executor port
8787 - driver port
8081
4041-4090 - JVM ports

Any suggestion regarding this WARN message?
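For reference, a submit command with the master URL and resource limits spelled out would look like the sketch below; the `<master-ip>` placeholder and the memory/core values are assumptions, not taken from my actual setup:

```shell
# Submit to the standalone master explicitly; without --master,
# spark-submit defaults to running locally on the driver machine.
spark-1.6.0/bin/spark-submit \
  --master spark://<master-ip>:7077 \
  --executor-memory 1g \
  --total-executor-cores 2 \
  spark.py
```

The `--executor-memory` and `--total-executor-cores` caps are there because the WARN in question is also reported when an application requests more memory or cores than the registered workers can offer.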