
Connect Eclipse on Windows to CDS 2.3.2 Spark

Hi, I am connecting Eclipse on Windows (via winutils) to CDS 2.3.2 on CDH 5.15.
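
(For context, the winutils side of this is only an environment pointer: hadoop.home.dir must name a directory whose bin folder contains winutils.exe. A minimal sketch, with C:\hadoop as an assumed location rather than my actual layout:)

// Winutils wiring on the Windows client (sketch; the path is an assumption).
// hadoop.home.dir must point at a directory containing bin\winutils.exe,
// otherwise Hadoop's Shell class fails to locate the executable.
System.setProperty("hadoop.home.dir", "C:\\hadoop");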

The default Spark 1.6.0 service is stopped.

The code calling Spark is as follows:

public void testSession() {

    System.setProperty("SPARK_YARN_MODE", "true");

    SparkConf sparkConfiguration = new SparkConf();
    sparkConfiguration.setMaster("yarn-client");
    sparkConfiguration.setAppName("test-spark-job");
    sparkConfiguration.setJars(new String[] { "C:\\Work\\workspaces\\SparkJvGradlePOC\\build\\libs\\SparkJvGradlePOC-1.0.jar" });

    // Cluster endpoints, passed through to the Hadoop configuration.
    sparkConfiguration.set("spark.hadoop.fs.defaultFS", "hdfs://whf00aql");
    sparkConfiguration.set("spark.hadoop.dfs.nameservices", "whf00aql:8020");
    sparkConfiguration.set("spark.hadoop.yarn.resourcemanager.hostname", "whf00aql");
    sparkConfiguration.set("spark.hadoop.yarn.resourcemanager.address", "whf00aql:8032");
    sparkConfiguration.set("spark.hadoop.yarn.application.classpath",
            "$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,"
                    + "$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,"
                    + "$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,"
                    + "$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*");

    SparkContext sparkContext = new SparkContext(sparkConfiguration);
    JavaSparkContext javaSparkContext = new JavaSparkContext(sparkContext);

    String str = "Session OK";
    try {
        SparkSession sp = SessionSingleton.getSession("TestSession");
    } catch (Throwable e) {
        str = "Session failed";
    }

    assertEquals("Session OK", str);
}
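
(A note on the master URL: in Spark 2.x the "yarn-client" master string is deprecated; the documented form is master "yarn" with the deploy mode set separately. A minimal sketch of the equivalent setup, with nothing else changed:)

// Spark 2.x style master/deploy-mode split (sketch).
// Replaces the deprecated 1.x-era "yarn-client" master string.
SparkConf sparkConfiguration = new SparkConf();
sparkConfiguration.setMaster("yarn");
sparkConfiguration.set("spark.submit.deployMode", "client");
sparkConfiguration.setAppName("test-spark-job");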

----
public class SessionSingleton {

    private static SparkSession sp = null;

    // Lazily creates a single SparkSession; getOrCreate() reuses the
    // SparkContext already started by the caller.
    public static SparkSession getSession(String sessionCode) {
        if (sp == null) {
            System.out.println("creating sparksession");
            sp = SparkSession
                    .builder()
                    .appName(sessionCode)
                    // .config("spark.some.config.option", "some-value")
                    // .master("use spark-submit")
                    .enableHiveSupport()
                    .config("spark.sql.warehouse.dir", "target/spark-warehouse")
                    .getOrCreate();
        }
        return sp;
    }

    public static void main(String[] args) {
        try {
            SparkSession sp = SessionSingleton.getSession("TestSession");
        } catch (Throwable e) {
            System.out.println("Failed creating sparksession " + e.getMessage());
            System.out.println("Failed creating sparksession " + e.toString());
            System.out.println("Failed creating sparksession " + e.getCause());
        }
    }

}

The shuffle jar has been distributed to all hosts.

The shuffle configuration is as follows:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>whf00aql.in.oracle.com</value>
</property>
<property>
  <name>spark.yarn.shuffle.stopOnFailure</name>
  <value>false</value>
</property>
<property>
  <name>spark.shuffle.service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2/yarn</value>
</property>
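
(For reference, my understanding is that the suffix in the yarn.nodemanager.aux-services.<name>.class and .classpath keys has to match a service name listed under yarn.nodemanager.aux-services. A sketch of a self-consistent declaration for a spark2_shuffle service, where the service name is an assumption based on the classpath entry above:)

<!-- Sketch of a naming-consistent aux-service block (service name assumed). -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark2_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2/yarn</value>
</property>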

The error is:

MemoryStore started with capacity 873.0 MB
18/11/15 15:45:47 INFO SparkEnv: Registering OutputCommitCoordinator
18/11/15 15:45:47 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/11/15 15:45:48 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://PARAY-IN.in.oracle.com:4040
18/11/15 15:45:48 INFO SparkContext: Added JAR C:\Work\workspaces\SparkJvGradlePOC\build\libs\SparkJvGradlePOC-1.0.jar at spark://PARAY-IN.in.oracle.com:51074/jars/SparkJvGradlePOC-1.0.jar with timestamp 1542276948132
18/11/15 15:45:49 INFO RMProxy: Connecting to ResourceManager at whf00aql/10.184.155.224:8032
18/11/15 15:45:50 INFO Client: Requesting a new application from cluster with 3 NodeManagers
18/11/15 15:45:50 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2312 MB per container)
18/11/15 15:45:50 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
18/11/15 15:45:50 INFO Client: Setting up container launch context for our AM
18/11/15 15:45:50 INFO Client: Setting up the launch environment for our AM container
18/11/15 15:45:50 INFO Client: Preparing resources for our AM container
18/11/15 15:45:50 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
18/11/15 15:45:50 INFO Client: Uploading resource file:/C:/Users/PARAY/AppD

It gives a cryptic message about a missing file, but I am not sure which file.
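
(The WARN line above about spark.yarn.jars / spark.yarn.archive means the client falls back to zipping and uploading everything under SPARK_HOME from the Windows machine, which is where the upload of file:/C:/Users/... starts. A hedged alternative is to point the client at jars already in HDFS; the path below is an assumption, not something that exists on my cluster:)

// Sketch: serve the Spark 2 runtime jars from HDFS instead of uploading
// them from the Windows client. The HDFS directory is an assumed location
// that would first have to be populated with the CDS 2.3.2 jars.
sparkConfiguration.set("spark.yarn.jars", "hdfs://whf00aql/user/spark/spark2-jars/*.jar");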

I can see my code is triggered in the second stack trace. Please help.
