
Connect Eclipse on Windows to CDS 2.3.2 Spark


Hi, I am connecting Eclipse on Windows (via winutils) to CDS 2.3.2 on CDH 5.15.

The default Spark 1.6.0 service is stopped.
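For context, the winutils wiring happens before the SparkConf is built. A minimal sketch of what I mean (the C:\hadoop path is an assumption; winutils.exe must sit in its bin subfolder):

public static void configureWinutils() {
    // Assumption: winutils.exe was downloaded to C:\hadoop\bin.
    // Hadoop on Windows resolves it as %HADOOP_HOME%\bin\winutils.exe,
    // and hadoop.home.dir overrides HADOOP_HOME for the current JVM.
    System.setProperty("hadoop.home.dir", "C:\\hadoop");
}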

The code calling Spark is as follows:

import static org.junit.Assert.assertEquals;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public void testSession() {
    System.setProperty("SPARK_YARN_MODE", "true");

    SparkConf sparkConfiguration = new SparkConf();
    sparkConfiguration.setMaster("yarn-client");
    sparkConfiguration.setAppName("test-spark-job");
    sparkConfiguration.setJars(new String[] { "C:\\Work\\workspaces\\SparkJvGradlePOC\\build\\libs\\SparkJvGradlePOC-1.0.jar" });

    // Cluster endpoints passed through to the Hadoop configuration.
    sparkConfiguration.set("spark.hadoop.fs.defaultFS", "hdfs://whf00aql");
    sparkConfiguration.set("spark.hadoop.dfs.nameservices", "whf00aql:8020");
    sparkConfiguration.set("spark.hadoop.yarn.resourcemanager.hostname", "whf00aql");
    sparkConfiguration.set("spark.hadoop.yarn.resourcemanager.address", "whf00aql:8032");
    sparkConfiguration.set("spark.hadoop.yarn.application.classpath",
            "$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,"
                    + "$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,"
                    + "$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,"
                    + "$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*");

    SparkContext sparkContext = new SparkContext(sparkConfiguration);
    JavaSparkContext javaSparkContext = new JavaSparkContext(sparkContext);

    String str = "Session Ok";
    try {
        SparkSession sp = SessionSingleton.getSession("TestSession");
    } catch (Throwable e) {
        str = "Session failed";
    }

    assertEquals("Session Ok", str);
}

----
import org.apache.spark.sql.SparkSession;

public class SessionSingleton {

    private static SparkSession sp = null;

    public static SparkSession getSession(String sessionCode) {
        // Lazily create the session on first use, then reuse it.
        if (sp == null) {
            System.out.println("creating sparksession");
            sp = SparkSession
                    .builder()
                    .appName(sessionCode)
                    // .config("spark.some.config.option", "some-value")
                    // .master("use spark-submit")
                    .enableHiveSupport()
                    .config("spark.sql.warehouse.dir", "target/spark-warehouse")
                    .getOrCreate();
        }
        return sp;
    }

    public static void main(String[] args) {
        try {
            SparkSession sp = SessionSingleton.getSession("TestSession");
        } catch (Throwable e) {
            System.out.println("Failed creating sparksession " + e.getMessage());
            System.out.println("Failed creating sparksession " + e.toString());
            System.out.println("Failed creating sparksession " + e.getCause());
        }
    }
}
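For what it's worth, SparkSession.builder().getOrCreate() already returns the existing session (and reuses the existing SparkContext) if one is running, so the null check above is only an optimisation. A minimal equivalent sketch:

// getOrCreate() reuses any already-running SparkSession, so repeated calls
// are safe even without an explicit singleton wrapper.
SparkSession spark = SparkSession.builder()
        .appName("TestSession")
        .enableHiveSupport()
        .config("spark.sql.warehouse.dir", "target/spark-warehouse")
        .getOrCreate();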

The shuffle jar is distributed to all hosts.

The shuffle configuration is as follows:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>whf00aql.in.oracle.com</value>
</property>
<property>
  <name>spark.yarn.shuffle.stopOnFailure</name>
  <value>false</value>
</property>
<property>
  <name>spark.shuffle.service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2/yarn</value>
</property>
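One thing I notice: the aux-service keys above mix three names (spark_shuffle, mapreduce_shuffle, spark2_shuffle). If they are all meant to refer to the one Spark 2 shuffle service, a consistent form (an assumption on my part, reusing the spark2_shuffle name already used for the classpath) would be:

<!-- Hypothetical consistent naming for the Spark 2 shuffle service. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark2_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>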

The error is:

MemoryStore started with capacity 873.0 MB
18/11/15 15:45:47 INFO SparkEnv: Registering OutputCommitCoordinator
18/11/15 15:45:47 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/11/15 15:45:48 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://PARAY-IN.in.oracle.com:4040
18/11/15 15:45:48 INFO SparkContext: Added JAR C:\Work\workspaces\SparkJvGradlePOC\build\libs\SparkJvGradlePOC-1.0.jar at spark://PARAY-IN.in.oracle.com:51074/jars/SparkJvGradlePOC-1.0.jar with timestamp 1542276948132
18/11/15 15:45:49 INFO RMProxy: Connecting to ResourceManager at whf00aql/10.184.155.224:8032
18/11/15 15:45:50 INFO Client: Requesting a new application from cluster with 3 NodeManagers
18/11/15 15:45:50 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2312 MB per container)
18/11/15 15:45:50 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
18/11/15 15:45:50 INFO Client: Setting up container launch context for our AM
18/11/15 15:45:50 INFO Client: Setting up the launch environment for our AM container
18/11/15 15:45:50 INFO Client: Preparing resources for our AM container
18/11/15 15:45:50 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
18/11/15 15:45:50 INFO Client: Uploading resource file:/C:/Users/PARAY/AppD

It gives a cryptic message about a missing file, but I am not sure which file is missing.
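The WARN line about spark.yarn.jars / spark.yarn.archive also means the client re-uploads everything under SPARK_HOME on each run. A sketch of avoiding that (the HDFS path is hypothetical; the Spark 2 jars would need to be staged there first):

// Hypothetical: spark.yarn.jars takes a comma-separated list of paths/globs;
// pointing it at jars pre-staged in HDFS avoids the per-run upload from SPARK_HOME.
sparkConfiguration.set("spark.yarn.jars", "hdfs://whf00aql/user/spark/spark2-jars/*.jar");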

I can see my code is triggered in the second stack trace. Please help.
