Created 03-16-2017 07:43 PM
My Scenario
I would like to expose a Java microservice (a Spring Boot application) that eventually runs a spark-submit to yield the required results, typically as an on-demand service.
I have been allotted 2 data nodes and 1 edge node for development, and the microservices are deployed on the edge node. When I tried yarn-cluster, I got the exception: 'Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.'
What would be the ideal way to deal with this? Since the service is on demand, I cannot rely on YARN client mode either, as that would require a second main class on top of the one already taken by the Spring Boot starter.
Code here:
MicroServiceController.java:
@RequestMapping(value = "/transform", method = RequestMethod.POST,
        consumes = MediaType.APPLICATION_JSON_VALUE,
        produces = MediaType.APPLICATION_JSON_VALUE)
public String initiateTransformation(@RequestBody TransformationRequestVO requestVO) {
    PublicationProcessor.run();
    return "SUCCESS";
}
PublicationProcessor.java:
public static void run() {
    try {
        SparkConf sC = new SparkConf()
                .setAppName("NPUB_TRANSFORMATION_US")
                .setMaster("yarn-cluster")
                .set("spark.executor.instances", PropertyBundle.getConfigurationValue("spark.executor.instances"))
                .set("spark.executor.cores", PropertyBundle.getConfigurationValue("spark.executor.cores"))
                .set("spark.driver.memory", PropertyBundle.getConfigurationValue("spark.driver.memory"))
                .set("spark.executor.memory", PropertyBundle.getConfigurationValue("spark.executor.memory"))
                .set("spark.driver.maxResultSize", PropertyBundle.getConfigurationValue("spark.driver.maxResultSize"))
                .set("spark.network.timeout", PropertyBundle.getConfigurationValue("spark.network.timeout"));
        JavaSparkContext jSC = new JavaSparkContext(sC);
        sqlContext = new SQLContext(jSC);
        processTransformation();
    } catch (Exception e) {
        System.out.println("REQUEST ABORTED..." + e.getMessage());
    }
}
Created 03-16-2017 09:09 PM
@Faisal R Ahamed, you should use spark-submit to run this application. When submitting, specify --master yarn and --deploy-mode cluster. Setting the master in SparkConf at runtime is too late to switch to yarn-cluster mode: in cluster mode the driver itself must be launched by YARN, which only spark-submit (or the launcher API) can arrange.
spark-submit --class <classname> --master yarn --deploy-mode cluster <jars> <args>
https://www.mail-archive.com/user@spark.apache.org/msg57869.html
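If the job must be kicked off from the Spring Boot endpoint itself rather than from a shell, one option (not from the original reply) is Spark's SparkLauncher API, which forks a spark-submit process under the hood. A minimal sketch, assuming the spark-launcher dependency is on the classpath and SPARK_HOME is set on the edge node; the jar path and main class below are placeholders:

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class TransformationJobLauncher {
    public static SparkAppHandle launch() throws Exception {
        // Forks a spark-submit process; the driver then runs inside YARN,
        // so the Spring Boot JVM keeps its single main class untouched.
        return new SparkLauncher()
                .setAppResource("/path/to/transformation-job.jar")    // placeholder jar
                .setMainClass("com.example.PublicationProcessorMain") // placeholder main class
                .setMaster("yarn")
                .setDeployMode("cluster")
                .setConf("spark.executor.instances", "2")
                .startApplication();
    }
}

The returned SparkAppHandle can be polled for state, so the REST endpoint could return immediately and report job completion later instead of blocking on the transformation.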
Created 03-22-2017 06:31 AM
The master/deploy-mode combinations:

local[*]      -> new SparkConf().setMaster("local[2]")
yarn-client   -> --master yarn --deploy-mode client
yarn-cluster  -> --master yarn --deploy-mode cluster
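Of these three, only local mode runs without spark-submit, since the executors live inside the calling JVM. A minimal sketch in the style of the original code, which may be useful for testing the transformation on the edge node:

// Local-mode variant of the SparkConf from the question; no YARN involved.
SparkConf conf = new SparkConf()
        .setAppName("NPUB_TRANSFORMATION_US")
        .setMaster("local[2]"); // two worker threads in the current JVM
JavaSparkContext jsc = new JavaSparkContext(conf);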