New Contributor
Posts: 5
Registered: ‎08-28-2018
Accepted Solution

SparkListener in Spark on YARN-CLUSTER not working?

My main goal is to get the appId after submitting a yarn-cluster job through Java code, so that I can use it for further business operations.

 

I register the listener by adding "--conf=spark.extraListeners=Mylistener" to the submit arguments.

 

While SparkListener does work when I use Spark in standalone mode, it doesn't work when I run Spark on a cluster over YARN. Is it possible for SparkListener to work when running over YARN? If so, what steps should I take to enable that?

 

Here is the Mylistener class code:

    import org.apache.spark.scheduler.SparkListener;
    import org.apache.spark.scheduler.SparkListenerApplicationStart;
    import org.apache.spark.scheduler.SparkListenerBlockManagerAdded;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import scala.Option;

    public class Mylistener extends SparkListener {

        private static final Logger logger = LoggerFactory.getLogger(Mylistener.class);

        @Override
        public void onApplicationStart(SparkListenerApplicationStart sparkListenerApplicationStart) {
            // Capture the application ID when the application starts.
            Option<String> appId = sparkListenerApplicationStart.appId();
            EnvelopeSubmit.appId = appId.get();
            logger.info("====================start");
        }

        @Override
        public void onBlockManagerAdded(SparkListenerBlockManagerAdded blockManagerAdded) {
            logger.info("=====================add");
        }
    }

Here is the Main class to submit the application:

    import org.apache.spark.deploy.SparkSubmit;

    public static void main(String[] args) {
        String jarpath = args[0];
        String childArg = args[1];
        System.out.println("jarpath:" + jarpath);
        System.out.println("childArg:" + childArg);
        System.setProperty("HADOOP_USER_NAME", "hdfs");
        // Same arguments spark-submit would receive on the command line
        String[] arg = {"--verbose=true", "--class=com.cloudera.labs.envelope.EnvelopeMain",
                "--master=yarn", "--deploy-mode=cluster",
                "--conf=spark.extraListeners=Mylistener",
                "--conf", "spark.eventLog.enabled=true",
                "--conf", "spark.yarn.jars=hdfs://192.168.6.188:8020/user/hdfs/lib/*",
                jarpath, childArg};
        SparkSubmit.main(arg);
    }

 

Cloudera Employee
Posts: 32
Registered: ‎08-26-2015

Re: SparkListener in Spark on YARN-CLUSTER not working?

I would recommend using SparkLauncher to submit your Envelope application to the cluster. It has a more structured API for configuring the application, and when you submit it, it returns a SparkAppHandle that has a method for retrieving the app ID.
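
For reference, a minimal sketch of what that could look like (assuming the spark-launcher artifact is on the classpath and SPARK_HOME points at a Spark installation; the jar path, main class, and app argument are placeholders taken from this thread):

    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class LauncherExample {
        public static void main(String[] args) throws Exception {
            SparkAppHandle handle = new SparkLauncher()
                    .setAppResource("build/envelope-full/target/envelope-full-0.5.0.jar") // placeholder jar path
                    .setMainClass("com.cloudera.labs.envelope.EnvelopeMain")
                    .setMaster("yarn")
                    .setDeployMode("cluster")
                    .setConf("spark.eventLog.enabled", "true")
                    .addAppArgs("hdfs://fj-c7-188.linewell.com:8020/user/hdfs/test.conf") // placeholder config path
                    .startApplication();

            // The app ID only becomes available once YARN has accepted the application,
            // so poll until it shows up (or the application reaches a final state).
            while (handle.getAppId() == null && !handle.getState().isFinal()) {
                Thread.sleep(1000);
            }
            System.out.println("appId: " + handle.getAppId());
        }
    }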

New Contributor
Posts: 5
Registered: ‎08-28-2018

Re: SparkListener in Spark on YARN-CLUSTER not working?

I want to submit my Envelope application to the cluster via YARN. Right now I use org.apache.spark.deploy.yarn.Client to submit directly. If you have any better idea, please tell me.
Thanks.

  

    import org.apache.hadoop.conf.Configuration;
    import org.apache.spark.SparkConf;
    import org.apache.spark.deploy.yarn.Client;
    import org.apache.spark.deploy.yarn.ClientArguments;

    public static void main(String[] s) throws Exception {
        String[] args = new String[]{
                "--jar", "build\\envelope-full\\target\\envelope-full-0.5.0.jar",
                "--class", "com.cloudera.labs.envelope.EnvelopeMain",
                "--arg", "hdfs://fj-c7-188.linewell.com:8020/user/hdfs/test.conf"
        };
        // loadConfigFiles/HADOOP_SITE_FILES are my own helpers for loading the Hadoop *-site.xml configs
        Configuration config = loadConfigFiles(HADOOP_SITE_FILES);
        System.setProperty("HADOOP_USER_NAME", "hdfs");
        System.setProperty("SPARK_YARN_MODE", "true");
        System.setProperty("hdp.version", "2.6.1.0-129");
        ClientArguments carg = new ClientArguments(args);
        SparkConf sparkConf = new SparkConf();
        sparkConf.set("spark.submit.deployMode", "cluster");
        sparkConf.set("spark.driver.extraJavaOptions", "-Dhdp.version=2.6.1.0-129");
        sparkConf.set("spark.executor.extraJavaOptions", "-Dhdp.version=2.6.1.0-129");
        sparkConf.set("spark.yarn.jars", "hdfs://192.168.6.188:8020/user/hdfs/lib/*");
        sparkConf.set("spark.eventLog.enabled", "true"); // note: the key must not contain '='
        Client client = new Client(carg, config, sparkConf);
        // submitApplication() returns the YARN ApplicationId
        System.out.println(client.submitApplication());
        // new Client(carg, config, sparkConf).run();
    }
Cloudera Employee
Posts: 32
Registered: ‎08-26-2015

Re: SparkListener in Spark on YARN-CLUSTER not working?

Same advice as my previous reply -- try out SparkLauncher.
New Contributor
Posts: 5
Registered: ‎08-28-2018

Re: SparkListener in Spark on YARN-CLUSTER not working?

I mean SparkLauncher does not support submitting to YARN, does it?
Cloudera Employee
Posts: 32
Registered: ‎08-26-2015

Re: SparkListener in Spark on YARN-CLUSTER not working?

It does! Just use the 'setMaster' and 'setDeployMode' methods like you would use '--master' and '--deploy-mode' on the command line.
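
For example, a rough sketch along those lines (again assuming spark-launcher is on the classpath and SPARK_HOME is set; the jar and config paths are the placeholders from earlier in this thread, and this version waits for the app ID via a SparkAppHandle.Listener instead of polling):

    import java.util.concurrent.CountDownLatch;

    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class YarnLauncherExample {
        public static void main(String[] args) throws Exception {
            final CountDownLatch appIdKnown = new CountDownLatch(1);

            SparkAppHandle handle = new SparkLauncher()
                    .setAppResource("build/envelope-full/target/envelope-full-0.5.0.jar")
                    .setMainClass("com.cloudera.labs.envelope.EnvelopeMain")
                    .setMaster("yarn")           // equivalent of --master yarn
                    .setDeployMode("cluster")    // equivalent of --deploy-mode cluster
                    .addAppArgs("hdfs://fj-c7-188.linewell.com:8020/user/hdfs/test.conf")
                    .startApplication(new SparkAppHandle.Listener() {
                        @Override
                        public void stateChanged(SparkAppHandle h) {
                            if (h.getAppId() != null) {
                                appIdKnown.countDown();
                            }
                        }
                        @Override
                        public void infoChanged(SparkAppHandle h) {
                            if (h.getAppId() != null) {
                                appIdKnown.countDown();
                            }
                        }
                    });

            // Block until YARN has assigned an application ID.
            appIdKnown.await();
            System.out.println("appId: " + handle.getAppId());
        }
    }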
