
Apache Spark Submit using rest API driver state ERROR


Hi,

I have a Spark cluster deployed on Windows, and I'm trying to submit a simple Spark job through the REST API. The job is just a Python script that prints a hello-world message:

from pyspark.sql import SparkSession

def main(args):
    print('hello world')
    return 0

if __name__ == '__main__':
    main(None)

The URL I'm using to submit the job is:

http://<Master-IP>:6066/v1/submissions/create

With the following POST body:

{
    "appResource": "file:../../helloworld.py",
    "sparkProperties": {
        "spark.executor.memory": "2g",
        "spark.master": "spark://<Master IP>:7077",
        "spark.app.name": "Spark REST API - Hello world",
        "spark.driver.memory": "2g",
        "spark.eventLog.enabled": "false",
        "spark.driver.cores": "2",
        "spark.submit.deployMode": "cluster",
        "spark.driver.supervise": "true"
    },
    "clientSparkVersion": "3.3.1",
    "mainClass": "org.apache.spark.deploy.SparkSubmit",
    "environmentVariables": {
        "SPARK_ENV_LOADED": "1"
    },
    "action": "CreateSubmissionRequest",
    "appArgs": [
        "../../helloworld.py", "80"
    ]
}
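For reference, the same request can also be sent from a short script instead of Postman. This is just a minimal sketch using Python's standard library; the master host and app path below are placeholders, not values from my setup:

```python
import json
import urllib.request

def build_submission(master_host, app_path):
    """Build a CreateSubmissionRequest body for the REST endpoint on port 6066.

    master_host and app_path are placeholders -- substitute your own values.
    """
    return {
        "action": "CreateSubmissionRequest",
        "clientSparkVersion": "3.3.1",
        "appResource": app_path,
        "mainClass": "org.apache.spark.deploy.SparkSubmit",
        "appArgs": [app_path],
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "sparkProperties": {
            "spark.master": "spark://%s:7077" % master_host,
            "spark.app.name": "Spark REST API - Hello world",
            "spark.submit.deployMode": "cluster",
            "spark.driver.supervise": "true",
        },
    }

def submit(master_host, app_path):
    # POST the JSON body to the master's submission endpoint.
    body = json.dumps(build_submission(master_host, app_path)).encode("utf-8")
    req = urllib.request.Request(
        "http://%s:6066/v1/submissions/create" % master_host,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # contains "submissionId" on success
```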

After I send this POST request using Postman, I get the following response:

{
    "action": "CreateSubmissionResponse",
    "message": "Driver successfully submitted as driver-20221216112633-0005",
    "serverSparkVersion": "3.3.1",
    "submissionId": "driver-20221216112633-0005",
    "success": true
}

However, when I try to get the job status using:

http://<Master-IP>:6066/v1/submissions/status/driver-20221216112633-0005

I get driverState: ERROR with a NullPointerException:

{
    "action": "SubmissionStatusResponse",
    "driverState": "ERROR",
    "message": "Exception from the cluster:\njava.lang.NullPointerException\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:158)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:179)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
    "serverSparkVersion": "3.3.1",
    "submissionId": "driver-20221216112633-0005",
    "success": true,
    "workerHostPort": "10.9.8.120:56060",
    "workerId": "worker-20221216093629-<IP>-56060"
}
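In case it's useful to anyone reproducing this, here is a small sketch of how I poll that status endpoint from Python. The set of terminal driver states is my assumption based on the states I've seen the standalone master report, and the master host is a placeholder:

```python
import json
import urllib.request

# Driver states I treat as terminal (assumption; other states such as
# SUBMITTED and RUNNING mean the driver is still in flight).
TERMINAL_STATES = {"FINISHED", "ERROR", "KILLED", "FAILED"}

def is_terminal(driver_state):
    return driver_state in TERMINAL_STATES

def get_status(master_host, submission_id):
    # GET the SubmissionStatusResponse for a previously submitted driver.
    url = "http://%s:6066/v1/submissions/status/%s" % (master_host, submission_id)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```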

 

Not sure why I'm getting this error or what it means. Can someone please point me in the right direction, or at least help me troubleshoot this further? Thanks


1 ACCEPTED SOLUTION


I was finally able to figure out the problem. It seems the py/jar file specified in "appResource" and "spark.jars" needs to be accessible by all nodes in the cluster. For example, if you have a network path, you can specify it in both attributes as follows:
"appResource": "file:////Servername/somefolder/HelloWorld.jar",
...
"spark.jars": "file:////Servername/someFolder/HelloWorld.jar",
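To avoid getting the slashes wrong by hand, the Windows UNC path can be converted to that four-slash file: URI form programmatically. This is a hypothetical helper I use for illustration; the server and share names are placeholders:

```python
def unc_to_file_uri(unc_path):
    """Convert a Windows UNC path (\\\\Server\\share\\file) to a file: URI
    in the file:////Server/share/file form shown above.

    Assumes the input is a UNC path; server/share names are placeholders.
    """
    normalized = unc_path.replace("\\", "/").lstrip("/")
    return "file:////" + normalized
```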

 

Not sure why this is necessary if the job is being submitted to the master. If anybody knows, please help me understand.
