Created 12-16-2022 09:35 AM
Hi,
I have a Spark cluster deployed on Windows. I'm trying to submit a simple Spark job using the REST API. The job is just Python code that prints a simple hello-world message, as follows:
from pyspark.sql import SparkSession

def main(args):
    print('hello world')
    return 0

if __name__ == '__main__':
    main(None)
The URL I'm using to submit the job is:
http://<Master-IP>:6066/v1/submissions/create
with the following POST body:
{
    "appResource": "file:../../helloworld.py",
    "sparkProperties": {
        "spark.executor.memory": "2g",
        "spark.master": "spark://<Master IP>:7077",
        "spark.app.name": "Spark REST API - Hello world",
        "spark.driver.memory": "2g",
        "spark.eventLog.enabled": "false",
        "spark.driver.cores": "2",
        "spark.submit.deployMode": "cluster",
        "spark.driver.supervise": "true"
    },
    "clientSparkVersion": "3.3.1",
    "mainClass": "org.apache.spark.deploy.SparkSubmit",
    "environmentVariables": {
        "SPARK_ENV_LOADED": "1"
    },
    "action": "CreateSubmissionRequest",
    "appArgs": [
        "../../helloworld.py", "80"
    ]
}
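For reference, the same request can also be sent from a small script instead of Postman. This is only a minimal sketch using Python's requests library; <Master-IP> and the file paths are the same placeholders as in the body above:

import requests

# Minimal sketch: POST the submission body above to the standalone master's
# REST endpoint. <Master-IP> is a placeholder for the actual master host.
SUBMIT_URL = "http://<Master-IP>:6066/v1/submissions/create"

payload = {
    "appResource": "file:../../helloworld.py",
    "sparkProperties": {
        "spark.executor.memory": "2g",
        "spark.master": "spark://<Master IP>:7077",
        "spark.app.name": "Spark REST API - Hello world",
        "spark.driver.memory": "2g",
        "spark.eventLog.enabled": "false",
        "spark.driver.cores": "2",
        "spark.submit.deployMode": "cluster",
        "spark.driver.supervise": "true",
    },
    "clientSparkVersion": "3.3.1",
    "mainClass": "org.apache.spark.deploy.SparkSubmit",
    "environmentVariables": {"SPARK_ENV_LOADED": "1"},
    "action": "CreateSubmissionRequest",
    "appArgs": ["../../helloworld.py", "80"],
}

resp = requests.post(SUBMIT_URL, json=payload)
resp.raise_for_status()
print(resp.json())  # the CreateSubmissionResponse, including the submissionId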
After I run this POST using Postman, I get the following response:
{
    "action": "CreateSubmissionResponse",
    "message": "Driver successfully submitted as driver-20221216112633-0005",
    "serverSparkVersion": "3.3.1",
    "submissionId": "driver-20221216112633-0005",
    "success": true
}
However, when I try to get the job status using:
http://<Master-IP>:6066/v1/submissions/status/driver-20221216112633-0005
I get driverState: ERROR with a NullPointerException, as follows:
{
    "action": "SubmissionStatusResponse",
    "driverState": "ERROR",
    "message": "Exception from the cluster:\njava.lang.NullPointerException\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:158)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:179)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
    "serverSparkVersion": "3.3.1",
    "submissionId": "driver-20221216112633-0005",
    "success": true,
    "workerHostPort": "10.9.8.120:56060",
    "workerId": "worker-20221216093629-<IP>-56060"
}
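For reference, the status check can be scripted the same way. A minimal sketch with Python's requests library, where the master host and submission id are the placeholders used above:

import requests

# Minimal sketch: query the status of a submission on the standalone master's
# REST endpoint. <Master-IP> and the submission id are placeholders.
STATUS_URL = "http://<Master-IP>:6066/v1/submissions/status/driver-20221216112633-0005"

resp = requests.get(STATUS_URL)
resp.raise_for_status()
status = resp.json()
print(status.get("driverState"), status.get("message"))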
Not sure why I'm getting this error or what it means. Can someone please point me in the right direction, or at least help me troubleshoot this further? Thanks
Created 01-13-2023 06:53 AM
I was finally able to figure out the problem. To resolve this issue, the py/jar file specified in "appResource" and "spark.jars" needs to be accessible by all nodes in the cluster. For example, if you have a network path, you can specify it in both attributes as follows:
"appResource": "file:////Servername/somefolder/HelloWorld.jar",
...
"spark.jars": "file:////Servername/someFolder/HelloWorld.jar",
Not sure why this is needed if the job is being submitted to the master. If anybody knows, please help me understand.
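For completeness, this is roughly what the working submission body looks like as a Python dict. It is only a sketch: the server name, share, jar name, <Master-IP>, and <your.main.Class> are placeholders for whatever network location and application class your own cluster uses.

import requests

# Sketch of the corrected submission: both appResource and spark.jars point at
# a network share (\\Servername\somefolder) that every worker node can read.
payload = {
    "appResource": "file:////Servername/somefolder/HelloWorld.jar",
    "sparkProperties": {
        "spark.master": "spark://<Master IP>:7077",
        "spark.app.name": "Spark REST API - Hello world",
        "spark.submit.deployMode": "cluster",
        "spark.jars": "file:////Servername/somefolder/HelloWorld.jar",
    },
    "clientSparkVersion": "3.3.1",
    "mainClass": "<your.main.Class>",
    "environmentVariables": {"SPARK_ENV_LOADED": "1"},
    "action": "CreateSubmissionRequest",
    "appArgs": [],
}

resp = requests.post("http://<Master-IP>:6066/v1/submissions/create", json=payload)
print(resp.json())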