Spark submit configuration --deploy-mode has been overridden from cluster to client in single-node minikube

Explorer
Hi,

We are in the process of migrating from YARN to Kubernetes and upgrading our Spark version from 2.4.4 to 3.5.1. As part of this transition, we have moved to Scala 2.12.18 and upgraded Java from version 8 to 11. Currently, I am encountering three main issues:

  1. I am experiencing an ArithmeticException due to long overflow. Could the switch from Java 8 to 11 be causing this issue?
  2. The deployment mode specified as cluster in the spark-submit command is being overridden to client.
  3. I am unable to use AWS Hadoop package classes in spark-submit, despite including the jars in the container.

    $SPARK_HOME/bin/spark-submit \
    --master k8s://$K8S_SERVER \
    --deploy-mode cluster \
    --name testing \
    --class dt.cerebrum.iotengine.sparkjobs.streaming \
    --conf spark.kubernetes.file.upload.path=s3a://cb-spark/path \
    --conf spark.hadoop.fs.s3a.endpoint="http://xxxxxxx.xxx" \
    --conf spark.hadoop.fs.s3a.access.key="xxxx" \
    --conf spark.hadoop.fs.s3a.secret.key="xxxxxxxxx" \
    --conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
    --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
    --conf spark.hadoop.fs.s3a.fast.upload=true \
    --conf spark.hadoop.fs.s3a.path.style.access="true" \
    s3a://cb-spark/iot_engine.jar

Any assistance you could provide on these issues would be greatly appreciated.

Thank you.

3 REPLIES

Master Collaborator

Apache Spark 3.5.1 supports Java 8/11/17 and Scala binary versions 2.12/2.13. If you want to use Scala binary version 2.12, the recommended Scala version is 2.12.18.
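
For example, a matching sbt build definition could look like this (a sketch; adjust the module names to whatever your project actually uses):

    // build.sbt (sketch): Spark 3.5.1 on Scala 2.12.18
    ThisBuild / scalaVersion := "2.12.18"

    libraryDependencies ++= Seq(
      // "provided" because the Spark distribution in the container supplies these at runtime
      "org.apache.spark" %% "spark-core" % "3.5.1" % "provided",
      "org.apache.spark" %% "spark-sql"  % "3.5.1" % "provided"
    )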

Coming to your questions:

1. Without the exception stack trace details, it is difficult to suggest a solution.

2. The reason could be in your application code: while creating the SparkSession, you may have hard-coded client mode or a local master URL (see the sketch after this list).

3. To use AWS S3, you need to download the hadoop-aws jar files (and the matching AWS SDK bundle) and pass them in the spark-submit command (see the example after this list).
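
For point 2, here is a sketch of what to check (the object and app names are only placeholders): the SparkSession builder should not set a master URL or deploy mode at all, so that the --master and --deploy-mode flags from spark-submit take effect.

    import org.apache.spark.sql.SparkSession

    object StreamingJob {
      def main(args: Array[String]): Unit = {
        // Do NOT call .master("local[*]") or set spark.submit.deployMode here:
        // values set in code take precedence over the spark-submit flags.
        val spark = SparkSession.builder()
          .appName("testing")
          .getOrCreate()

        // ... job logic ...

        spark.stop()
      }
    }

For point 3, one way is to let spark-submit resolve the AWS packages at submit time (the versions below assume a Hadoop 3.3.x build of Spark; match them to the Hadoop version your image ships):

    $SPARK_HOME/bin/spark-submit \
    --master k8s://$K8S_SERVER \
    --deploy-mode cluster \
    --packages org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262 \
    ...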

 

References:

1. https://spark.apache.org/docs/3.5.1/index.html

2. https://github.com/apache/spark/tree/v3.5.1

Explorer

Hi @RangaReddy,

Exception stack trace:
[screenshot attached: Screenshot 2024-06-10 at 2.12.31 PM (1).png]

Currently we are running our Spark jobs on YARN using the same code, and we never hit this issue. Could it be caused by a lack of memory?

2. We didn't hard-code client mode anywhere. It was working fine on YARN, just not with Kubernetes.

3. We tried providing the following, but it didn't work. We also downloaded these jars and placed them in the jars folder, but no luck.
--packages org.apache.hadoop:hadoop-aws:3.3.4 \
--packages com.amazonaws:aws-java-sdk-bundle:1.12.262 \
--packages org.apache.spark:spark-hadoop-cloud_2.12:3.5.1 \
--packages org.apache.hadoop:hadoop-client-api:3.3.4 \
--packages org.apache.hadoop:hadoop-client-runtime:3.3.4 \
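
In case repeating --packages makes spark-submit keep only the last value, the equivalent single comma-separated flag would be:

--packages org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262,org.apache.spark:spark-hadoop-cloud_2.12:3.5.1,org.apache.hadoop:hadoop-client-api:3.3.4,org.apache.hadoop:hadoop-client-runtime:3.3.4 \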

 

Master Collaborator

Hi @saifikhan,

Just from the name ArithmeticException alone, we can't provide a solution. It can occur either in your code or in Apache Spark's code. Check the exception stack trace and, if the issue originates in your code, fix it there.
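
As a minimal illustration (not your job's code): a "long overflow" message usually comes from checked arithmetic such as Math.multiplyExact, which Spark 3.x applies in more places (for example timestamp and interval arithmetic) than 2.4 did, so the Spark upgrade is a more likely trigger than the Java 11 switch:

    object LongOverflowDemo {
      def main(args: Array[String]): Unit = {
        val days: Long = Long.MaxValue / 1000L
        // Checked multiplication throws java.lang.ArithmeticException: long overflow
        // instead of silently wrapping around like the plain * operator.
        val micros = Math.multiplyExact(days, 86400000000L)
        println(micros)
      }
    }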