Spark submit configuration --deploy-mode has been overridden from cluster to client in single-node minikube
Labels: Apache Spark
Created 06-17-2024 10:38 PM
We are in the process of migrating from YARN to Kubernetes and upgrading our Spark version from 2.4.4 to 3.5.1. As part of this transition, we have decided to use Scala 2.12.18 and have upgraded Java from version 8 to 11. Currently, I am encountering three main issues:
- I am experiencing an ArithmeticException due to long overflow. Could the switch from Java 8 to 11 be causing this issue?
- The deployment mode specified as cluster in the spark-submit command is being overridden to client.
- I am unable to use AWS Hadoop package classes in spark-submit, despite including the jars in the container.
$SPARK_HOME/bin/spark-submit \
--master k8s://$K8S_SERVER \
--deploy-mode cluster \
--name testing \
--class dt.cerebrum.iotengine.sparkjobs.streaming \
--conf spark.kubernetes.file.upload.path=s3a://cb-spark/path \
--conf spark.hadoop.fs.s3a.endpoint="http://xxxxxxx.xxx" \
--conf spark.hadoop.fs.s3a.access.key="xxxx" \
--conf spark.hadoop.fs.s3a.secret.key="xxxxxxxxx" \
--conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.fast.upload=true \
--conf spark.hadoop.fs.s3a.path.style.access="true" \
s3a://cb-spark/iot_engine.jar
Any assistance you could provide on these issues would be greatly appreciated.
Thank you.
Created 07-09-2024 11:35 PM
Apache Spark 3.5.1 supports Java 8/11/17 and Scala binary versions 2.12/2.13. If you want to use Scala binary version 2.12, the recommended Scala version is 2.12.18.
Coming to your questions:
1. Without the exception stack trace details, it is difficult to provide a solution.
2. The reason could be in your application code: while creating the SparkSession you may have hard-coded the master or client deploy mode (see the sketch after these points).
3. To use AWS S3, you need to download the hadoop-aws jar files (with the matching AWS SDK bundle) and pass them to the spark-submit command.
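A minimal Scala sketch of point 2, assuming the driver entry point looks roughly like this (object and app names here are placeholders, not taken from the thread). The key idea is that the builder should not set the master or deploy mode itself, otherwise it overrides whatever spark-submit was given:

import org.apache.spark.sql.SparkSession

object DeployModeCheck {
  def main(args: Array[String]): Unit = {
    // Let spark-submit supply the master and deploy mode; calls such as
    // .master("local[*]") or .config("spark.submit.deployMode", "client")
    // here would override the --deploy-mode cluster flag.
    val spark = SparkSession.builder()
      .appName("testing")
      .getOrCreate()

    // Print what actually took effect, to confirm the submit-time settings were honoured.
    println(s"master     = ${spark.sparkContext.master}")
    println(s"deployMode = ${spark.sparkContext.deployMode}")

    spark.stop()
  }
}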
Created 07-15-2024 08:20 PM
Hi @RangaReddy,
1. Exception stack trace:
Currently we are running our Spark jobs on YARN using the same code and we never get this issue. Could it be caused by a lack of memory?
2. We didn't hard-code the client mode anywhere. It was working fine on YARN but not with Kubernetes.
3. We have tried providing the following, but it didn't work. We also downloaded these jars and placed them in the jars folder, but no luck.
--packages org.apache.hadoop:hadoop-aws:3.3.4 \
--packages com.amazonaws:aws-java-sdk-bundle:1.12.262 \
--packages org.apache.spark:spark-hadoop-cloud_2.12:3.5.1 \
--packages org.apache.hadoop:hadoop-client-api:3.3.4 \
--packages org.apache.hadoop:hadoop-client-runtime:3.3.4 \
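One thing worth checking: spark-submit generally expects --packages as a single comma-separated list of coordinates, so repeating the flag as above may leave only the last value in effect. Separately, a minimal Scala sketch (the object name is only illustrative) to verify from inside the container whether the S3A classes actually reached the classpath:

object S3AClasspathCheck {
  def main(args: Array[String]): Unit = {
    try {
      // Succeeds only if hadoop-aws (and its dependencies) are on the classpath.
      Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem")
      println("hadoop-aws is on the classpath")
    } catch {
      case _: ClassNotFoundException =>
        println("S3AFileSystem not found: hadoop-aws / aws-java-sdk-bundle jars are missing")
    }
  }
}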
Created 07-11-2024 05:23 AM
Hi @saifikhan
Just from the ArithmeticException alone, we can't suggest a solution. It could be caused by your code or by Apache Spark's code. Check the exception stack trace and fix the issue if it originates from your code.
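For context, "ArithmeticException: long overflow" is typically thrown by the exact-arithmetic helpers in java.lang.Math, whose behaviour is identical on Java 8 and 11, so the Java upgrade by itself is unlikely to be the cause; the stack trace will show whether the overflow comes from application code or from Spark internals. A tiny sketch reproducing the same failure in isolation:

object LongOverflowDemo {
  def main(args: Array[String]): Unit = {
    // Plain multiplication wraps silently and never throws.
    val wrapped = Long.MaxValue * 1000L
    println(s"plain multiplication wraps to: $wrapped")

    // The exact variant throws java.lang.ArithmeticException: long overflow.
    Math.multiplyExact(Long.MaxValue, 1000L)
  }
}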