Created 01-23-2024 10:50 AM
Hi Cloudera,
I need to use Spark on a host that is not part of the Cloudera cluster to run Spark jobs on the Cloudera cluster.
Is it possible to use it this way? If so, how should it be configured?
What I've already tried:
1. Downloaded "https://www.apache.org/dyn/closer.lua/spark/spark-3.3.4/spark-3.3.4-bin-hadoop3.tgz"
2. Copied the "conf" files from the Cloudera cluster into the new Spark directory
3. Exported the variables "HADOOP_CONF_DIR", "SPARK_CONF_DIR", and "SPARK_HOME", pointing them at the new Spark directory "spark-3.3.4-bin-hadoop3" with the copied files
4. Ran spark-shell as a test. Nothing happens after the banner; it hangs as shown below:
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.4
      /_/
Using Scala version 2.13.8 (Java HotSpot(TM) 64-Bit Server VM, Java 11.0.16.1)
Type in expressions to have them evaluated.
Type :help for more information.
Note: the cluster uses Kerberos, so kinit was run before starting spark-shell.
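A setup like the one in steps 1 to 3 can be sanity-checked before launching spark-shell. The sketch below is only a rough preflight check, not a diagnosis of the hang: the function name `check_client_env` and the list of required Hadoop files are assumptions made for illustration, and the real set of client files depends on the cluster.

```python
import os
from pathlib import Path

# Assumed list of client config files a YARN/HDFS client usually needs.
# Adjust it to whatever your cluster's client configuration actually ships.
REQUIRED_HADOOP_FILES = ("core-site.xml", "hdfs-site.xml", "yarn-site.xml")


def check_client_env(env):
    """Return a list of human-readable problems found in the client setup."""
    problems = []

    # Each variable from step 3 must be set and point at an existing directory.
    for var in ("SPARK_HOME", "SPARK_CONF_DIR", "HADOOP_CONF_DIR"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{var} points to a missing directory: {value}")

    # The Hadoop config directory should contain the copied cluster files.
    hadoop_conf = env.get("HADOOP_CONF_DIR")
    if hadoop_conf and Path(hadoop_conf).is_dir():
        for name in REQUIRED_HADOOP_FILES:
            if not (Path(hadoop_conf) / name).is_file():
                problems.append(f"{name} not found in HADOOP_CONF_DIR")

    # The Spark directory should contain the launcher script.
    spark_home = env.get("SPARK_HOME")
    if spark_home and Path(spark_home).is_dir():
        if not (Path(spark_home) / "bin" / "spark-shell").is_file():
            problems.append("bin/spark-shell not found under SPARK_HOME")

    return problems


if __name__ == "__main__":
    for line in check_client_env(os.environ) or ["no obvious problems found"]:
        print(line)
```

An empty result only means the directories and files are where the variables say they are; it says nothing about whether their contents match the cluster or whether the Kerberos ticket is valid.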
Created 02-02-2024 04:40 AM
Unfortunately, since I didn't receive any guidance from the community, I had to rack my brains through hours and hours of testing, but I managed to do what I wanted.
I downloaded the same Spark version that ships with CDH 6.3.4 and filled in the Spark configuration files with the information from CDH 6.3.4. Now, when I call "spark-submit", the job is executed on the CDH cluster.
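For readers who want to script the submission, here is a minimal sketch of assembling a "spark-submit" command line. The post does not say which options were used, so everything here is an assumption: running on YARN, the choice of deploy mode, the helper name `build_spark_submit`, and the placeholder file `my_job.py`. The `--master`, `--deploy-mode`, `--principal`, and `--keytab` options themselves are standard spark-submit flags.

```python
import shlex


def build_spark_submit(app, app_args=(), master="yarn", deploy_mode="cluster",
                       principal=None, keytab=None):
    """Build a spark-submit argument list (hypothetical helper, not from the post)."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]

    # On a Kerberized cluster a principal and keytab can be passed instead of
    # relying on a ticket obtained with kinit; both must be given together.
    if (principal is None) != (keytab is None):
        raise ValueError("principal and keytab must be provided together")
    if principal is not None:
        cmd += ["--principal", principal, "--keytab", keytab]

    # The application file comes after all spark-submit options,
    # followed by the application's own arguments.
    cmd.append(app)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    # "my_job.py" is a placeholder application name.
    command = build_spark_submit("my_job.py", app_args=["--date", "2024-02-02"])
    print(shlex.join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Keeping the command as a list avoids shell quoting problems when it is later handed to `subprocess.run`; `shlex.join` is used only to print a copy-pasteable form.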
Created 02-04-2024 07:44 AM
Unfortunately, Cloudera does not support installing or using open source Spark, because some customisations need to be made on Cloudera's end to support integration with other components.