
In this article, we will learn how to pass the atlas-application.properties configuration file from a different location in the spark-submit command.

When the Atlas service is enabled in CDP and we run a Spark application, the atlas-application.properties file is picked up by default from the /etc/spark/conf.cloudera.spark_on_yarn/ directory.

Let's test this with the SparkPi example:

 

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client  /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10

 

We can see the following output in the application log.

 

21/08/23 06:12:03 INFO atlas.ApplicationProperties: Looking for atlas-application.properties in classpath
21/08/23 06:12:03 INFO atlas.ApplicationProperties: Loading atlas-application.properties from file:/etc/spark/conf.cloudera.spark_on_yarn/atlas-application.properties

 

To load the atlas-application.properties configuration file from a different location, for example the /tmp directory, copy the file from /etc/spark/conf.cloudera.spark_on_yarn to /tmp and point the driver JVM at that directory with the -Datlas.conf=/tmp/ system property in spark-submit.
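
The copy step can be sketched as a small shell helper. The function name copy_atlas_conf and its arguments are hypothetical and only for illustration; the default paths are the ones used in this article.

```shell
# Hypothetical helper (not part of CDP or Spark): copy atlas-application.properties
# from the default Spark configuration directory to a custom directory.
copy_atlas_conf() {
  src_dir="${1:-/etc/spark/conf.cloudera.spark_on_yarn}"  # default location from this article
  dest_dir="${2:-/tmp}"                                   # directory later passed via -Datlas.conf
  src_file="$src_dir/atlas-application.properties"

  # Stop early if the source file is missing or unreadable.
  if [ ! -r "$src_file" ]; then
    echo "cannot read $src_file" >&2
    return 1
  fi

  # Create the target directory if needed, then copy the file into it.
  mkdir -p "$dest_dir" && cp "$src_file" "$dest_dir/"
}
```

Called without arguments, copy_atlas_conf copies the file to /tmp; a different target directory can be given as the second argument, in which case the same directory has to be used in -Datlas.conf.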

 

Let's test with the same SparkPi example by adding the --driver-java-options="-Datlas.conf=/tmp/" option to the spark-submit command.

 

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-java-options="-Datlas.conf=/tmp/" /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10

 

The application log now shows the file being loaded from /tmp:

 

21/08/05 14:36:24 INFO atlas.ApplicationProperties: Looking for atlas-application.properties in classpath
21/08/05 14:36:24 INFO atlas.ApplicationProperties: Loading atlas-application.properties from file:/tmp/atlas-application.properties
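
To confirm which file was picked up without reading the whole log, the relevant line can be filtered out. This is a minimal sketch, assuming the driver log is printed to the console as in client mode; the function name atlas_conf_path_from_log is hypothetical.

```shell
# Hypothetical helper: read log lines on stdin and print only the path that
# follows "Loading atlas-application.properties from file:".
atlas_conf_path_from_log() {
  sed -n 's/.*Loading atlas-application\.properties from file:\(.*\)$/\1/p'
}

# Possible usage (the driver log is assumed to go to stderr, hence 2>&1):
# spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client \
#   --driver-java-options="-Datlas.conf=/tmp/" \
#   /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10 2>&1 | atlas_conf_path_from_log
```

Lines that do not match are dropped, so an empty result means the "Loading" message was not found in the input.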

 

To run the same SparkPi example in cluster mode, place the atlas-application.properties file in the /tmp directory on every node, because the driver can start on any of them, and run the Spark application as follows:

 

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster \
--files /tmp/atlas-application.properties#atlas-application.properties --driver-java-options="-Datlas.conf=/tmp/" \
/opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10
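
The command above relies on the file being present in /tmp on each node. One way to distribute it is a loop over the cluster hosts. This is only a sketch: the function name distribute_atlas_conf and the host names are hypothetical, and password-less scp access to the nodes is assumed.

```shell
# Hypothetical helper: copy a file to /tmp on every host given as an argument.
# COPY_CMD defaults to scp; set COPY_CMD=echo to print what would be copied.
distribute_atlas_conf() {
  conf_file="$1"
  shift
  for host in "$@"; do
    # Abort on the first host that cannot be reached or written to.
    ${COPY_CMD:-scp} "$conf_file" "$host:/tmp/" || return 1
  done
}

# Possible usage (node1 and node2 are placeholder host names):
# distribute_atlas_conf /tmp/atlas-application.properties node1 node2
```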

 

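Alternatively, ship the file with --files and point atlas.conf at the container's working directory (./), so the file does not have to be copied to every node beforehand: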

 

sudo -u spark spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster \
--files /tmp/atlas-application.properties --conf spark.driver.extraJavaOptions="-Datlas.conf=./" \
/opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10

 

With the second command, the driver log shows the file being loaded from the YARN container's working directory:

 

21/08/23 06:12:07 INFO atlas.ApplicationProperties: Loading atlas-application.properties from file:/data1/tmp/usercache/spark/appcache/application_1629693759177_0016/container_e74_1629693759177_0016_01_000001/./atlas-application.properties
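
In cluster mode the driver runs inside a YARN container, so this line does not appear on the submitting console. A minimal sketch for retrieving it, assuming YARN log aggregation is enabled, the application has finished, and the current user may read its logs; the function name atlas_conf_line_from_yarn is hypothetical.

```shell
# Hypothetical helper: fetch the aggregated logs of a YARN application and
# print the line that reports where atlas-application.properties was loaded from.
atlas_conf_line_from_yarn() {
  app_id="$1"
  yarn logs -applicationId "$app_id" | grep "Loading atlas-application.properties"
}

# Possible usage, with the application ID taken from the log line above:
# atlas_conf_line_from_yarn application_1629693759177_0016
```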

 
