Set environment variable in CDH for Spark executors

New Contributor



I have been trying to set an environment variable in Spark, but I keep running into problems.


I am using HDFS/YARN from CDH 5.12 together with a standalone Spark (v2.2.0) and Crail. However, the YARN logs show an error saying that Crail's library path is not included in java.library.path.



17/11/27 10:57:50 INFO ibm.crail: passive

17/11/27 10:57:50 INFO ibm.disni: creating RdmaProvider of type 'nat'
Exception in thread "dag-scheduler-event-loop" java.lang.UnsatisfiedLinkError: no disni in java.library.path

at java.lang.ClassLoader.loadLibrary(
at java.lang.Runtime.loadLibrary0(
at java.lang.System.loadLibrary(




I found in a post from Crail's user group that it can be fixed by setting the following property:



spark.executor.extraJavaOptions -Djava.library.path=/opt/crail/crail-1.0-bin/lib
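One way to try this property without changing cluster-wide settings is to pass it on the spark-submit command line with `--conf`. The sketch below only assembles such a command line. The helper name `build_submit_command` and the application file `my_app.py` are hypothetical. Passing the same option to the driver through `--driver-java-options` is also an assumption: the `dag-scheduler-event-loop` thread in the stack trace runs in the driver JVM, so the driver may need the library path too.

```python
import shlex

# Library directory quoted in the question; adjust to the local Crail install.
CRAIL_LIB_DIR = "/opt/crail/crail-1.0-bin/lib"


def build_submit_command(app, lib_dir=CRAIL_LIB_DIR, include_driver=True):
    """Return a spark-submit argv list that passes java.library.path to the JVMs."""
    java_opt = f"-Djava.library.path={lib_dir}"
    # --conf key=value sets a Spark property for this run only.
    cmd = ["spark-submit", "--conf", f"spark.executor.extraJavaOptions={java_opt}"]
    if include_driver:
        # Assumption: the driver also loads the native library.
        cmd += ["--driver-java-options", java_opt]
    cmd.append(app)  # hypothetical application file
    return cmd


if __name__ == "__main__":
    # Print the command as a shell-safe string for copy and paste.
    print(shlex.join(build_submit_command("my_app.py")))
```

Building the argument list rather than a single string avoids quoting mistakes if the library path ever contains spaces.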


Here is the post:!topic/zrlio-users/_P5NeH3iHxE


Could you please advise where I should set this inside CDH?


I tried to set the environment variable inside ~/.bashrc and  However, it didn't work, because it seems CDH resets all environment variables when starting services.


I also tried setting the environment variable in all the places I can find inside CDH, including the configuration of Environments in Cloudera Management Service, YARN, and HDFS.  But the problem is still not solved.







This is the order of precedence for configurations that Spark will use:

- Properties set on SparkConf or SparkContext in code
- Arguments passed to spark-submit, spark-shell, or pyspark at run time
- Properties set in /etc/spark/conf/spark-defaults.conf, a specified properties file, or a Cloudera Manager safety valve
- Environment variables exported or set in scripts

For properties that apply to all jobs, use spark-defaults.conf. For properties that are constant and specific to one or a few applications, use SparkConf or --properties-file. For properties that change between runs, use command-line arguments.
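The precedence order above can be pictured as a lookup through layers, highest priority first. The following is a toy model only, not Spark's actual implementation; the layer names, the `resolve` helper, and the second library path `/opt/other/lib` are made up for illustration.

```python
# Layers listed from highest to lowest precedence, following the order above.
PRECEDENCE = ("spark_conf", "command_line", "properties_file", "environment")


def resolve(key, layers):
    """Return (value, layer_name) from the highest-precedence layer defining key."""
    for name in PRECEDENCE:
        settings = layers.get(name, {})
        if key in settings:
            return settings[key], name
    # No layer defines the key.
    return None, None


if __name__ == "__main__":
    key = "spark.executor.extraJavaOptions"
    layers = {
        # e.g. a line in spark-defaults.conf (hypothetical value)
        "properties_file": {key: "-Djava.library.path=/opt/other/lib"},
        # e.g. --conf on spark-submit (path from the question)
        "command_line": {key: "-Djava.library.path=/opt/crail/crail-1.0-bin/lib"},
    }
    value, source = resolve(key, layers)
    print(f"{key} = {value} (from {source})")
```

In this model the command-line value wins over the properties file, which matches the advice to keep cluster-wide defaults in spark-defaults.conf and override them per run.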



New Contributor

To simplify the question: how can I set multiple environment variables under "yarn.nodemanager.admin-en"?

