Created 10-20-2015 04:41 PM
In some situations, for example when using a profiling tool, the fact that hadoop-env.sh is sourced more than once is a problem, because it starts multiple instances of the profiler (as described in HADOOP-9873, HADOOP-9902 and HADOOP-11010).
Is there a way to avoid hadoop-env.sh (and the environment variables it sets) being evaluated more than once?
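For example (the profiler path below is just a made-up illustration, not from any real hadoop-env.sh), hadoop-env.sh typically appends options like this, so every time the script is sourced the agent flag is added again and another profiler instance gets attached:

# hypothetical profiler hook inside hadoop-env.sh
# each sourcing appends the option again, so the agent is attached once per sourcing
export HADOOP_OPTS="$HADOOP_OPTS -agentpath:/opt/profiler/libagent.so"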
Created 10-21-2015 11:46 AM
One thing I did in the past to avoid that script being loaded multiple times was to edit it and add a kind of global guard variable that detects whether the script has already been executed.
Something like this (put it at the beginning of the script, just after the first line #!/bin/bash):
[ "x$SCRIPT_HADOOP_ENV_LOADED" = "x1" ] && return
export SCRIPT_HADOOP_ENV_LOADED=1
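A quick way to check that the guard behaves as expected (the test-guard.sh file name and the echo line are only a hypothetical stand-in for hadoop-env.sh, used for the test itself): source the script twice in the same shell and confirm its body runs only once.

# test-guard.sh -- minimal stand-in illustrating the guard
[ "x$SCRIPT_HADOOP_ENV_LOADED" = "x1" ] && return
export SCRIPT_HADOOP_ENV_LOADED=1
echo "body executed (profiler would be started here)"

$ . ./test-guard.sh
body executed (profiler would be started here)
$ . ./test-guard.sh
(no output: the guard returns before the body runs)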
Created 10-21-2015 11:29 AM
Hi @dprichici@hortonworks.com, the solution is in your question.
Are you looking for something in particular?
diff --git a/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh b/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
index ecab38b..d36b370 100644
--- a/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
+++ b/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
@@ -23,6 +23,9 @@
 # set JAVA_HOME in this file, so that it is correctly defined on
 # remote nodes.
 
+[ X"$HADOOP_ENV_INITED" = X"true" ] && return
+export HADOOP_ENV_INITED="true"
+
 # The java implementation to use.
 export JAVA_HOME=${JAVA_HOME}
Created 10-22-2015 05:50 PM
Wonderful, exactly what I was looking for - thank you @Sourygna Luangsay & @Neeraj Sabharwal.