Configuration objects fail to initialize in HDP 2.2.4.8+ because /etc/hadoop/conf is no longer in CLASSPATH

To facilitate rolling upgrades, /etc/hadoop/conf is no longer part of the CLASSPATH environment variable that is constructed for a job. However, when a job creates a Configuration object with Configuration conf2 = new Configuration(); (https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html), it tries to read core-site.xml from the CLASSPATH, where the file can no longer be found.
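One way to confirm that this is what is happening is to ask the class loader directly whether core-site.xml is visible. The following is only a diagnostic sketch: the class name ClasspathProbe is hypothetical, and it assumes Configuration resolves core-site.xml through the thread context class loader, which is what the Javadoc's "loaded from the classpath" wording suggests.

```java
import java.net.URL;

// Diagnostic sketch (hypothetical helper, not part of Hadoop): reports whether
// a resource such as core-site.xml is visible to the current class loader.
final class ClasspathProbe {

    // Returns the URL the resource resolves to, or null if it is not on the classpath.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // Defaults to core-site.xml; another resource name can be passed as an argument.
        String name = args.length > 0 ? args[0] : "core-site.xml";
        URL url = locate(name);
        System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url));
    }
}
```

Running it with the same CLASSPATH as the failing job should show whether the directory holding core-site.xml is actually visible to the JVM.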

How can this be addressed?

1 ACCEPTED SOLUTION

SYMPTOM: The MR classpath generated in HDP 2.2.0+ no longer includes the Hadoop config files that were present in previous versions. Signs of this include an inability to read core-site.xml and find properties such as fs.defaultFS, after upgrading from HDP 2.1.x or earlier. Error messages in stdout may look similar to:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/service/CompositeService

ROOT CAUSE: As of HDP 2.2.0, the classpath is handled differently: the MapReduce application classpath no longer includes the cluster classpath. This is intended to isolate a running MapReduce application from jar version changes during a rolling upgrade.

WORKAROUND: Potential workarounds include:
  1. Change code to add the required classpath elements explicitly.
  2. Change code to exec the hadoop classpath shell command to determine the cluster classpath at runtime.
  3. If code cannot be changed, reconfigure mapreduce.application.classpath in mapred-site.xml so that it does include the cluster classpath. However, this can compromise classpath isolation for running MapReduce jobs during a future rolling upgrade, and is therefore not recommended.
Example: Launch the application using a shell script that includes
export CLASSPATH=$CLASSPATH:`hadoop classpath`
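Workaround 2 can also be done from Java instead of a launcher script. The following is a minimal sketch, not a definitive implementation: the class name ClusterClasspath and its helper methods are hypothetical, and it assumes the hadoop command is on the PATH of the launching process.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of workaround 2 (hypothetical helper, not part of Hadoop): run
// "hadoop classpath" and split its output into individual entries.
final class ClusterClasspath {

    // Runs the given command and returns its standard output, trimmed.
    static String run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // pass stderr through
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException(String.join(" ", command) + " exited with status " + exit);
        }
        return out.toString().trim();
    }

    // Splits a classpath string on the platform path separator, dropping empty entries.
    static List<String> split(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "hadoop" is resolvable on the PATH of this process.
        for (String entry : split(run("hadoop", "classpath"))) {
            System.out.println(entry);
        }
    }
}
```

The resulting entries would typically be handed to a child JVM through its -cp option, since the classpath has to be in place before any Hadoop classes are loaded. Entries may contain wildcards such as a trailing /*, which the java launcher expands but a plain URLClassLoader does not.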
RESOLUTION: Working as designed. Use one of the above workarounds if necessary.
