Created on 01-26-2016 06:10 AM - edited 09-16-2022 03:00 AM
I'm getting this exception when trying to start my HBase master:
2016-01-26 08:08:21,235 INFO org.apache.hadoop.hbase.mob.MobFileCache: MobFileCache is initialized, and the cache size is 1000
2016-01-26 08:08:21,310 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
	at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2046)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:198)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2060)
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2623)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2637)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:1005)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:532)
	at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:347)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2041)
	... 5 more
What could be causing this issue? I've tried adding an HDFS Gateway role to the host, but that made no difference.
Created 01-27-2016 07:50 AM
Hello Conor,
This error usually occurs when the classpath to the Hadoop jars isn't correct. I would also request that you verify that the hbase.rootdir URL is fully qualified (e.g. hdfs://namenode.example.org:5959/hbase) and correct.
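A quick way to check both from the Master host is roughly the following (the config path is the usual CDH default and the NameNode address is just an example; adjust for your layout):
# Is a hadoop-hdfs jar actually on the Hadoop classpath? "No FileSystem for
# scheme: hdfs" usually means it is not.
hadoop classpath | tr ':' '\n' | grep -i hadoop-hdfs
# Is hbase.rootdir fully qualified, e.g. hdfs://namenode.example.org:5959/hbase?
grep -A1 hbase.rootdir /etc/hbase/conf/hbase-site.xml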
Regards.
Created 01-28-2016 12:38 AM
Created 03-13-2018 03:56 PM
Hello Harsh,
I ran into the same problem as the OP. I found no /usr/lib/hadoop directories on the machine.
The output of hadoop classpath is
/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*
The output of ls -ld /opt/cloudera/parcels/CDH is
/opt/cloudera/parcels/CDH -> CDH-5.12.0-1.cdh5.12.0.p0.29
When running Spark jobs, I am able to work around this issue by adding /opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/jars/hadoop-hdfs-2.6.0-cdh5.12.0.jar to the --jars flag of spark-submit. Hence, I think that for some reason the jar is not being put on the classpath automatically by Cloudera Manager. Would you know of a fix for this?
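For reference, my current workaround looks roughly like this (the main class and application jar below are just placeholders for my own):
spark-submit \
  --master yarn \
  --class com.example.MyApp \
  --jars /opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/jars/hadoop-hdfs-2.6.0-cdh5.12.0.jar \
  my-app.jar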
Created 03-15-2018 08:20 AM
Created 03-19-2018 10:33 AM
Hello Harsh,
Thanks for getting back to me. On the checks:
- The host is shown as commissioned as a Spark Gateway in Cloudera Manager. Under /etc/spark/conf, I see the following files:
docker.properties.template, log4j.properties.template, slaves.template, spark-defaults.conf.template, spark-env.sh.template, fairscheduler.xml.template, metrics.properties.template, spark-defaults.conf, spark-env.sh
Is there an explicit classpath file that I should see, or are you referring to the SPARK_DIST_CLASSPATH variable set in spark-env.sh? Should I add hadoop-hdfs-2.6.0-cdh5.12.0.jar to this classpath?
- I don't bundle any project jars in the Spark App.
- Running 'env' showed no global environment variables ending in or containing 'CLASSPATH' in their names (the check I used is shown below)
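The check for that last point was simply:
# Look for any globally exported *CLASSPATH* variable
env | grep -i classpath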
Created 03-20-2018 09:18 PM
Created 03-21-2018 12:39 PM
Hey Harsh,
Here is the requested info:
env:
XDG_SESSION_ID=6
SHELL=/bin/bash
TERM=xterm-256color
SSH_CLIENT=
SSH_TTY=/dev/pts/13
USER=user
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
PATH=/home/user/.conda/envs/py27/bin:/opt/apache-maven-3.5.2/bin:/usr/anaconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
MAIL=/var/mail/user
CONDA_PATH_BACKUP=/opt/apache-maven-3.5.2/bin:/usr/anaconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
CONDA_PREFIX=/home/user/.conda/envs/py27
PWD=/home/user/bitbucket/dl_staging
JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
LANG=en_US.UTF-8
PS1=(py27) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w\$
HOME=/home/user
M2_HOME=/opt/apache-maven-3.5.2
SHLVL=1
CONDA_PS1_BACKUP=\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w\$
LOGNAME=user
SSH_CONNECTION=
CONDA_DEFAULT_ENV=py27
LESSOPEN=| /usr/bin/lesspipe %s
XDG_RUNTIME_DIR=/run/user/1000
LESSCLOSE=/usr/bin/lesspipe %s %s
_=/usr/bin/env
/etc/hadoop/conf/hadoop-env.sh:
# Prepend/Append plugin parcel classpaths
if [ "$HADOOP_USER_CLASSPATH_FIRST" = 'true' ]; then
# HADOOP_CLASSPATH={{HADOOP_CLASSPATH_APPEND}}
:
else
# HADOOP_CLASSPATH={{HADOOP_CLASSPATH}}
:
fi
# JAVA_LIBRARY_PATH={{JAVA_LIBRARY_PATH}}
export HADOOP_MAPRED_HOME=$( ([[ ! '/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce' =~ CDH_MR2_HOME ]] && echo /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce ) || echo ${CDH_MR2_HOME:-/usr/lib/hadoop-map$
export YARN_OPTS="-Xmx825955249 -Djava.net.preferIPv4Stack=true $YARN_OPTS"
export HADOOP_CLIENT_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"
Created 03-22-2018 01:41 AM
Created 03-23-2018 03:28 PM
Hello Harsh,
Running 'unset HADOOP_HDFS_HOME' did the trick! I am now able to run spark-submit without including the hadoop-hdfs jar, and I can also run 'hadoop fs -ls' in a local terminal to view the HDFS directories.
The problem was in my /etc/environment file, which included the following line:
HADOOP_HDFS_HOME="/opt/cloudera/parcels/CDH/lib/hadoop"
I must have added that line while following some installation guide, and it was the cause of this issue. Removing it from /etc/environment fixes the problem permanently: I can open a new terminal and run spark-submit without running 'unset HADOOP_HDFS_HOME' first. Thank you so much for helping me fix this!
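In case anyone else runs into this, the cleanup amounted to roughly the following (edit /etc/environment with care and keep a backup):
# Confirm the offending setting is coming from /etc/environment
grep HADOOP_HDFS_HOME /etc/environment
# Clear it for the current shell
unset HADOOP_HDFS_HOME
# Then delete the HADOOP_HDFS_HOME=... line from /etc/environment and start a
# new login session so it is no longer inherited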