No FileSystem for scheme: hdfs

New Contributor

I'm getting this exception when trying to start my HBase master:

 

2016-01-26 08:08:21,235 INFO org.apache.hadoop.hbase.mob.MobFileCache: MobFileCache is initialized, and the cache size is 1000
2016-01-26 08:08:21,310 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
	at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2046)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:198)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2060)
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2623)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2637)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:1005)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:532)
	at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:347)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2041)
	... 5 more

 

What could be causing this issue? I've tried adding an HDFS Gateway role to the host, but that made no difference.

10 REPLIES

Expert Contributor

Hello Conor,

 

This error sometimes occurs when the classpath to the Hadoop jars isn't correct. I would also ask you to verify that the hbase.rootdir URL is fully qualified (e.g. hdfs://namenode.example.org:5959/hbase) and correct.
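As a quick sanity check, a minimal sketch (the config path assumes a standard CDH client layout, and the expected value mirrors the example above):

grep -A1 'hbase.rootdir' /etc/hbase/conf/hbase-site.xml
# Expect a fully qualified value such as hdfs://namenode.example.org:5959/hbase, not a bare /hbase path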

 

Regards.

Mentor
Assuming you are running CDH via CM (given that you mention Gateway roles), this ideally shouldn't happen on a new setup.

I can think of a couple of reasons, but it depends on the mode of installation you are using.

If you are using parcels, ensure that no /usr/lib/hadoop* directories exist anymore on the machine. Their existence may otherwise confuse the classpath-automating scripts into not finding all the relevant jars required for the "hdfs://" scheme service discovery.

What are your outputs for the commands "hadoop classpath" and "ls -ld /opt/cloudera/parcels/CDH"?
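For reference, a rough sketch of those checks (paths assume a parcel-based CDH install):

ls -d /usr/lib/hadoop* 2>/dev/null    # should print nothing on a parcel-only host
hadoop classpath                      # should include .../CDH/lib/hadoop-hdfs entries
ls -ld /opt/cloudera/parcels/CDH      # should be a symlink to the active parcel version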

Explorer

Hello Harsh,

 

I ran into the same problem as the OP. I found no /usr/lib/hadoop directories on the machine.

 

The output of hadoop classpath is

/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/libexec/../../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*

 

The output of ls -ld /opt/cloudera/parcels/CDH is 

/opt/cloudera/parcels/CDH -> CDH-5.12.0-1.cdh5.12.0.p0.29

 

When running Spark jobs, I am able to work around this issue by adding /opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/jars/hadoop-hdfs-2.6.0-cdh5.12.0.jar to the --jars flag of spark-submit. Hence, I think that for some reason the jar is not being added to the dependencies automatically by Cloudera Manager. Would you know of a fix for this?
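For completeness, the workaround I'm using looks roughly like this (the application class and jar names below are placeholders):

spark-submit \
  --jars /opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/jars/hadoop-hdfs-2.6.0-cdh5.12.0.jar \
  --class com.example.MyApp \
  my-app.jar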

Mentor
A few checks:

- Does the host where you invoke spark-submit carry a valid Spark Gateway role, with deployed configs under /etc/spark/conf/? There's also a classpath file under that location, which you may want to check to see if it includes all HDFS and YARN jars.
- Do you bundle any HDFS/YARN project jars in your Spark App jar (such as a fat-jar assembly)? You may want to check the version matches with what is on the cluster classpath.
- Are there any global environment variables (run 'env' to check) that end in or carry 'CLASSPATH' in their name? Try unsetting these and retrying; a quick sketch of these checks follows below.
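A rough sketch of these checks (the exact classpath file name may differ by CDH release, so treat it as an assumption):

ls /etc/spark/conf/                   # deployed Spark Gateway client configs
cat /etc/spark/conf/classpath.txt     # if present, should list the HDFS and YARN jars
env | grep -i classpath               # any variables carrying 'CLASSPATH' in the session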

Explorer

Hello Harsh,

 

Thanks for getting back to me. On the checks:

 

- The host is shown as commissioned with a Spark Gateway role in Cloudera Manager. Under /etc/spark/conf, I see the following files:

docker.properties.template, log4j.properties.template, slaves.template, spark-defaults.conf.template, spark-env.sh.template, fairscheduler.xml.template, metrics.properties.template, spark-defaults.conf, spark-env.sh

 

Is there an explicit classpath file that I should see or are you referring to the SPARK_DIST_CLASSPATH variable that is set in spark-env.sh? Should I add the hadoop-hdfs-2.6.0-cdh5.12.0.jar to this classpath?

 

- I don't bundle any project jars in the Spark App. 

- Running 'env' showed no global environment variables that end in or carry 'CLASSPATH' in their name.

 

Mentor
Thank you for the added info. I notice now that your 'hadoop classpath' oddly does not mention any hadoop-hdfs library paths.

Can you post an output of 'env' and the contents of your /etc/hadoop/conf/hadoop-env.sh file from the same host where the hadoop classpath output was generated?

CDH scripts auto-add the /opt/cloudera/parcels/CDH/lib/hadoop-hdfs/ paths, unless environment variables such as HADOOP_HDFS_HOME have been overridden to point to an invalid path. The output requested above is to help check that, among other factors that influence the classpath-building script.
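As a quick check on that, a minimal sketch:

echo "$HADOOP_HDFS_HOME"                            # empty on a default CDH host
hadoop classpath | tr ':' '\n' | grep hadoop-hdfs   # should list .../CDH/lib/hadoop-hdfs paths on a healthy host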

Explorer

Hey Harsh, 

 

Here is the requested info:

 

env: 

XDG_SESSION_ID=6
SHELL=/bin/bash
TERM=xterm-256color
SSH_CLIENT=
SSH_TTY=/dev/pts/13
USER=user
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
PATH=/home/user/.conda/envs/py27/bin:/opt/apache-maven-3.5.2/bin:/usr/anaconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
MAIL=/var/mail/user
CONDA_PATH_BACKUP=/opt/apache-maven-3.5.2/bin:/usr/anaconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
CONDA_PREFIX=/home/user/.conda/envs/py27
PWD=/home/user/bitbucket/dl_staging
JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
LANG=en_US.UTF-8
PS1=(py27) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w\$
HOME=/home/user
M2_HOME=/opt/apache-maven-3.5.2
SHLVL=1
CONDA_PS1_BACKUP=\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w\$
LOGNAME=user
SSH_CONNECTION=
CONDA_DEFAULT_ENV=py27
LESSOPEN=| /usr/bin/lesspipe %s
XDG_RUNTIME_DIR=/run/user/1000
LESSCLOSE=/usr/bin/lesspipe %s %s
_=/usr/bin/env

 

/etc/hadoop/conf/hadoop-env.sh:

# Prepend/Append plugin parcel classpaths

if [ "$HADOOP_USER_CLASSPATH_FIRST" = 'true' ]; then
# HADOOP_CLASSPATH={{HADOOP_CLASSPATH_APPEND}}
:
else
# HADOOP_CLASSPATH={{HADOOP_CLASSPATH}}
:
fi
# JAVA_LIBRARY_PATH={{JAVA_LIBRARY_PATH}}

export HADOOP_MAPRED_HOME=$( ([[ ! '/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce' =~ CDH_MR2_HOME ]] && echo /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce ) || echo ${CDH_MR2_HOME:-/usr/lib/hadoop-map$
export YARN_OPTS="-Xmx825955249 -Djava.net.preferIPv4Stack=true $YARN_OPTS"
export HADOOP_CLIENT_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"

 

Mentor
Thank you,

Please try 'unset HADOOP_HDFS_HOME' and retry your command(s), without including the hadoop-hdfs jars this time. Does it succeed?

Can you figure out who or what is setting the HADOOP_HDFS_HOME environment variable in your user session? It must not be set manually, as the CDH scripts set it to the correct path on their own. You can check .bashrc/.bash_profile to start with, perhaps.
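A minimal sketch of these steps (the list of files to grep is only a starting point):

unset HADOOP_HDFS_HOME
# then retry spark-submit (without the extra --jars entry) in the same shell
grep -H HADOOP_HDFS_HOME ~/.bashrc ~/.bash_profile /etc/environment 2>/dev/null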

Explorer

Hello Harsh,

 

Running 'unset HADOOP_HDFS_HOME' did the trick! I am able to run spark-submit without including the hadoop-hdfs jar, and can also run 'hadoop fs -ls' in the local terminal to view the HDFS directories.

 

The problem was in my /etc/environment file, which included the following line:

HADOOP_HDFS_HOME="/opt/cloudera/parcels/CDH/lib/hadoop"

 

I think I must have inserted the above line while following some installation guide, but it was the cause of this issue. Removing that line from the /etc/environment file fixes the issue permanently: I can open a new terminal and run spark-submit without running 'unset HADOOP_HDFS_HOME' first. Thank you so much for helping me fix this!
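For anyone hitting the same thing, a rough way to verify the permanent fix in a freshly opened terminal:

env | grep HADOOP_HDFS_HOME    # should print nothing in a new shell
hadoop fs -ls                  # should list the HDFS home directory without classpath errors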