Member since
07-28-2015
08-03-2015
07:40 AM
Hi, I figured it out: a transitive in-house dependency declared these FileSystem implementation classes in its own META-INF/services/org.apache.hadoop.fs.FileSystem file. Sorry for the noise ;-). Lars
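For anyone hitting the same problem, here is a minimal sketch of how such a stray services file could be located. It uses only the JDK; the class name ServiceFileFinder and its two helper methods are made up for this example and are not part of Hadoop. It assumes the tool is run with the same classpath as the failing client.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: lists every services file for a given
// service interface that is visible on the classpath, with the classes it names.
class ServiceFileFinder {

    /** Returns every classpath location that provides META-INF/services/<serviceName>. */
    public static List<URL> find(String serviceName, ClassLoader loader) throws IOException {
        return Collections.list(loader.getResources("META-INF/services/" + serviceName));
    }

    /** Reads the provider class names from one services file, skipping comments and blank lines. */
    public static List<String> providers(URL servicesFile) throws IOException {
        List<String> names = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(servicesFile.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                int hash = line.indexOf('#');          // '#' starts a comment in services files
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String service = args.length > 0 ? args[0] : "org.apache.hadoop.fs.FileSystem";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        for (URL url : find(service, loader)) {
            System.out.println(url);                   // the URL shows which jar carries the file
            for (String name : providers(url)) {
                System.out.println("    " + name);
            }
        }
    }
}
```

Each printed URL names the jar that contains the file, so an unexpected entry such as an in-house jar listing S3FileSystem should stand out. The offending jar can then be fixed or excluded from the build.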
08-03-2015
06:07 AM
I still could not fix the issue, but I dug a bit further into the code. java.util.ServiceLoader loads all registered implementations of a given class, here org.apache.hadoop.fs.FileSystem. To know which FileSystem classes to load, it reads the file META-INF/services/org.apache.hadoop.fs.FileSystem in each jar on the classpath.

hadoop-client 2.6.0-mr1-cdh5.4.4 pulls in the following Hadoop jars that contain a META-INF/services/org.apache.hadoop.fs.FileSystem file:

1. hadoop-common-2.6.0-cdh5.4.4.jar
   org.apache.hadoop.fs.LocalFileSystem
   org.apache.hadoop.fs.viewfs.ViewFileSystem
   org.apache.hadoop.fs.ftp.FTPFileSystem
   org.apache.hadoop.fs.HarFileSystem
2. hadoop-hdfs-2.6.0-cdh5.4.4.jar
   org.apache.hadoop.hdfs.DistributedFileSystem
   org.apache.hadoop.hdfs.web.HftpFileSystem
   org.apache.hadoop.hdfs.web.HsftpFileSystem
   org.apache.hadoop.hdfs.web.WebHdfsFileSystem
   org.apache.hadoop.hdfs.web.SWebHdfsFileSystem

I can't find any other services files in the context, but FileSystem.get() still tries to load org.apache.hadoop.fs.s3.S3FileSystem. Why is this happening? Where does this information come from? If this were a general issue, everyone using the Cloudera 5.4.x client jars would have this problem, which I can't imagine. So I still think there is a general error in my setup; I just don't know which one.

Regards, Lars
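The mechanism described above can be reproduced without Hadoop. The following is a minimal sketch using only the JDK: it registers a provider for java.lang.Runnable in a throwaway services file and lets ServiceLoader try to load it. The class MissingProviderDemo and the provider name com.example.MissingProvider are invented for this example; the error FileSystem.get() raises in the real setup is presumably of the same kind, but that is an assumption.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

// Hypothetical demo, not Hadoop code: shows what ServiceLoader does when a
// services file names a class that is not on the classpath.
class MissingProviderDemo {

    /** Returns the error message from ServiceLoader, or null if loading succeeded. */
    public static String loadWithMissingProvider() throws IOException {
        // Build a throwaway classpath root that contains only a services file.
        Path root = Files.createTempDirectory("spi-demo");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        // Invented provider name; no such class exists.
        Files.write(services.resolve("java.lang.Runnable"),
                "com.example.MissingProvider\n".getBytes(StandardCharsets.UTF_8));

        try (URLClassLoader loader = new URLClassLoader(new URL[] { root.toUri().toURL() })) {
            try {
                // Iterating forces ServiceLoader to load each listed class.
                for (Runnable provider : ServiceLoader.load(Runnable.class, loader)) {
                    provider.run();
                }
                return null; // every listed provider could be loaded
            } catch (ServiceConfigurationError e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadWithMissingProvider());
    }
}
```

The point of the sketch is that a single services file anywhere on the classpath is enough to trigger the failure, even if the code never uses the class it names. The printed message names the missing class, which helps to tie the failure back to a specific services entry.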
07-28-2015
06:00 AM
Hi, we recently upgraded our Cloudera cluster from 4.5.x to 5.4.3. I upgraded the clients as well, following the Maven client documentation for Cloudera 5: I simply included hadoop-client.jar in version 2.6.0-mr1-cdh5.4.3.

Everything worked fine except the calls to org.apache.hadoop.fs.FileSystem.get(), which fail with an exception saying that FileSystem providers such as S3FileSystem, KosmosFileSystem etc. are not available. I noticed that these classes were included in hadoop-client 2.0.0-mr1-cdh4.5.0 (hadoop-common) but are no longer in the 5.x.x versions. Some moved to different jars (S3FileSystem to hadoop-aws); others, like KosmosFileSystem, I could not find at all. If I mix 4.5.x and 5.4.x jars, things start to get messy (what a surprise :-)).

In general I am wondering why I have to provide these classes if I don't use them. I looked in the Hadoop config files (core-site, core-default, hdfs-site) for a property that controls which FileSystem providers to load, but was not able to find one.

So in the end I have two questions:

1. Do I really have to provide all FileSystem provider classes, and if so, how do I find out where, for example, KosmosFileSystem is located?
2. How do I tell the FileSystem class which providers to support?

In general, what is the best practice for a client that uses FileSystem.get() to access the required file system? I'm looking forward to any help.

Regards, Lars
Labels: Apache Hadoop, HDFS