Experimental:
Apache Hadoop Distributed File System (HDFS) is a popular file system in the big data world. HDFS federation provides the flexibility to configure multiple namenodes and improves the overall scalability of the cluster. ViewFileSystem was introduced to access a federated HDFS cluster in a unified way. However, most existing HDFS users did not migrate to it for the following reasons: it is a new scheme, so users have to update their path URIs (for example, Hive stores fully qualified path URIs in its metastore), and the mount-table configurations need to be copied to all client nodes. To solve these problems, we enhanced ViewFs and introduced the ViewDistributedFileSystem. Now users can initialize the file system with the "hdfs" scheme and configure the mount points. We also introduced central mount-table reading functionality: users can upload the mount-table configurations into a Hadoop compatible file system (for example, HDFS) and configure the path in the core-site.xml file.
Enable the mounting functionality with HDFS
The new file system implementation class ViewDistributedFileSystem is an extension to the DistributedFileSystem class and is backed by ViewFileSystemOverloadScheme to bring the client-side mounting functionality. To enable it, add the following configuration in core-site.xml.
<property>
<name>fs.hdfs.impl</name>
<value>org.apache.hadoop.hdfs.ViewDistributedFileSystem</value>
</property>
Adding the required mount links is discussed in the following section.
Note: if no mount links are added, it works exactly the same as the DistributedFileSystem class.
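As a quick check, here is a minimal Java sketch (the cluster URI hdfs://ns1/ and the listing path are placeholders, not taken from the original setup) that initializes the file system with the hdfs scheme; with the property above and no mount links, it behaves just like DistributedFileSystem:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class InitViewDfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same effect as the core-site.xml property above.
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.ViewDistributedFileSystem");

    // Placeholder cluster URI.
    FileSystem fs = FileSystem.get(URI.create("hdfs://ns1/"), conf);
    System.out.println("Implementation: " + fs.getClass().getName());

    // With no mount links configured, this lists the root of the underlying
    // HDFS cluster exactly as DistributedFileSystem would.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}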
Mount Point Configurations
Adding mount point configurations is straightforward. The mount point configuration pattern is the same as in the ViewFs world, so if you are familiar with ViewFs mount point configurations, you can copy the same configurations and update the paths as needed.
The mount point configurations can be added to a separate XML file (for example, mount-table.1.xml) and the file kept in HDFS (or in any Hadoop compatible file system). Then, configure the path in core-site.xml.
<property>
<name>fs.viewfs.mounttable.path</name>
<value>hdfs://cluster/config/mount-table-dir/mount-table.<versionNumber>.xml
</value>
</property>
Note: versionNumber is an integer, and users should increment it whenever they change the mount tables so that ViewFs loads the latest version of the file when new clients are initialized.
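As a sketch of how the central mount-table file might be published (all paths here are hypothetical examples), the file can be copied into HDFS with the standard FileSystem API and fs.viewfs.mounttable.path then pointed at the newly uploaded version:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PublishMountTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Matches the cluster URI used in fs.viewfs.mounttable.path above.
    FileSystem fs = FileSystem.get(URI.create("hdfs://cluster/"), conf);

    Path dir = new Path("/config/mount-table-dir");
    fs.mkdirs(dir);
    // Hypothetical local file; bump the version number on every change.
    fs.copyFromLocalFile(new Path("file:///tmp/mount-table.2.xml"),
        new Path(dir, "mount-table.2.xml"));
  }
}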
Alternatively, users can keep the mount table configurations in core-site.xml itself and make sure all client nodes have the same core-site.xml configurations.
Federated HDFS Cluster
In HDFS federation, we can have multiple namespaces sharing the same set of datanodes. Since there are multiple namespaces, ViewFileSystem is the way to access these clusters in a unified way. However, due to the issues mentioned above, adoption was limited. Now, with the ViewDistributedFileSystem (backed by ViewFileSystemOverloadScheme), those issues have been resolved and users can continue to use the same HDFS URI paths.
<property>
<name>fs.viewfs.mounttable.ns1.link./nn1</name>
<value>hdfs://nn1/</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.link./nn2</name>
<value>hdfs://nn2</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.link./nn3</name>
<value>hdfs://nn3</value>
</property>
Fig: 1
We can treat each namespace as a different volume and configure them as mount points. When a path starts with hdfs://ns1/nn1/, it will go to the hdfs://nn1 cluster. Similarly, when a path starts with hdfs://ns1/nn2/, it will go to hdfs://nn2, and the same happens for nn3. If users want to use their existing paths as default paths, they can configure linkFallback, as discussed later in this article. Even though the original motivation was HDFS federation, the ViewFileSystem design supports all Hadoop compatible file systems.
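The same mount links can also be set programmatically on the client configuration. The sketch below (namespace names as in Fig: 1; the /nn1/reports path is hypothetical) shows a listing under hdfs://ns1 being served by the first namespace:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FederationMounts {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.ViewDistributedFileSystem");
    // Same mount links as in Fig: 1.
    conf.set("fs.viewfs.mounttable.ns1.link./nn1", "hdfs://nn1/");
    conf.set("fs.viewfs.mounttable.ns1.link./nn2", "hdfs://nn2");
    conf.set("fs.viewfs.mounttable.ns1.link./nn3", "hdfs://nn3");

    FileSystem fs = FileSystem.get(URI.create("hdfs://ns1/"), conf);
    // Resolved client-side to hdfs://nn1/reports.
    for (FileStatus status : fs.listStatus(new Path("/nn1/reports"))) {
      System.out.println(status.getPath());
    }
  }
}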
Cloud Storage Environment
The following example configurations show how to add mount links to a virtual/logical HDFS URI (configured as fs.defaultFS=hdfs://ns1) that point to other object store buckets (s3a://bucketX/, s3a://bucketY/, s3a://bucketA).
<property>
<name>fs.viewfs.mounttable.ns1.link./data/marketing</name>
<value>s3a://bucketX/data/marketing</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.link./data/sales</name>
<value>s3a://bucketY/data/sales</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.link./backup</name>
<value>s3a://bucketA/backup/</value>
</property>
When the user accesses hdfs://ns1/data/marketing, the call will be delegated to s3a://bucketX/data/marketing. Similarly, when the user accesses hdfs://ns1/data/sales or hdfs://ns1/backup, the call will be delegated to s3a://bucketY/data/sales or s3a://bucketA/backup/ respectively. The hostname in the URI (hdfs://ns1/...) should be the mount-table name in the mount link configurations. If users want the cluster URI to be the same as one of their existing clusters' URIs, they can configure one of the mount points to point to that same cluster.
However, please note that with these configurations, the user's view at root consists only of the mount points. That means that if a user runs 'ls' on root, only the mount link paths are visible.
[root@umag-1 ~]# sudo -u hdfs hdfs dfs -ls /
Found 2 items
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /data
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /backup
Fig: 2
Let's consider a few operations to understand where each operation will be delegated based on the mount links.
As shown in Fig: 2, if users do not want to use HDFS scheme paths to initialize the file system, they can use any other scheme. Let's say a user wants to use the s3a scheme to initialize ViewFS and the URI is s3a://bucketAll/; then make sure to configure:
fs.s3a.impl=org.apache.hadoop.fs.viewfs.ViewFileSystemOverloadScheme
Also, the mount table name should be bucketAll in the mount point configurations. Now, when the user calls s3a://bucketAll/data/marketing, the call will be delegated to s3a://bucketX/data/marketing. Similarly, s3a://bucketAll/data/sales and s3a://bucketAll/backup will be delegated to s3a://bucketY/data/sales and s3a://bucketA/backup respectively.
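Here is a minimal sketch of that s3a-based setup (bucket names as above; everything else is hypothetical), again with the mount links set on the client configuration:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BucketAllView {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Overload the s3a scheme with the client-side mounting file system.
    conf.set("fs.s3a.impl",
        "org.apache.hadoop.fs.viewfs.ViewFileSystemOverloadScheme");
    // The mount-table name must match the URI authority: bucketAll.
    conf.set("fs.viewfs.mounttable.bucketAll.link./data/marketing",
        "s3a://bucketX/data/marketing");
    conf.set("fs.viewfs.mounttable.bucketAll.link./data/sales",
        "s3a://bucketY/data/sales");
    conf.set("fs.viewfs.mounttable.bucketAll.link./backup",
        "s3a://bucketA/backup/");

    FileSystem fs = FileSystem.get(URI.create("s3a://bucketAll/"), conf);
    // Delegated to s3a://bucketX/data/marketing.
    System.out.println(fs.exists(new Path("/data/marketing")));
  }
}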
The user's view is a merged view of mount links and linkFallback
The configurations below highlight the case where users may want to add mount links on top of a base cluster. Let's say the user has an existing HDFS cluster (base cluster) with URI hdfs://ns1/. The base cluster can be added as the default cluster in ViewFS via the linkFallback configuration. Please check the configurations below and Fig: 3 for a better understanding.
<property>
<name>fs.viewfs.mounttable.ns1.link./data</name>
<value>o3fs://bucket.vol.ozone1/data</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.link./backup</name>
<value>hdfs://ns2/backup/</value>
</property>
<property>
<name>fs.viewfs.mounttable.ns1.linkFallback</name>
<value>hdfs://ns1/</value>
</property>
Fig: 3
Now the user can continue to work with their base cluster. Any call with a path that does not start with a mount link (/data or /backup) will automatically go to the linkFallback (hdfs://ns1). Here, we don't need to add a /user mapping: even though there is no /user entry in the mount links, the call goes to hdfs://ns1/user because hdfs://ns1 is the linkFallback and /user exists in the base cluster.
If a user runs 'ls' on root, they can see the merged view of the mount link paths and the linkFallback cluster.
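A hedged sketch of the Fig: 3 setup (the /user/alice path is hypothetical): paths under the mount links go to the mounted targets, while everything else falls back to the base cluster:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FallbackExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.ViewDistributedFileSystem");
    // Mount links and fallback from Fig: 3.
    conf.set("fs.viewfs.mounttable.ns1.link./data", "o3fs://bucket.vol.ozone1/data");
    conf.set("fs.viewfs.mounttable.ns1.link./backup", "hdfs://ns2/backup/");
    conf.set("fs.viewfs.mounttable.ns1.linkFallback", "hdfs://ns1/");

    FileSystem fs = FileSystem.get(URI.create("hdfs://ns1/"), conf);
    // No /user mount link, so this goes to the fallback: hdfs://ns1/user/alice.
    System.out.println(fs.exists(new Path("/user/alice")));
    // /backup is mounted, so this goes to hdfs://ns2/backup.
    System.out.println(fs.exists(new Path("/backup")));
  }
}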
[root@umag-1 ~]# sudo -u hdfs hdfs dfs -ls /
Found 4 items
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /user
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /data
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /backup
drwxrwxrwx - hdfs hdfs 0 2020-07-24 23:22 /warehouse
In the above listing, /backup exists in the base cluster. There is also a mount link configured for it, that is, the /backup folder is mounted to ns2. So, any calls starting with /backup go to ns2, and files under the /backup folder in the base cluster (ns1) will be shaded.
Now the question is: in Fig: 3 above, how can we access the backupFile from the original base cluster?
Since we added a mount link for the /backup folder, calls are always redirected to the mounted target file system, i.e., ns2 here. The /backup folder in the base cluster is shaded by the mount point. Since the base cluster is HDFS and ViewDistributedFileSystem also points to HDFS, we need a special way to access the child cluster directly. To bypass the mount file system route, we can use the option below to access the shaded files under the mount directory.
[root@umag-1 ~]# sudo -u hdfs hdfs dfs -D fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem -ls /backup
We expect this to be done by admins only, because this is an underlying cluster from the user's point of view. Please read the blog post about the ViewDistributedFileSystem use case with Hive to understand how it can be used in real-world scenarios. All of this work has been done under the JIRA HDFS-15289 and will be available in the upcoming Apache Hadoop 3.4 release.