Hi, I have a requirement that sounds similar to symlinking in Hadoop/HDFS.
There are two production clusters: Cluster 1 and Cluster 2.
I want to read Cluster 1's data from Cluster 2 without copying it.
What came to my mind is: can I run `hadoop fs -ls hdfs://namespace1/user/xyz` on Cluster 2?
I understand that Cluster 2 won't know what namespace1 is, but I thought of appending the namespace-related properties to Cluster 2's hdfs-site.xml (via the advanced configuration snippet for gateway configs).
Is this possible?
Is there any other alternative? HFTP? (I have never tried either.)
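For what it's worth, one way to test a cross-cluster read without touching any config is to spell out the remote NameNode's RPC address directly in the URI. This is only a sketch: the hostname is a placeholder and the default RPC port 8020 is an assumption (some distributions use 9000). Note that naming a single host this way bypasses HA failover, since you are pinning the request to one NameNode.

```shell
# Read Cluster 1's data from a Cluster 2 gateway by naming the remote
# NameNode explicitly (hostname and port below are hypothetical):
hadoop fs -ls hdfs://nn1.cluster1.example.com:8020/user/xyz

# If the data eventually does need to be copied, DistCp can run between
# clusters using the same fully qualified source URI:
hadoop distcp hdfs://nn1.cluster1.example.com:8020/user/xyz hdfs:///user/xyz
```

If this works from a Cluster 2 host, network connectivity and permissions are fine, and the remaining problem is only making the logical nameservice name resolvable on the client.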
I have mentioned that, to read data from Cluster 1, I am using `hdfs://nameservice1/user/abc` on Cluster 2.
nameservice1 refers to Cluster 1's NameNodes, so what is the issue?
I was replying to the idea of a symlink.
If you just want to access Cluster 1's data from Cluster 2 (or anywhere else), make sure the client's HDFS configuration files point to Cluster 1. I think the relevant file is hdfs-site.xml.
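A minimal sketch of what that could look like in the client-side hdfs-site.xml on Cluster 2, assuming Cluster 1 runs an HA pair of NameNodes behind the logical name nameservice1 (the hostnames below are hypothetical):

```xml
<!-- Appended to the client's hdfs-site.xml on Cluster 2.
     Hostnames are placeholders; use Cluster 1's actual NameNode hosts. -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
  <value>nn1.cluster1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
  <value>nn2.cluster1.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.nameservice1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

One caveat: if Cluster 2 is itself HA, `dfs.nameservices` must list both logical names (e.g. `nameservice1,nameservice2`), otherwise the local cluster's name stops resolving.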
I suggest you create two Linux user accounts, one for cluster1 and one for cluster2, and configure each account's .bashrc accordingly.
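For instance, each account's .bashrc could point `HADOOP_CONF_DIR` at a per-cluster copy of the client configuration. The directory paths below are assumptions; substitute wherever you keep each cluster's config files.

```shell
# In the cluster1 account's ~/.bashrc (path is hypothetical):
export HADOOP_CONF_DIR=/etc/hadoop/conf.cluster1

# In the cluster2 account's ~/.bashrc:
export HADOOP_CONF_DIR=/etc/hadoop/conf.cluster2

# Unqualified paths then resolve against whichever cluster that
# account's config points to:
hadoop fs -ls /user/abc
```

This keeps the two clusters' configs completely separate, at the cost of switching accounts (or at least `HADOOP_CONF_DIR`) whenever you change clusters.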