Could not list the files in a remote HDFS cluster

Explorer

More info on the clusters:

We have clusters A and B:

Cluster A - HDP 2.3 - not Kerberized

Cluster B - HDP 2.4 - Kerberized

Action:

  • We are trying to list the directories in cluster A by issuing one of the commands below from one of the client machines of cluster B (see the sketch after this list):

"hadoop fs -Dfs.defaultFS=<A namenode IP:port> -ls /"

OR

"hadoop fs -ls hdfs://<A namenode IP:port>/"

  • We are able to list the directories of cluster B from a cluster A client machine.
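
For reference, a minimal sketch of the two invocations side by side, using a hypothetical host/port for cluster A's NameNode (placeholder values, not taken from this thread). Note that the generic -D option has to come before the subcommand's own arguments for FsShell to pick it up:

    # Hypothetical NameNode address for cluster A (placeholder)
    ANN=hdfs://nn-a.example.com:8020

    # These two forms should be equivalent; -D must precede -ls so the
    # generic options parser applies it before the command runs
    hadoop fs -D fs.defaultFS=${ANN} -ls /
    hadoop fs -ls ${ANN}/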

Expected result:

The command should list the directories from cluster A's HDFS.

Actual Result:

We are getting the listing of cluster B (the local cluster) instead of cluster A. The reverse listing works fine!

Request:

Please provide some pointers on what could be preventing the -D option from taking effect, or what is actually stopping us from listing the directories in the remote cluster.

1 ACCEPTED SOLUTION

Explorer

SOLUTION FOUND:

Since we copy data from the unsecured cluster to the secured cluster, we needed to set the property "ipc.client.fallback-to-simple-auth-allowed" to true on the secured cluster. We also added the host entries of the unsecured cluster on the secured cluster. Now everything is working fine!
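
As an illustration only, here is a minimal sketch of passing that property from the secured cluster's client side; the host names and paths are placeholders, and for a permanent setting the property would normally go into the client's core-site.xml rather than onto each command:

    # One-off listing of the unsecured cluster A, letting the Kerberized client
    # fall back to simple auth for that connection (placeholder host/port)
    hadoop fs -D ipc.client.fallback-to-simple-auth-allowed=true \
        -ls hdfs://nn-a.example.com:8020/

    # The same generic option applies when copying data across with distcp
    hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true \
        hdfs://nn-a.example.com:8020/source/path \
        /target/path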


4 REPLIES

Expert Contributor

My guess is that your local client config values are superseding your use of the "-D" option to override the defaultFS parameter, i.e. the local client configuration may be taking priority.
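
One way to check which value actually wins on the client you are running from (these are standard HDFS client commands; the alternate conf directory below is a placeholder):

    # Print the fs.defaultFS value the client configuration actually resolves to
    hdfs getconf -confKey fs.defaultFS

    # Rule out local overrides entirely by pointing the client at a separate conf
    # directory that contains cluster A's core-site.xml and hdfs-site.xml
    hadoop --config /etc/hadoop/conf.clusterA fs -ls /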

I would have expected your second command with "hadoop fs -ls" to work and display the remote cluster's directory listing. Perhaps there was a typo or some other reason why it is not being picked up?

Could you alternatively use the WebHDFS REST API (from bash or Python) to list the directories?

https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#LISTSTATUS
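
For example, a rough sketch of the LISTSTATUS call with a placeholder NameNode host; 50070 is the usual NameNode HTTP port on HDP 2.x, and the user.name query parameter only applies to a non-Kerberized endpoint:

    # List the root directory of the remote (non-Kerberized) cluster over WebHDFS
    curl -s "http://nn-a.example.com:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs"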

Explorer

Thanks for the alternatives, Wes Floyd. It again works only one way.

WebHDFS:

Yes, WebHDFS works from cluster A again.

From cluster B we use a numeric user ID, which is why the WebHDFS call is failing. I see there is a patch for this (https://issues.apache.org/jira/browse/HDFS-4983). Is this patch not available as part of HDP 2.4?

Other cases:

We tried adding the -fs option; same result.

The weirdest part is that "hadoop fs -ls hdfs://a.a.a.a:8020" still does not throw an error; it still returns the listing of cluster B.

Extra Cluster Info:

Cluster B has WanDisco installed; is this setup causing any issue? I believe we use the WanDisco Fusion client JAR instead of the Hadoop JARs at the backend. Any pointers here?

core-site.xml entries:

<property>
  <name>fs.hdfs.impl</name>
  <value>com.wandisco.fs.client.FusionHdfs</value>
</property>

<property>
  <name>fusion.handshakeToken.dir</name>
  <value>##SOME DIR##</value>
</property>

<property>
  <name>fusion.ihc.http.policy</name>
  <value>HTTP_ONLY</value>
</property>

<property>
  <name>fusion.ihc.ssl.enabled</name>
  <value>true</value>
</property>
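
Given the first entry above, one quick way to confirm which FileSystem implementation is actually serving hdfs:// URIs on this client (hdfs getconf is a standard client tool; the class name is taken from the entries above):

    # If this prints com.wandisco.fs.client.FusionHdfs, every hdfs:// URI on this
    # client is handled by the Fusion client rather than the stock DistributedFileSystem
    hdfs getconf -confKey fs.hdfs.impl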

Explorer

SOLUTION FOUND:

Since we copy data from the unsecured cluster to the secured cluster, we needed to set the property "ipc.client.fallback-to-simple-auth-allowed" to true on the secured cluster. We also added the host entries of the unsecured cluster on the secured cluster. Now everything is working fine!

Explorer

What do you mean by "added the host entries of the unsecured cluster on the secured cluster"? That's kind of weird. I thought we only add hosts so that they can be part of the current cluster.... If you did not add the unsecured hosts, what problem did you encounter?