I have two independent clusters and need to run distcp between them.
ClusterA (has webhdfs enabled)
ClusterB (which needs to fetch data)
On ClusterB, I can access the data using the Hadoop CLI:
- hadoop fs -ls hdfs://clusterA/user/sanjeev/files/yyyy/mm/dd/hh
- hadoop fs -ls webhdfs://clusterA/user/sanjeev/files/files/yyyy/mm/dd/hh
I can also run distcp via the CLI on ClusterB,
but when I schedule it through Oozie, using the Oozie distcp action, I get:
Error: E0803 : E0803: IO error, Unauthorized connection for super-user: oozie from IP 220.127.116.11
Does ClusterA require my Oozie hostname/IP to be present in its core-site.xml?
Or, given that webhdfs is enabled on ClusterA, is there a better way to do distcp to ClusterB?
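
For reference, the core-site.xml entries I have in mind are the Hadoop proxy-user settings for the oozie user. This is only a sketch of what I think they would look like: the host name is a placeholder, and I am assuming ClusterA's NameNode is the one rejecting the request, which I have not confirmed.

```xml
<!-- core-site.xml on the cluster that rejects the request (assumed here to be ClusterA) -->
<property>
  <name>hadoop.proxyuser.oozie.hosts</name>
  <!-- placeholder: host name or IP of the Oozie server -->
  <value>oozie-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.oozie.groups</name>
  <!-- placeholder: groups whose users oozie may impersonate; * allows all groups -->
  <value>*</value>
</property>
```

As far as I understand, a change like this would only take effect after the NameNode reloads its proxy-user configuration or is restarted.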