Running the Hadoop distcp command between clusters that have NameNode High Availability (HA) enabled normally requires the following:
* Adding the nameservice information of both the source and the destination cluster to the server-side configuration
* Restarting the affected services
This is because the YARN ResourceManager (RM) renews delegation tokens on behalf of applications, so it must be able to resolve the NameNodes of the remote cluster.
To avoid this server-side configuration, a MapReduce job can send the required configuration to the RM at runtime through the mapreduce.job.send-token-conf property, and the RM then uses it to renew the tokens.
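As a rough sketch of the runtime approach, the function below passes the destination nameservice definition and mapreduce.job.send-token-conf to distcp as -D options. The nameservice names (nsA, nsB), host names, paths, and the HADOOP_CMD, DST_NN1 and DST_NN2 variables are placeholders introduced for illustration. The regex follows the minimal example given in the property's description in mapred-default.xml and may need adjusting for a particular cluster.

```shell
# Regex of configuration keys that the job forwards to the ResourceManager
# so it can reach the remote HA NameNodes when renewing delegation tokens.
SEND_TOKEN_CONF='dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$|dfs.namenode.kerberos.principal'

# Usage: distcp_ha SRC_NAMESERVICE DST_NAMESERVICE SRC_PATH DST_PATH
# Hypothetical helper; assumes the source nameservice is already defined
# in the local client configuration.
distcp_ha() {
  src_ns=$1; dst_ns=$2; src_path=$3; dst_path=$4
  # HADOOP_CMD defaults to the real CLI; set it to "echo hadoop" for a dry run.
  ${HADOOP_CMD:-hadoop} distcp \
    -D "dfs.nameservices=${src_ns},${dst_ns}" \
    -D "dfs.ha.namenodes.${dst_ns}=nn1,nn2" \
    -D "dfs.namenode.rpc-address.${dst_ns}.nn1=${DST_NN1:-dst-nn1.example.com:8020}" \
    -D "dfs.namenode.rpc-address.${dst_ns}.nn2=${DST_NN2:-dst-nn2.example.com:8020}" \
    -D "dfs.client.failover.proxy.provider.${dst_ns}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" \
    -D "mapreduce.job.send-token-conf=${SEND_TOKEN_CONF}" \
    "hdfs://${src_ns}${src_path}" \
    "hdfs://${dst_ns}${dst_path}"
}

# Dry run: print the assembled command instead of submitting the job.
HADOOP_CMD="echo hadoop" distcp_ha nsA nsB /data/source /data/target
```

Because the forwarded keys are selected from the job configuration, the destination nameservice properties have to be present on the job itself, which is why they are passed alongside the regex here.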
The same property can be used from an Oozie DistCp action. The Git repo contains an Oozie DistCp action template that supports a basic DistCp action in a Kerberos environment and can be parameterized at runtime, so end users can run it on their own schedule.
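For orientation only, here is a minimal sketch of what such a workflow could look like. It is not the template from the repository. The workflow name, the srcPath, destPath and destNameNode parameters, the nameservices, and the Oozie URL are all hypothetical. The script writes a workflow.xml to a temporary directory and prints a submit command rather than running it.

```shell
# Write a minimal DistCp workflow to a scratch directory (hypothetical names).
WF_DIR=$(mktemp -d)

cat > "${WF_DIR}/workflow.xml" <<'EOF'
<workflow-app name="distcp-ha-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="distcp-node"/>
  <action name="distcp-node">
    <distcp xmlns="uri:oozie:distcp-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <!-- Secure clusters: obtain delegation tokens for both file systems. -->
        <property>
          <name>oozie.launcher.mapreduce.job.hdfs-servers</name>
          <value>${nameNode},${destNameNode}</value>
        </property>
        <!-- Forward matching HDFS client settings to the ResourceManager. -->
        <property>
          <name>mapreduce.job.send-token-conf</name>
          <value>dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$|dfs.namenode.kerberos.principal</value>
        </property>
      </configuration>
      <arg>${srcPath}</arg>
      <arg>${destPath}</arg>
    </distcp>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>DistCp failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF

# Print (not run) a submit command that supplies the paths at runtime.
# The workflow must first be uploaded to the HDFS path that
# oozie.wf.application.path in job.properties points to.
echo oozie job -oozie "${OOZIE_URL:-http://oozie.example.com:11000/oozie}" \
  -config job.properties \
  -D srcPath=hdfs://nsA/data/source \
  -D destPath=hdfs://nsB/data/target \
  -run
```

The destination nameservice properties (dfs.nameservices, dfs.ha.namenodes.*, and so on) would also need to be added to the action configuration or be present in the client configuration, since the regex only forwards keys that exist in the job configuration.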