Repo Description

Background:

To run the Hadoop distcp command between clusters that have NameNode High Availability (HA) enabled, the following is normally required:

* Adding the nameservice information of both the source and destination clusters to the server-side configuration

* Restarting the affected services

The reason is that the YARN ResourceManager (RM) renews delegation tokens on behalf of applications, so it must be able to resolve the nameservices of both clusters.

Solution:

To avoid server-side configuration changes, MapReduce jobs can send the required configuration to the RM at runtime through the mapreduce.job.send-token-conf property. The RM then uses the configuration it receives to renew the tokens.
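
As an illustration, the value of mapreduce.job.send-token-conf is a regular expression, and every job configuration key that matches it is sent to the RM. The fragment below is a sketch that uses the example expression given in the Hadoop mapred-default.xml description of this property as the minimum set for reaching a remote HA HDFS cluster. Treat it as a starting point, since the exact set of keys your clusters need is an assumption that may differ.

```xml
<!-- Sketch only: keys matching this regex are sent to the RM for token renewal. -->
<!-- The expression is the example from mapred-default.xml; adjust it to your environment. -->
<property>
  <name>mapreduce.job.send-token-conf</name>
  <value>dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$|dfs.namenode.kerberos.principal</value>
</property>
```

The matching keys (for example the dfs.nameservices list covering both clusters) still have to be present in the job's own configuration for there to be anything to send.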

We can leverage the same mechanism through the Oozie DistCp action. The Git repo contains an Oozie DistCp action template that supports a basic Oozie DistCp action in a Kerberos environment and can be parameterized at runtime, so end users can run it on their own schedule. The template consists of two files:

  1. job.properties
  2. workflow.xml
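
For orientation, a DistCp action of the kind described above might look roughly like the following. This is a hypothetical sketch and not the contents of the repo's workflow.xml. The parameters ${jobTracker}, ${nameNode}, ${nameNode1}, ${nameNode2}, ${sendTokenConf}, ${sourcePath} and ${destPath} are assumed names that would be supplied through job.properties, and passing mapreduce.job.send-token-conf through the action's configuration block is likewise an assumption to verify against your Oozie version.

```xml
<!-- Hypothetical sketch of an Oozie DistCp workflow; all ${...} names are assumptions. -->
<workflow-app name="distcp-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="distcp-node"/>
    <action name="distcp-node">
        <distcp xmlns="uri:oozie:distcp-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Lets the launcher obtain delegation tokens for both secure clusters. -->
                <property>
                    <name>oozie.launcher.mapreduce.job.hdfs-servers</name>
                    <value>${nameNode1},${nameNode2}</value>
                </property>
                <!-- Regex of configuration keys to send to the RM for token renewal. -->
                <property>
                    <name>mapreduce.job.send-token-conf</name>
                    <value>${sendTokenConf}</value>
                </property>
            </configuration>
            <!-- First arg is the source, second arg is the destination. -->
            <arg>${sourcePath}</arg>
            <arg>${destPath}</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>DistCp failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Keeping the cluster-specific values in job.properties is what allows the same workflow.xml to be reused across source and destination pairs.
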
Repo Info
GitHub repo URL: https://github.com/saumilmayani/oozie-distcp_template.git
GitHub account name: saumilmayani
Repo name: oozie-distcp_template.git