How to handle remote MapReduce job?

I am trying to use Airflow to schedule Scalding MapReduce jobs on my Hadoop cluster.

In terms of architecture, I have one node with Airflow installed and two HA nodes running YARN.

I want to run the command that starts the MapReduce job remotely, through SSH or some other means.

What is the best way / best practice to do this?

Some things I am thinking of:

  • Storing a private key on the Airflow server that is authorized on the other node (possible security risk)
  • Installing the YARN client on the Airflow node
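
As a rough sketch of the first option, the SSH invocation could be assembled along the following lines. The host, user, key path, jar path, and job class are all hypothetical placeholders, and the `hadoop jar ... com.twitter.scalding.Tool` form assumes the job is packaged as a fat jar launched through Scalding's `Tool` entry point; adjust it to however the job is actually started.

```python
import shlex


def build_ssh_command(host, user, key_path, jar_path, job_class, job_args):
    """Return the argv list for launching a Scalding job on a remote node over SSH.

    All arguments are illustrative placeholders, not values from a real cluster.
    """
    # Remote command: assumes a fat jar run through Scalding's Tool class in HDFS mode.
    remote = ["hadoop", "jar", jar_path, "com.twitter.scalding.Tool", job_class, "--hdfs"]
    remote += list(job_args)
    # Quote each token so the remote shell treats it as a single argument.
    remote_cmd = " ".join(shlex.quote(token) for token in remote)
    return [
        "ssh",
        "-i", key_path,          # dedicated key, authorized only on the target node
        "-o", "BatchMode=yes",   # fail instead of prompting for a password
        f"{user}@{host}",
        remote_cmd,
    ]


if __name__ == "__main__":
    cmd = build_ssh_command(
        host="edge1.example.com",            # hypothetical node with the Hadoop client
        user="etl",                          # hypothetical service account
        key_path="/home/airflow/.ssh/id_etl",
        jar_path="/opt/jobs/wordcount.jar",
        job_class="com.example.WordCount",
        job_args=["--input", "/data/in", "--output", "/data/out"],
    )
    print(cmd)
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from an Airflow task. Alternatively, only the remote command string might be reused as the `command` of the SSH provider's `SSHOperator`, which keeps the key in an Airflow connection rather than on disk.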

Any insight would be much appreciated.
