I am trying to use Airflow to schedule Scalding MapReduce jobs on my Hadoop cluster.
In terms of architecture, I have one node with Airflow installed and two HA nodes running YARN.
I want to run the command that starts the MapReduce job remotely, over SSH or by some other means.
What is the best way / best practice to do this?
One approach I am considering is triggering the job over SSH from an Airflow task, as sketched below.
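For illustration, here is a minimal sketch of what I have in mind, using the SSHOperator from Airflow's SSH provider package. The connection ID (`hadoop_edge`), jar path, and job class are placeholders I made up; the assumption is that the SSH target is a node with the Hadoop client configured:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.ssh.operators.ssh import SSHOperator

# "hadoop_edge" is a placeholder Airflow connection (set up under
# Admin -> Connections) pointing at a cluster node that can submit
# jobs to YARN. The jar path and job class are placeholders too.
with DAG(
    dag_id="scalding_job",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_job = SSHOperator(
        task_id="run_scalding_job",
        ssh_conn_id="hadoop_edge",
        # Scalding jobs are typically launched with `hadoop jar`
        # via com.twitter.scalding.Tool; paths/args are examples only.
        command=(
            "hadoop jar /path/to/my-job-assembly.jar "
            "com.twitter.scalding.Tool com.example.MyJob "
            "--hdfs --input /data/in --output /data/out"
        ),
    )
```

This would make Airflow responsible only for scheduling and for shipping the launch command to the cluster, but I am not sure whether SSH is the recommended pattern here.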
Any insight would be much appreciated.