
Cluster of clusters?

Suppose I have more than one Hadoop cluster. Is there any way to run a MapReduce job (or Hive query) across the multiple clusters?


I might have more than one HDFS cluster for various admin or data organization reasons, but want to run a job that scans all the data. Perhaps there would be a small Hadoop cluster that is a front-end to the other (larger) clusters.


Has anyone heard of this? Does it make sense?


Thank you,




Re: Cluster of clusters?

Apache Hadoop currently has no way to spread a single job across multiple MR clusters. You can, however, use input URIs from multiple HDFS clusters in a single job.
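To illustrate the second point, here is a minimal sketch of building fully qualified input URIs for two HDFS clusters and joining them into the comma-separated form that a job's input setting accepts. The class name `MultiClusterInputs`, the NameNode host names, and port 8020 are assumptions made up for this example; substitute your own values.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;

// Hypothetical helper (not part of Hadoop): builds input strings that
// point at data living on more than one HDFS cluster.
final class MultiClusterInputs {
    private MultiClusterInputs() {}

    // Returns a fully qualified hdfs:// URI for an absolute path on the
    // cluster served by the given NameNode host and port.
    static String qualify(String nameNodeHost, int port, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + absolutePath);
        }
        try {
            return new URI("hdfs", null, nameNodeHost, port, absolutePath, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Joins the qualified URIs with commas, one entry per input location.
    static String joinInputs(List<String> qualifiedUris) {
        return String.join(",", qualifiedUris);
    }
}
```

In the job driver, the joined string can be passed to `FileInputFormat.setInputPaths(job, inputs)` from `org.apache.hadoop.mapreduce.lib.input`, or each URI can be added separately with `FileInputFormat.addInputPath(job, new Path(uri))`. The job itself still runs on one MR cluster, so data from the other cluster is read over the network rather than locally.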