New Contributor
Posts: 4
Registered: ‎12-13-2013

Cluster of clusters?

Suppose I have more than one Hadoop cluster. Is there any way to run a MapReduce job (or Hive query) across the multiple clusters?

 

I might have more than one HDFS cluster for various admin or data organization reasons, but want to run a job that scans all the data. Perhaps there would be a small Hadoop cluster that is a front-end to the other (larger) clusters.

 

Has anyone heard of this? Does it make sense?

 

Thank you,

Chuck

 

Posts: 1,825
Kudos: 406
Solutions: 292
Registered: ‎07-31-2013

Re: Cluster of clusters?

There's currently no method in Apache Hadoop that can span multiple MR clusters for a single job. You can, however, use input URIs from multiple HDFS clusters inside a single job.
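
To illustrate the second point, here is a minimal sketch in plain Java (no Hadoop dependency) that checks each input is a fully qualified hdfs:// URI and joins them into the comma-separated form that FileInputFormat.setInputPaths(Job, String) accepts. The class name, the helper, and the NameNode hosts (nn-a.example.com, nn-b.example.com) are made up for illustration. Keep in mind that the tasks still run on one cluster and read the other cluster's data over the network, so data locality is lost for the remote inputs.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiClusterInputs {

    // Hypothetical helper: validates and joins inputs from several HDFS clusters.
    public static String joinInputs(List<String> inputs) {
        List<String> checked = new ArrayList<>();
        for (String input : inputs) {
            URI uri = URI.create(input);
            // A path without a scheme and NameNode host would resolve against the
            // default file system of the cluster that runs the job, so reject it.
            if (!"hdfs".equals(uri.getScheme()) || uri.getHost() == null) {
                throw new IllegalArgumentException(
                        "Not a fully qualified HDFS URI: " + input);
            }
            checked.add(uri.toString());
        }
        return String.join(",", checked);
    }

    public static void main(String[] args) {
        // Assumed NameNode addresses, one per cluster.
        String inputs = joinInputs(Arrays.asList(
                "hdfs://nn-a.example.com:8020/data/logs",
                "hdfs://nn-b.example.com:8020/data/logs"));
        System.out.println(inputs);
        // With Hadoop on the classpath, the job driver could then call:
        //   FileInputFormat.setInputPaths(job, inputs);
    }
}
```

In a real driver you could equally call FileInputFormat.addInputPath(job, new Path(uri)) once per URI; the important part is that every path names its cluster explicitly.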