Hi. I have a question regarding Hadoop architecture. I have a 10-node cluster and I want to create some kind of sandboxes inside the cluster.
What do I mean by a sandbox? A space separated from the rest of the cluster's resources, where business users could create temporary databases and files and run jobs.
A simple solution would be creating a technical user for every sandbox, but I don't want to do it that way. Business users have their own accounts and I want them to run jobs using those accounts.
Maybe the picture will say more:
As you can see, the problem is that some jobs should be run by users in the main cluster space and some jobs should be run in a specific sandbox.
The question is: how can I achieve this? Does anyone have an idea?
Within a single cluster, the simplest way to segregate resources for multitenant use would be to use YARN queues. You can set minimum and maximum resource usage by queue and allow pre-emption if needed for critical jobs. If you need to limit access to specific hardware (e.g., a department has paid for its own physical nodes), you can use YARN node labels to isolate specific nodes.
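To make the queue idea more concrete, here is a minimal sketch in Python that generates Capacity Scheduler properties for one main queue and two sandbox queues. It assumes the cluster uses the Capacity Scheduler. The queue names (main, sandbox_a, sandbox_b), the percentages, and the user names are hypothetical placeholders, and the resulting key/value pairs would still have to be written into capacity-scheduler.xml by hand or by your config management tool.

```python
from typing import Dict, List, NamedTuple


class Queue(NamedTuple):
    name: str          # hypothetical queue name
    capacity: int      # guaranteed share of the cluster, in percent
    max_capacity: int  # upper limit the queue may grow to, in percent
    users: List[str]   # personal accounts allowed to submit to this queue


def capacity_scheduler_properties(queues: List[Queue]) -> Dict[str, str]:
    """Build Capacity Scheduler properties for queues directly under root."""
    total = sum(q.capacity for q in queues)
    if total != 100:
        raise ValueError(f"queue capacities must sum to 100, got {total}")

    prefix = "yarn.scheduler.capacity.root"
    props = {
        f"{prefix}.queues": ",".join(q.name for q in queues),
        # Assumption: child queues inherit the root ACL, which allows everyone
        # by default, so root is set to a single space (nobody) here.
        f"{prefix}.acl_submit_applications": " ",
    }
    for q in queues:
        if q.max_capacity < q.capacity:
            raise ValueError(f"{q.name}: maximum-capacity is below capacity")
        props[f"{prefix}.{q.name}.capacity"] = str(q.capacity)
        props[f"{prefix}.{q.name}.maximum-capacity"] = str(q.max_capacity)
        # Users submit under their own accounts; no technical user is needed.
        props[f"{prefix}.{q.name}.acl_submit_applications"] = ",".join(q.users)
    return props


# Hypothetical layout: one main queue plus two sandboxes.
QUEUES = [
    Queue("main", 70, 100, ["etl_team"]),
    Queue("sandbox_a", 15, 30, ["alice", "bob"]),
    Queue("sandbox_b", 15, 30, ["carol"]),
]
PROPS = capacity_scheduler_properties(QUEUES)

if __name__ == "__main__":
    for key, value in sorted(PROPS.items()):
        print(f"{key}={value!r}")
```

With something like this in place, a user chooses the queue at submit time, for example with -Dmapreduce.job.queuename=sandbox_a for MapReduce or --queue sandbox_a for spark-submit, and YARN rejects the job if that account is not in the queue's ACL. Queue ACLs are only enforced when yarn.acl.enable is true, and queue changes are usually picked up with yarn rmadmin -refreshQueues. Node labels and pre-emption are configured separately and are not covered by this sketch.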
If this response helps, please mark it as helpful so that it will appear in other searches - if not, please feel free to ask further questions as comments here!