
Limiting user resources for Zep->Livy->Spark->Yarn


We are currently on HDP 2.5 and looking at options for creating groups of users, based on experience, and limiting access to resources based on these groups. For example, beginner users might get 10 GB / 4 vCPUs, intermediate users 50 GB / 8 vCPUs, and expert users 200 GB / 16 vCPUs. The stack we are interested in is: Zeppelin -> Livy -> Spark -> YARN cluster

We initially looked at doing this via YARN queues (i.e. user A can only submit jobs to the beginner queue), but from what I can tell, the Zeppelin Spark interpreter config only allows one queue to be configured, so all jobs would be sent to the same queue regardless of user. Another option would be to run multiple Zeppelin instances, each configured with a different queue, but from what I can tell, if multiple Zeppelin instances are all managed via Ambari, they will all get the same config.

Am I off base? Is there a better way to limit resources (primarily memory) based on the user?

Thanks!

Accepted Solution

Re: Limiting user resources for Zep->Livy->Spark->Yarn

@Matt Cable

One way to do this is to configure multiple Livy interpreter instances (e.g. %livy_beginner, %livy_expert) in the same Zeppelin instance (follow the steps here: https://zeppelin.apache.org/docs/latest/manual/interpreters.html) and configure a separate YARN queue for each instance of the Livy interpreter. Since Livy also supports user impersonation, this also lets you restrict access to particular queues to particular users.
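
To make the idea concrete, here is a minimal sketch of the kind of session request that ends up at Livy for each tier. It uses the `queue`, `proxyUser`, `executorMemory`, `executorCores` and `numExecutors` fields of Livy's `POST /sessions` REST call. The queue names, the per-session sizes and the user-to-tier mapping are all assumptions made up for illustration, not values from this thread or from any HDP default.

```python
import json

# Hypothetical tiers: queue names and per-session sizes are illustrative only.
# The hard limit comes from the YARN queue's capacity, not from these numbers.
TIERS = {
    "beginner": {"queue": "beginner", "executorMemory": "2g",
                 "executorCores": 1, "numExecutors": 2},
    "intermediate": {"queue": "intermediate", "executorMemory": "4g",
                     "executorCores": 2, "numExecutors": 4},
    "expert": {"queue": "expert", "executorMemory": "8g",
               "executorCores": 2, "numExecutors": 8},
}

# Hypothetical user-to-tier assignment (in practice this might come from LDAP groups).
USER_TIER = {"alice": "beginner", "bob": "expert"}


def livy_session_request(user, kind="spark"):
    """Build the JSON body for Livy's POST /sessions for the given user."""
    tier = TIERS[USER_TIER[user]]
    body = {"kind": kind, "proxyUser": user}  # proxyUser = impersonated end user
    body.update(tier)                         # adds queue and executor sizing
    return body


if __name__ == "__main__":
    print(json.dumps(livy_session_request("alice"), indent=2))
```

Because the session runs as the impersonated user (`proxyUser`), a submit ACL on each YARN queue can reject users who pick an interpreter above their tier. The queue's configured capacity is what actually caps memory and vcores.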



Re: Limiting user resources for Zep->Livy->Spark->Yarn

Thanks Kshitij - will go down this route, as it seems to be a good fit.


Re: Limiting user resources for Zep->Livy->Spark->Yarn

@Matt Cable

Can you please accept the answer if it helped? Thanks in advance :-)
