About rpul

rpul · 07-07-2016

We're using the Capacity Scheduler on YARN with several queues. One of the queues is reserved for Spark notebooks (such as Jupyter or Zeppelin). Many of our users leave their notebooks open for days on end, and most of that time they are not using the CPU and memory they claimed. What would be a good configuration for this use case? Is it possible to configure YARN/Spark in such a way that inactive notebooks do not block other users?
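One direction that may fit this use case is Spark's dynamic allocation, which lets an application give idle executors back to YARN. The sketch below is only an illustration, not a tested recommendation for this cluster: the helper names `notebook_spark_conf` and `to_submit_args` and the default values are hypothetical, while the `spark.*` property names are standard Spark settings. It assumes the external shuffle service is already set up on the NodeManagers.

```python
def notebook_spark_conf(idle_timeout_s=60, min_executors=0, max_executors=10):
    """Build Spark properties that let an idle notebook release executors.

    Hypothetical helper; the defaults are illustrative, not tuned values.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        # Let Spark grow and shrink the executor count with the workload.
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation on YARN relies on the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        # Executors idle for longer than this are handed back to YARN.
        "spark.dynamicAllocation.executorIdleTimeout": f"{idle_timeout_s}s",
    }


def to_submit_args(conf):
    """Render the properties as repeated `--conf key=value` arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(notebook_spark_conf(idle_timeout_s=120))))
```

Even with these settings the driver (and its YARN application master container) stays allocated while the notebook is open, and executors holding cached data are not released by default, so this reduces rather than eliminates the footprint of an inactive notebook.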

rpul · 06-17-2016

@sseethana: Thanks for the info! Is any of this functionality already available in a released Hadoop distribution from Apache or Hortonworks? If so, is there any documentation or a getting-started guide?

rpul · 06-15-2016

@sseethana: You seem to have been working a lot on YARN-3611. Can you give an update on the current efforts and status of running Docker on YARN?

rpul · 06-13-2016

Thanks for the JIRAs, Alex! The most recent changes to them are over a year old. Do you think DCE will be supported on a kerberized cluster? DCE seems like the way to go for running isolated jobs in production while retaining data locality. Any thoughts on why these JIRAs haven't been picked up?

rpul · 06-10-2016

The Hadoop documentation states that DCE does not support clusters running in secure mode (Kerberos): https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html Are people working on this? Is there a way around this limitation?

Last Visited	‎08-22-2016 02:25 PM

Member Since	‎06-10-2016 12:56 PM
Posts	5
Kudos received	1

Cloudera Community

Configuring YARN queues for Spark notebooks

Re: Can I run DCE (Docker Container Executor) on Y...

Can I run DCE (Docker Container Executor) on Yarn ...