HDFS Storage Allocation per User

Hi,

I'd like to understand how space allocation works in HDFS. Here's the scenario.

Assume I have 3 clusters with 1TB of space, using the default replication factor of 3.

Assume 3 clients will share the space in the clusters. How does space allocation work among them? I know it depends on how much storage each client needs, but with the current setup, what steps, procedure, or calculation should be applied?

Thanks!

Re: HDFS Storage Allocation per User

@Bruce Perez, HDFS does not allocate capacity separately per user. However, you can use HDFS Quotas to limit metadata consumption (a name quota on the number of files and directories) and space consumption (a space quota in bytes) for specific directories. A common setup is to create a sub-directory dedicated to each user, apply HDFS Permissions so that only that user can write to it, and then set an appropriate quota on each directory. The permissions confine each user's writes to their own directory, and the quotas cap the metadata and space each user can consume. In a multi-tenant cluster, the overall effect is that no single user can consume all the space and harm the other users' processes.
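
To make the calculation concrete, here is a minimal Python sketch of one way to plan this. It rests on several assumptions that are not in the question: that 1 TB is the total raw capacity, that it is split evenly, that the users are called alice, bob and carol (hypothetical names), that their directories live under /user, and that a name quota of 100000 is acceptable. It only builds the `hdfs` command lines and does not run them. Keep in mind that an HDFS space quota counts every replica, so with a replication factor of 3 a 1 GB file consumes 3 GB of quota.

```python
# Sketch only: user names, base path, and quota sizes are assumptions.
TB = 1024 ** 4


def per_user_space_quota(raw_capacity_bytes, num_users):
    """Split the raw capacity evenly; the result is a raw (replicated) byte budget."""
    return raw_capacity_bytes // num_users


def logical_bytes(space_quota_bytes, replication=3):
    """Data a user can actually store, since every replica counts against the quota."""
    return space_quota_bytes // replication


def quota_commands(user, space_quota_bytes, name_quota, base="/user"):
    """Build the HDFS commands an administrator would run for one user."""
    path = f"{base}/{user}"
    return [
        f"hdfs dfs -mkdir -p {path}",
        f"hdfs dfs -chown {user} {path}",
        f"hdfs dfs -chmod 700 {path}",  # only the owner may read or write
        f"hdfs dfsadmin -setQuota {name_quota} {path}",  # files + directories
        f"hdfs dfsadmin -setSpaceQuota {space_quota_bytes} {path}",  # bytes, replicas included
        f"hdfs dfs -count -q {path}",  # report quota and remaining quota
    ]


if __name__ == "__main__":
    quota = per_user_space_quota(1 * TB, num_users=3)
    print("raw quota per user (bytes):", quota)
    print("storable data per user (bytes):", logical_bytes(quota))
    for user in ("alice", "bob", "carol"):
        for cmd in quota_commands(user, quota, name_quota=100000):
            print(cmd)
```

Under these assumptions each user gets roughly a third of the raw capacity as a space quota, and roughly a ninth of it as data they can actually store. In practice you would probably leave some headroom rather than hand out the full capacity.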

Re: HDFS Storage Allocation per User

Hi @Bruce Perez, HDFS Quotas with per-user directories are the right way to go, as @Chris Nauroth suggested.

If your users are untrusted you should also look into enabling Kerberos security. It is trivially easy to impersonate other users without Kerberos.
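
The reason is that with the default `simple` authentication, Hadoop trusts whatever user name the client reports (for example via the `HADOOP_USER_NAME` environment variable), so per-user permissions and quotas can be bypassed. As a rough check, the Python sketch below reads the `hadoop.security.authentication` property from the contents of a `core-site.xml`. The function name and the sample XML are made up for illustration; where the file lives depends on your installation.

```python
import xml.etree.ElementTree as ET


def hadoop_auth_mode(core_site_xml: str) -> str:
    """Return the configured authentication mode, or 'simple' if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "hadoop.security.authentication":
            return (prop.findtext("value") or "simple").strip().lower()
    return "simple"  # Hadoop's default when the property is absent


# Illustrative configuration text, not taken from a real cluster.
SAMPLE = """
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    mode = hadoop_auth_mode(SAMPLE)
    if mode != "kerberos":
        print("Warning: user identity is not authenticated; quotas can be sidestepped.")
    else:
        print("Kerberos authentication is configured.")
```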

Re: HDFS Storage Allocation per User

@ArpitAgarwal What if I'm using AD/LDAP for users? Would that be fine too, and is there any major difference in security?

Re: HDFS Storage Allocation per User

Hi,

You can enable Kerberos authentication even if your users are defined in AD/LDAP; that does not change how Kerberos security works.

Re: HDFS Storage Allocation per User

AD/LDAP integration is NOT a substitute for internal enforcement of authentication.

Basically, without Kerberos you cannot prevent users from impersonating any other Hadoop user and doing 'anything they want'.


- Dennis Jaheruddin

If this answer helped, please mark it as 'solved', and if it is valuable for future readers, please apply 'kudos'. Also check out my technical portfolio at https://portfolio.jaheruddin.nl