Running all services as same user

Hi,

Are there any concerns about running all services (HDFS, Hive, Ambari, etc.) as the same user, in this case 'root'?

Thanks,

1 ACCEPTED SOLUTION

  1. You shouldn't be running services as root, for obvious reasons.
  2. If you are on an insecure cluster, all submitted YARN jobs run as the service-wide user. If that is "root", your entire cluster belongs to the first malicious person who runs a job.
  3. If you are running on a Kerberos cluster, as you should, you need separate accounts for every individual user of the cluster, so you aren't saving any setup effort.
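
To make points 1 and 2 concrete, here is a minimal sketch of a check you could run over a service-to-account mapping. It is only an illustration: the function name `audit_service_users` and the example mappings are hypothetical, and it does not read anything from Ambari or the cluster itself.

```python
from collections import defaultdict


def audit_service_users(service_users):
    """Return warnings for a {service: unix_user} mapping (hypothetical helper)."""
    warnings = []
    by_user = defaultdict(list)

    # Flag every service that runs as root and group services by account.
    for service, user in sorted(service_users.items()):
        by_user[user].append(service)
        if user == "root":
            warnings.append(f"{service} runs as root")

    # Flag every account that is shared by more than one service.
    for user, services in sorted(by_user.items()):
        if len(services) > 1:
            warnings.append(f"{user} is shared by: {', '.join(services)}")

    return warnings


# Assumed example layouts, not taken from a real cluster.
single_user = {"hdfs": "root", "hive": "root", "ambari": "root"}
separate_users = {"hdfs": "hdfs", "hive": "hive", "yarn": "yarn"}

for line in audit_service_users(single_user):
    print(line)
```

With the single-user layout from the question, every service is flagged twice: once for running as root and once for sharing an account. The separate-user layout produces no warnings.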


7 REPLIES

Master Mentor

@awatson@hortonworks.com Was the cluster deployed using Ambari?

@Neeraj Yes, it was.

Master Mentor

@awatson@hortonworks.com Interesting. In the Ambari console, did you change all the users to root under Misc?

I am not aware of any known issues, although I have not seen any such deployment. On the other hand, I think the Windows deployment of Hadoop uses the same user for all services.

I strongly advise against running everything as a single user. There are separate accounts for controlling the infrastructure and for data access. Mashing them together only widens the attack surface and basically throws security out the window.

If the motivation was to 'simplify' deployment and sidestep corporate policies (and processes) for creating new accounts, please reconsider.

Master Mentor

@awatson@hortonworks.com

From a security, management, and troubleshooting perspective, this is a big no.
