Adding a new user to the cluster

Say users are allowed to access a cluster from its edge node. If a user wants to run jobs on the cluster, does the user need an account on every node of the cluster, or is an account on the edge node enough?

1 ACCEPTED SOLUTION

Hi @ARUNKUMAR RAMASAMY

No. The user does not need an account on every node of the cluster. An account on the edge node is enough.

For a new user, there are two types of directories we need to create before the user accesses the cluster:

1. User home directory [created on the Linux filesystem, i.e. /home/<username>]
2. User HDFS directory [created on the HDFS filesystem, i.e. /user/<username>]

As per neeraj, you only need to create the HDFS home directory [i.e. /user/<username>] from the edge node. You can still run jobs as the new user on the cluster even if you haven't created a Linux home directory for that user.

==============

Below are two scenarios:

a. I added a new user on the edge node with:

# useradd <username>

Before launching a job on the cluster, I need to create the HDFS directory for the user:

# sudo -u hdfs hadoop fs -mkdir /user/<username>
# sudo -u hdfs hadoop fs -chown -R <username>:<grp_name> /user/<username>

b. If the user comes from an LDAP server, you only need to configure the edge node as an LDAP client and create the user's directory in HDFS with the commands below:

# sudo -u hdfs hadoop fs -mkdir /user/<username>
# sudo -u hdfs hadoop fs -chown -R <username>:<grp_name> /user/<username>
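The two HDFS commands are the same in both scenarios, so they can be wrapped in a small helper. This is only a sketch: the function name hdfs_home_cmds and the example user and group ("alice", "hadoop") are made up for illustration, and it assumes "hdfs" is the HDFS superuser account as in the commands above. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) the commands that create a user's HDFS home directory.
# Assumption: "hdfs" is the HDFS superuser, as in the commands in this answer.
hdfs_home_cmds() {
    user="$1"
    group="$2"
    if [ -z "$user" ] || [ -z "$group" ]; then
        echo "usage: hdfs_home_cmds <username> <group>" >&2
        return 1
    fi
    # -p makes mkdir succeed even if /user/<username> already exists.
    echo "sudo -u hdfs hadoop fs -mkdir -p /user/$user"
    echo "sudo -u hdfs hadoop fs -chown -R $user:$group /user/$user"
}

# Dry run with hypothetical names; pipe the output to sh to execute it.
hdfs_home_cmds alice hadoop
```

Once the printed commands look right, `hdfs_home_cmds alice hadoop | sh` would run them on a host that has the hadoop client installed.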

Let me know if this answers your question.

11 REPLIES

Hi @ARUNKUMAR RAMASAMY, @Sagar Shimpi. I verified this by removing the user "vgadade" from a datanode. Please find below the output from a secure Hadoop cluster.

Here is the sample job output:

16/02/17 01:47:24 INFO mapreduce.Job: Job job_1455179809801_0007 running in uber mode : false

16/02/17 01:47:24 INFO mapreduce.Job: map 0% reduce 0%

16/02/17 01:47:26 INFO mapreduce.Job: Task Id : attempt_1455179809801_0007_m_000001_0, Status : FAILED

Application application_1455179809801_0007 initialization failed (exitCode=255) with output: main : command provided 0

main : run as user is vgadade

main : requested yarn user is vgadade

User vgadade not found

16/02/17 01:47:29 INFO mapreduce.Job: Task Id : attempt_1455179809801_0007_m_000001_1, Status : FAILED

Application application_1455179809801_0007 initialization failed (exitCode=255) with output: main : command provided 0

main : run as user is vgadade

main : requested yarn user is vgadade

User vgadade not found

16/02/17 01:47:30 INFO mapreduce.Job: map 50% reduce 0%

16/02/17 01:47:32 INFO mapreduce.Job: Task Id : attempt_1455179809801_0007_m_000001_2, Status : FAILED

Application application_1455179809801_0007 initialization failed (exitCode=255) with output: main : command provided 0

main : run as user is vgadade

main : requested yarn user is vgadade

User vgadade not found

16/02/17 01:47:37 INFO mapreduce.Job: map 100% reduce 0%

16/02/17 01:47:43 INFO mapreduce.Job: map 100% reduce 100%

16/02/17 01:47:44 INFO mapreduce.Job: Job job_1455179809801_0007 completed successfully

Job Counters

Failed map tasks=3

Launched map tasks=5

Launched reduce tasks=1

Other local map tasks=3

Data-local map tasks=1

Rack-local map tasks=1

Job Finished in 28.409 seconds

NodeManager log (.out file) output:

2016-02-17 01:47:24,944 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_e21_1455179809801_0007_01_000002 startLocalizer is : 255

java.io.IOException: Application application_1455179809801_0007 initialization failed (exitCode=255) with output: main : command provided 0

main : run as user is vgadade

main : requested yarn user is vgadade

User vgadade not found

at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:262)

2016-02-17 01:47:24,946 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e21_1455179809801_0007_01_000002 transitioned from LOCALIZING to LOCALIZATION_FAILED

I then added the vgadade user back on the datanode and ran the job again (the container launched successfully on the same datanode that failed before).

Here is the sample job output:

16/02/17 01:55:46 INFO mapreduce.Job: Running job: job_1455179809801_0008

16/02/17 01:55:52 INFO mapreduce.Job: Job job_1455179809801_0008 running in uber mode : false

16/02/17 01:55:52 INFO mapreduce.Job: map 0% reduce 0%

16/02/17 01:55:58 INFO mapreduce.Job: map 100% reduce 0%

16/02/17 01:56:04 INFO mapreduce.Job: map 100% reduce 100%

16/02/17 01:56:04 INFO mapreduce.Job: Job job_1455179809801_0008 completed successfully

16/02/17 01:56:05 INFO mapreduce.Job: Counters: 50

Job Counters

Launched map tasks=2

Launched reduce tasks=1

Data-local map tasks=1

Rack-local map tasks=1

Job Finished in 20.333 seconds
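The "User vgadade not found" failure comes from the LinuxContainerExecutor, which launches containers as the submitting user and therefore needs that account to resolve on the NodeManager host. A minimal sketch for checking this on a node before submitting jobs is shown below. The function name user_exists is made up for illustration. `id -u` resolves local accounts as well as directory-backed ones (such as LDAP) when the host is configured for them.

```shell
#!/bin/sh
# Return 0 if the named account resolves on this host, non-zero otherwise.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Run on each NodeManager host; "vgadade" is the user from the logs above.
for u in root vgadade; do
    if user_exists "$u"; then
        echo "$u: found on $(uname -n)"
    else
        echo "$u: NOT found on $(uname -n)"
    fi
done
```

A node that reports "NOT found" for the job's user would be expected to fail container localization in the same way as the first run above.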

Thanks @Vikas Gadade. It helped.