User not returning any groups for hdfs groups <ID>

Explorer

Hello,

I have one user ID for which 'hdfs groups <ID>' returns no groups, although 'groups <ID>' returns the correct group mapping. Any thoughts?


Mentor
Where are you executing this in your cluster?

The way 'hdfs groups' works is by sending an RPC request with the username to one of the NameNodes. With the default ShellBasedUnixGroupsMapping plugin, the NameNode that receives the request runs 'id -gn username && id -Gn username' as a forked process on its own host and collects the output.

The key point is that the groups check is not done on the host where you invoke the command, since that would be insecure. It is done on the host of the service that has to authorize the request.

It is therefore critical that all hosts in the cluster report the same groups for any given username. You can typically achieve this with a centralized identity management system and SSSD on Linux (there are other ways too), rather than managing local /etc/passwd and /etc/group files, which get hairy to keep in sync as the cluster grows.

For more background on the basics of auth(z), read http://blog.cloudera.com/blog/2012/03/authorization-and-authentication-in-hadoop/
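
To make the two views concrete, here is a rough sketch that compares what the local OS resolves with what the NameNode reports for one user. It assumes 'hdfs groups' prints its answer as "user : group1 group2 ..."; the function names normalize and compare_groups are made up for illustration, and group names containing spaces are not handled.

```shell
#!/bin/sh
# Sketch only: normalize and compare_groups are hypothetical helper names.
# Assumption: 'hdfs groups USER' prints "USER : group1 group2 ...".

# Turn a whitespace-separated group list into a sorted, de-duplicated,
# single-space-separated list so two lists can be compared as strings.
normalize() {
  printf '%s\n' $1 | sort -u | tr '\n' ' ' | sed 's/ *$//'
}

# Print the local and NameNode views for a user; succeed only if they match.
compare_groups() {
  user=$1
  local_groups=$(normalize "$(id -Gn "$user" 2>/dev/null)")
  # Strip the leading "USER :" prefix from the NameNode's reply.
  nn_groups=$(normalize "$(hdfs groups "$user" 2>/dev/null | sed 's/^[^:]*: *//')")
  echo "local host: ${local_groups:-<none>}"
  echo "NameNode:   ${nn_groups:-<none>}"
  [ "$local_groups" = "$nn_groups" ]
}

# Only run when a username is passed as the first argument.
if [ $# -gt 0 ]; then
  compare_groups "$1" || echo "group lists differ for $1"
fi
```

A mismatch here only says that the invoking host and the NameNode disagree; the NameNode's answer is the one that counts for authorization.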

Explorer

Thanks for your reply, Harsh.

I am running this from a gateway node. I am using SSSD and get the right groups from the "groups <ID>" command, but "hdfs groups" shows no groups. The result is the same from other nodes in the cluster. This happens for only one particular user.

Mentor
More specifically, what does 'groups username' report on all your NameNode hosts?

Per the earlier post, other hosts don't matter for an 'hdfs groups' check; only the output on (all) your NameNode hosts matters.

P.S. This assumes you're using the shell-based plugin in the NameNode configuration.

Explorer

Hi,

I am getting the same output on my NameNode hosts as well.

# groups <user ID>

Returns proper group mapping.

# hdfs groups <user ID>

No groups returned.

This happens only for one specific user account, and we are using ShellBasedUnixGroupsMapping.

Sample log:

++++++++

org.apache.hadoop.security.ShellBasedUnixGroupsMapping: unable to return groups for user I
PartialGroupNameException can't execute the shell command to get the list of group id for user 'ID'
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.resolvePartialGroupNames(ShellBasedUnixGroupsMapping.java:228)

+++++++

 

 

Mentor
Thank you for confirming the verification over NameNode host(s).

The PartialGroupNameException is triggered specifically when 'id -gn username && id -Gn username' returns some output but exits with a non-zero return code. This is usually observed when the id command cannot fully resolve all of the user's groups, which is likely what's happening here.

- Do any of the entries in the output of the groups command come back as pure numeric values instead of actual group names?
- What's the exit code after you execute 'id -gn username' for the affected user? You can run 'echo $?' right after the command to print it.
- Please paste the full stack trace, which should include an IOException after the log message as an underlying 'Caused by'. This would explain why the partial group resolution also fails.
- Is there anything different about this username compared to others? For example, does it start with a special character instead of an alphanumeric one?
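
The first two checks can be scripted roughly as follows. This is a sketch, not anything shipped with Hadoop: numeric_groups and check_id are hypothetical names, and it simply runs the 'id -gn username && id -Gn username' pair mentioned above, prints the exit code, and flags purely numeric entries. A group can legitimately have an all-digit name, so treat a flagged entry as a hint rather than proof.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Hadoop.

# Print the purely numeric entries from a list of group names.
# A numeric entry usually means a GID could not be mapped to a name.
numeric_groups() {
  for g in "$@"; do
    case $g in
      ''|*[!0-9]*) ;;        # empty, or has a non-digit: treat as a name
      *) echo "$g" ;;
    esac
  done
}

# Run the id command pair for a user and report exit code and output.
# stderr is left attached to the terminal so error messages stay visible.
check_id() {
  user=$1
  out=$(id -gn "$user" && id -Gn "$user")
  rc=$?
  echo "exit code: $rc"
  echo "stdout: ${out:-<empty>}"
  unresolved=$(numeric_groups $out)
  if [ -n "$unresolved" ]; then
    echo "numeric (possibly unresolved) groups:" $unresolved
  fi
  return "$rc"
}

# Only run when a username is passed as the first argument.
if [ $# -gt 0 ]; then
  check_id "$1"
fi
```

Run it on each NameNode host, ideally as the same OS account the NameNode runs under, since that is the environment in which the real lookup happens.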

Explorer

- Do any of the outputs in the groups command you run return pure numeric results, instead of actual string names?

No.


- What's the exit code after you execute 'id -gn username' for the affected user? You may run 'echo $?' to grab exit code after the command.

 

$ id -gn user ; echo $?
1
$


- Please paste the full stack trace, which should include a trace of an IOException after the log message as an underlying 'Caused by'. This would explain the reason behind why the partial group resolution further fails.

+++++++

2018-08-07 15:17:35,638 WARN org.apache.sentry.provider.common.HadoopGroupMappingService: [HiveServer2-Handler-Pool: Thread-2934561]: Unable to obtain groups for <user>
java.io.IOException: No groups found for user <user>
    at org.apache.hadoop.security.Groups.noGroupsForUser(Groups.java:197)
    at org.apache.hadoop.security.Groups.getGroups(Groups.java:220)
    at org.apache.sentry.provider.common.HadoopGroupMappingService.getGroups(HadoopGroupMappingService.java:60)
    at org.apache.sentry.provider.common.ResourceAuthorizationProvider.getGroups(ResourceAuthorizationProvider.java:167)
    at org.apache.sentry.provider.common.ResourceAuthorizationProvider.doHasAccess(ResourceAuthorizationProvider.java:97)
    at org.apache.sentry.provider.common.ResourceAuthorizationProvider.hasAccess(ResourceAuthorizationProvider.java:91)
    at org.apache.sentry.binding.hive.authz.HiveAuthzBinding.authorize(HiveAuthzBinding.java:319)
    at org.apache.sentry.binding.hive.HiveAuthzBindingHook.filterShowDatabases(HiveAuthzBindingHook.java:907)
    at org.apache.sentry.binding.metastore.SentryMetaStoreFilterHook.filterDb(SentryMetaStoreFilterHook.java:131)
    at org.apache.sentry.binding.metastore.SentryMetaStoreFilterHook.filterDatabases(SentryMetaStoreFilterHook.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1042)
    at sun.reflect.GeneratedMethodAccessor146.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
    at com.sun.proxy.$Proxy19.getDatabases(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor146.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2034)
    at com.sun.proxy.$Proxy19.getDatabases(Unknown Source)
    at org.apache.hive.service.cli.operation.GetSchemasOperation.runInternal(GetSchemasOperation.java:59)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:337)
    at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:503)
    at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:320)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:546)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1373)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1358)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

+++++++


- Is there any particular difference to this username vs. others? For ex., does it start with a special character instead of alpha-num, etc.?

 

Normal user account. 

Explorer

Any updates? 

Mentor

With the id command failing, this is really a problem at a lower level than CDH, and it needs further troubleshooting at the OS and group configuration layers. CDH components rely on a successful run of id, but the exit code of 1 shows that is not the case, at least for this user.

I'd recommend taking this up with a Linux support team if the command prints nothing useful on stderr that could help trace the problem for this specific account. You could also find out which underlying subsystem is failing by running the command under strace, and/or check the sssd (or other) logs for the failure right after you run it.
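
One way to start narrowing it down before reaching for strace is to list the user's numeric GIDs and look each one up individually. The sketch below assumes the failure is a GID that cannot be mapped to a group name, which is only one possible cause; find_unresolved is a hypothetical helper name, and 'getent group' is used as the lookup.

```shell
#!/bin/sh
# Sketch only: find_unresolved is a hypothetical helper name.
# Assumption: the failure is a GID with no resolvable group name.

# Usage: find_unresolved LOOKUP GID...
# Runs LOOKUP with each GID and prints the GIDs for which it fails.
find_unresolved() {
  lookup=$1
  shift
  for gid in "$@"; do
    $lookup "$gid" >/dev/null 2>&1 || echo "$gid"
  done
}

# Only run when a username is passed as the first argument.
if [ $# -gt 0 ]; then
  user=$1
  gids=$(id -G "$user")          # numeric GIDs only, no names printed
  echo "GIDs for $user: $gids"
  bad=$(find_unresolved "getent group" $gids)
  if [ -n "$bad" ]; then
    echo "GIDs without a resolvable name:" $bad
  else
    echo "every GID resolves to a name; check stderr, strace, or sssd logs next"
  fi
fi
```

If a GID is reported, that group is a good candidate to inspect in the directory service or in /etc/group.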

Explorer

Hello Harsh,

Thank you for your reply. I was able to narrow down the cause: it was this user account's membership in one specific group. Once we removed the user from that group, the issue was resolved.