I have configured a CDH cluster with LDAP integration and CompositeGroupMapping (ShellBasedUnixGroupsMapping and LdapGroupsMapping) on HDFS. HDFS, Hive and Impala all work with both local user principals and AD users.
The problem I have now is with Spark (on YARN), where jobs submitted by local users work, but those submitted by AD users fail:
main : run as user is ldap1
main : requested yarn user is ldap1
User ldap1 not found
If I create user ldap1 on all hosts, then Spark works. What am I missing here?
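For reference, here is a minimal sketch of a check that could be run on each NodeManager host to see whether the operating system itself can resolve the account, independently of Hadoop's group mapping. It assumes the "User ldap1 not found" message comes from an OS-level account lookup on the host, which I have not confirmed. The function name os_user_exists is my own, not part of any Hadoop tool.

```python
import pwd
import sys


def os_user_exists(username: str) -> bool:
    """Return True if the local OS (via NSS: /etc/passwd, SSSD, etc.) resolves the user."""
    try:
        pwd.getpwnam(username)  # raises KeyError when the account cannot be resolved
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    # Default to the failing account from the log above; pass another name as an argument.
    user = sys.argv[1] if len(sys.argv) > 1 else "ldap1"
    status = "found" if os_user_exists(user) else "not found"
    print(f"{user}: {status} by the OS on this host")
```

If this reports "not found" for an AD user on a host where the job fails, the account is visible to Hadoop's group mapping but not to the OS, which would match the behavior I see after creating ldap1 locally.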