
Spark on Yarn fails with LDAP




I have configured a CDH cluster with LDAP integration and CompositeGroupMapping (ShellBasedUnixGroupsMapping and LdapGroupsMapping) on HDFS. HDFS, Hive and Impala all work well with both local user principals and AD users.


The problem I have now is with Spark (on YARN), where jobs submitted by local users work, but those submitted by AD users fail:


main : run as user is ldap1
main : requested yarn user is ldap1
User ldap1 not found


If I create user ldap1 on all hosts, then Spark works. What am I missing here?


Thank you


Re: Spark on Yarn fails with LDAP

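YARN in secure mode requires locally available user accounts to fully isolate the task containers. The container executor launches each container as the submitting user at the OS level, so that user must resolve on every NodeManager host. Hadoop's group mapping (LdapGroupsMapping included) only handles group lookups inside Hadoop; it does not make the account known to Linux, which is why you see "User ldap1 not found".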

You'll need to make these accounts visible to your Linux hosts via SSSD or similar software.
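
If it helps, here is a minimal sketch for checking whether a given account is visible to the OS on a host, which is roughly the lookup that fails in your log. It uses Python's standard pwd module, which goes through the system's name service (so it should pick up /etc/passwd as well as SSSD once that is configured). The helper name user_resolves is my own, and "ldap1" is just the user from your log; you would need to run this on each NodeManager host.

```python
import pwd
import sys


def user_resolves(username: str) -> bool:
    """Return True if the OS can resolve this account name (local files, SSSD, etc.)."""
    try:
        pwd.getpwnam(username)
        return True
    except KeyError:
        # getpwnam raises KeyError when no passwd entry exists for the name
        return False


if __name__ == "__main__":
    # Defaults to "ldap1" (the user from the log); pass other names as arguments.
    for name in sys.argv[1:] or ["ldap1"]:
        status = "found" if user_resolves(name) else "NOT found"
        print(f"{name}: {status}")
```

If this reports "NOT found" for an AD user on any NodeManager host, containers for that user will most likely fail there until the account is made resolvable.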