Spark job failure after Kerberos is enabled

Expert Contributor

The error from my Spark job is:
++++
Failing this attempt.Diagnostics: Application application_1738011234567_0014 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is xxxx
main : requested yarn user is xxxx
User xxxx not found
++++

I read this post <https://community.cloudera.com/t5/Support-Questions/MapReduce-job-failing-after-kerberos/td-p/160273>. My group mapping configuration is hadoop.security.group.mapping = org.apache.hadoop.security.LdapGroupsMapping. I ran kinit as xxxx before submitting the job, and I added the AD user xxxx to the AD group hadoop, but I still get the same error.

This online doc might be applicable: <https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/security-authorization/topics/cm-security-aut...>
I might need to add the flag -Dcom.cloudera.cmf.service.config.emitLdapBindPasswordInClientConfig=true to the CMF_JAVA_OPTS variable. However, that documentation is for CDP 7.1.8, and no equivalent page exists for 7.1.7, which is the version my cluster runs.

Thank you.

Best regards,

1 ACCEPTED SOLUTION

Master Collaborator

It appears that the user 'xxxx' has not been synchronized from LDAP to the local OS on the relevant host, so the OS cannot resolve it. One possible cause is a misconfiguration on the AD/LDAP side that prevents correct username resolution and makes the synchronization fail. Fixing the AD/LDAP issue should resolve this error.

Also, see the documentation for CDP 7.1.7.
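
To check whether the local OS on a given host can resolve the user, a small lookup script can help. This is only a sketch: it assumes the "User xxxx not found" message reflects an OS-level user lookup on the NodeManager host, and the function name os_user_exists is made up for illustration. Python's pwd module goes through the system's normal name service lookup, so a user supplied by SSSD or by /etc/passwd should both show up.

```python
import pwd


def os_user_exists(username: str) -> bool:
    """Return True if the local OS can resolve `username` (hypothetical helper)."""
    try:
        # Raises KeyError when no passwd entry can be found for the name.
        pwd.getpwnam(username)
        return True
    except KeyError:
        return False
```

Running os_user_exists("xxxx") on each worker host and getting False would point to the same resolution problem described above.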

3 REPLIES

Expert Contributor

@ggangadharan Thanks for the advice. After I created user xxxx on each data node, the Spark job ran successfully.

Regarding user account synchronization from LDAP to the local OS, I had to create the user account on each node manually. Do you mean using SSSD?

Regards,

Master Collaborator

If the environment allows, use SSSD with LDAP integration to avoid creating users manually.
If that's not possible, use Ansible to automate user creation across all nodes.
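
Whichever approach is used, it may be worth confirming afterwards that the user resolves on every node. The following is a rough sketch, not a Cloudera tool: it assumes passwordless SSH from an admin host to each node, and the host names and helper names (build_check_command, hosts_missing_user) are hypothetical.

```python
import shlex
import subprocess
from typing import Callable, Iterable, List


def build_check_command(host: str, user: str) -> List[str]:
    # `id -u USER` exits non-zero when the remote OS cannot resolve the user.
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, "id", "-u", shlex.quote(user)]


def run_quietly(cmd: List[str]) -> int:
    # Run the command, discard its output, and return only the exit code.
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def hosts_missing_user(
    hosts: Iterable[str],
    user: str,
    run: Callable[[List[str]], int] = run_quietly,
) -> List[str]:
    # A non-zero exit code also covers SSH failures, so an unreachable
    # host is reported as missing the user and should be checked by hand.
    return [h for h in hosts if run(build_check_command(h, user)) != 0]
```

For example, hosts_missing_user(["nm1.example.com", "nm2.example.com"], "xxxx") would return the subset of those (made-up) hosts where the lookup failed.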