Member since: 11-12-2018
Posts: 11
Kudos Received: 0
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2634 | 12-19-2018 04:08 AM |
| | 5362 | 12-19-2018 03:56 AM |
| | 3399 | 12-15-2018 11:57 PM |
12-19-2018
04:08 AM
I needed to restart the Cloudera SCM server and agent by running the following commands on the host where Cloudera Manager is installed:
systemctl restart cloudera-scm-server
systemctl restart cloudera-scm-agent
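If you want the two restarts to run in order and stop at the first failure, a small wrapper like the one below is one way to do it. This is only a sketch: the function name restart_cloudera_manager and the SYSTEMCTL override variable are made up for illustration (the override just lets you dry-run the sequence with echo), and it assumes both services live on the same host.

```shell
#!/usr/bin/env bash
# Restart the Cloudera Manager server first, then the agent.
# SYSTEMCTL is a hypothetical override (not a Cloudera or systemd setting);
# set SYSTEMCTL=echo to print the commands instead of running them.
restart_cloudera_manager() {
    local systemctl_cmd="${SYSTEMCTL:-systemctl}"
    local svc
    for svc in cloudera-scm-server cloudera-scm-agent; do
        "$systemctl_cmd" restart "$svc" || {
            echo "restart of $svc failed" >&2
            return 1
        }
    done
}

# Dry run:  SYSTEMCTL=echo restart_cloudera_manager
# Real run: restart_cloudera_manager   (needs root or sudo)
```

Afterwards, systemctl status cloudera-scm-server is a quick way to confirm the server came back up.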
12-15-2018
11:57 PM
So the problem was with snapshots. I had configured snapshots a long time ago on the /user/hive/warehouse directory, and they were still being generated.
I was checking the space used with these commands:
hadoop fs -du -h /user/hive
hadoop fs -du -h /user/hive/warehouse
Snapshottable directories can be found using the command:
hdfs lsSnapshottableDir
A snapshot can then be deleted with:
hadoop fs -deleteSnapshot <path without .snapshot> <snapshotname>
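When there are many snapshots, it can help to generate the delete commands rather than typing each one. The sketch below reads the output of a listing of the .snapshot directory and prints one deleteSnapshot command per snapshot, without running anything. The helper name snapshot_delete_cmds is hypothetical, and it assumes the listing ends each entry with the full snapshot path and that paths contain no spaces.

```shell
#!/usr/bin/env bash
# Read `hdfs dfs -ls <dir>/.snapshot` output on stdin and print (not run)
# one deleteSnapshot command per snapshot entry.
# Assumption: the last field of each entry is <dir>/.snapshot/<name>.
snapshot_delete_cmds() {
    awk '$NF ~ "/\\.snapshot/[^/]+$" {
        name = $NF; sub(".*/\\.snapshot/", "", name)       # snapshot name
        dir  = $NF; sub("/\\.snapshot/[^/]+$", "", dir)    # path without .snapshot
        printf "hdfs dfs -deleteSnapshot %s %s\n", dir, name
    }'
}

# Usage (review the printed commands before running any of them):
#   hdfs dfs -ls /user/hive/warehouse/.snapshot | snapshot_delete_cmds
```

Printing instead of executing is deliberate, since deleting a snapshot cannot be undone.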
11-14-2018
05:39 PM
2 Kudos
@orak, OpenLDAP is just fine for Hadoop LDAP purposes. Active Directory is part of many existing IT infrastructures, so it is often used because it combines LDAP and Kerberos (along with other things).

Users in your Kerberos KDC and LDAP server do not necessarily need to originate in the same object. Any real link between the two, where the KDC principal lives in the same end-user object that is used for authentication, would come from some sort of integration at the KDC / LDAP server level. This is not necessary for Hadoop services to work.

In general, there are 3 needs if you are going to secure your cluster with Kerberos:
- Kerberos
- A means of mapping users to groups (usually OS shell-based, but can be LDAP-based)
- OS users for the services to run as, and end-user OS accounts for YARN containers (running MR jobs)

If I kinit as bgooley@EXAMPLE.COM and then attempt to list a directory that is readable by its user and group and owned by someone else, the NameNode must be able to determine whether I am a member of the group that has permission to list files. The principal is reduced to a "short name" by trimming off the realm, which gives bgooley. The group membership of user bgooley is then determined (shell group mapping or LDAP group mapping). See the following for details: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/GroupsMapping.html

This mapping is used by several services, so it is part of core Hadoop.

Then, you have the OS users, which must exist at the OS level so that the various processes can start as those users and own their files. YARN containers will also store information in the OS file system as the user running the job. This means that users who run jobs need to exist on all nodes in the cluster.
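To make the short-name step concrete, here is a rough shell sketch. It is a simplified stand-in, not what Hadoop actually runs: the function name short_name is made up for illustration, it only strips the realm and any /host part, and a real cluster applies whatever rules are configured in hadoop.security.auth_to_local.

```shell
#!/usr/bin/env bash
# Simplified illustration of trimming a Kerberos principal to a short name.
# Not Hadoop's implementation; real mapping follows hadoop.security.auth_to_local.
short_name() {
    local principal="$1"
    local name="${principal%%@*}"    # drop the @REALM suffix
    printf '%s\n' "${name%%/*}"      # drop the /host part of service principals
}

# short_name bgooley@EXAMPLE.COM                 -> bgooley
# short_name hdfs/nn1.example.com@EXAMPLE.COM    -> hdfs
#
# On a cluster node, `hdfs groups bgooley` reports the groups Hadoop resolves
# for that short name, which can be compared with `id -Gn bgooley` at the OS level.
```

If the two group lists disagree, that usually points at the group mapping configuration rather than at Kerberos itself.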

Some of these topics are covered in a bit more detail here: https://www.cloudera.com/documentation/enterprise/latest/topics/sg_auth_overview.html

That's a lot to process, so I'll stop there and wait to see if you have any questions.