Member since: 04-22-2014
Posts: 1218
Kudos Received: 341
Solutions: 157
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 26301 | 03-03-2020 08:12 AM |
| | 16441 | 02-28-2020 10:43 AM |
| | 4739 | 12-16-2019 12:59 PM |
| | 4490 | 11-12-2019 03:28 PM |
| | 6692 | 11-01-2019 09:01 AM |
10-08-2018
09:46 AM
@Paulina, The next step, then, is to find out what LDAP commands are being issued on the LDAP server. Since you are using a non-AD server, your LDAP access logs should give us that information. I would tail the LDAP server's access logs while running the "hdfs groups hdfs" command, for instance. Once we see which LDAP queries are run, we should have a better idea of why we are seeing these results. Alternatively, you could use tcpdump to capture packets and view them in Wireshark; Wireshark decodes LDAP packets, so we can see the complete conversation between client and server.
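A minimal sketch of that check, assuming an OpenLDAP server that logs to /var/log/slapd.log (the log path, capture interface, and port 389 are assumptions to adjust for your environment):

```bash
# On the LDAP server: watch the access log while the lookup runs
tail -f /var/log/slapd.log

# On a cluster host, in another terminal: trigger the group lookup
hdfs groups hdfs

# Alternatively, capture the LDAP traffic for later analysis in Wireshark
tcpdump -i any -w /tmp/ldap.pcap port 389
```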
10-08-2018
08:56 AM
@JoaoBarreto, It is possible to do so, but it is not something we document or test explicitly. The closest approach would be to follow only the Cloudera Manager steps here: https://www.cloudera.com/documentation/enterprise/5-15-x/topics/install_singleuser_reqts.html#concept_ivr_lng_2v Since the agent runs as "root" by default, you shouldn't need to do any of the agent-specific steps there. Also, the Management Service "System User" and "System Group" are "cloudera-scm" by default. You may need to do a little trial and error to get everything working, but I don't think it would be too bad.
10-05-2018
11:12 AM
@Paulina, It appears that your user search filter may need adjustment. You have (objectclass=posixAccount) but are missing the part of the filter that accepts the uid. You might try the following in Hadoop User Group Mapping LDAP User Search Filter: (&(objectClass=posixAccount)(uid={0})) I don't have time at the moment to go through the code more and figure out why you are seeing this exact behavior, but I think the above change is a good start.
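If you want to confirm the filter against your directory before changing the Hadoop configuration, a quick check with ldapsearch can help (the host, bind DN, and search base below are placeholders for your environment):

```bash
# The combined filter should return exactly one entry for a given uid
ldapsearch -x -H ldap://ldap.example.com \
  -D "cn=admin,dc=example,dc=com" -W \
  -b "ou=people,dc=example,dc=com" \
  "(&(objectClass=posixAccount)(uid=hdfs))" uid cn
```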
10-05-2018
09:41 AM
@sid2707, Since the behavior you describe matches a known issue with a particular version of Kerberos, I think that is a good place to look first: https://bugzilla.redhat.com/show_bug.cgi?id=1560951 Check your krb5 packages and make sure that you do not have 1.15.1-18.el7. If you do, that version is known to cause problems for Java Kerberos. Upgrading the MIT Kerberos packages to 1.15.1-19 has been known to resolve the issue.
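A quick way to check (the package names below assume a RHEL/CentOS 7 host; adjust for your distribution):

```bash
# Show the installed MIT Kerberos client packages and their versions
rpm -q krb5-libs krb5-workstation

# If they report 1.15.1-18.el7, update to a fixed build
yum update krb5-libs krb5-workstation
```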
10-04-2018
02:25 PM
@desind, That indicates there is a client communicating via TLS with Cloudera Manager, but that client does not trust the signer of the Cloudera Manager certificate. The fact that the thread is scm-web-22 indicates that this is a connection to Cloudera Manager on port 7183. The trouble is that there is not a good way of identifying what IP the failed client requests are coming from. I'd start by considering what talks to Cloudera Manager on port 7183. The first things that come to mind are the Management Service roles (Service Monitor, Host Monitor, Navigator, etc.). If you enable TLS for Cloudera Manager's web UI, you need to make sure you have added a valid truststore under Cloudera Management Service --> Configuration in "TLS/SSL Client Truststore File Location" and "Cloudera Manager Server TLS/SSL Certificate Trust Store Password". After that you will need to restart the Management Service. If you already have trust configured, find out whether you have any clients making API calls to CM.
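Two checks that are often useful here (the host name and truststore path below are placeholders): inspect the certificate chain Cloudera Manager presents on 7183 and confirm its issuer is present in the truststore your clients use.

```bash
# See the certificate chain presented by Cloudera Manager on the TLS port
openssl s_client -connect cm-host.example.com:7183 -showcerts </dev/null

# Confirm the issuing CA is present in the configured truststore
keytool -list -v -keystore /opt/cloudera/security/jks/truststore.jks
```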
10-04-2018
11:23 AM
2 Kudos
@mwol As of 6.0 we (Cloudera) no longer support or build for Debian: https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_deprecated_items.html#concept_ylw_bc2_rbb Sorry to disappoint. We continue to support Debian on 5.x, though. -Ben
10-04-2018
11:07 AM
@VeljkoC, I moved this post in front of the HDFS community members since it targets a problem with HDFS rather than CM itself. Can you provide the actions you took and evidence of what happened (logs and/or screenshots)? Thanks!
10-03-2018
12:07 PM
@roychan, Nice sleuthing... indeed you hit that buggy version. Here is more info on the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1560951
10-03-2018
11:44 AM
@Paulina, What the result is telling you is that the configuration you have for groups mapping in HDFS is returning that result. Based on your LDAP and passwd output, it appears that your cluster is using a different means to derive group membership than you thought. Let's check the following to confirm what clients and the HDFS NameNodes are using for their groups mapping:

(1) Client: Run this on any host that is part of your cluster:

# grep -B1 -A2 "hadoop.security.group.mapping" /etc/hadoop/conf/core-site.xml

(2) NameNode: On your NameNode host, run the following:

# grep -B1 -A2 hadoop.security.group.mapping /var/run/cloudera-scm-agent/process/`ls -lrt /var/run/cloudera-scm-agent/process/ | awk '{print $9}' |grep "\-NAMENODE$"| tail -1`/core-site.xml

We can use the information there to help understand where HDFS is getting its mapping for your users. The fact that the "hdfs" user maps to something other than "hdfs hadoop" for groups is disconcerting indeed. I should also note that there is one other thing to consider. If someone there has added values in the following property, that will create a static mapping that overrides any other groups mapping: hadoop.user.group.static.mapping.overrides. Check your server/client core-site.xml for that just to be safe. By default, it won't appear in core-site.xml.
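To rule out the static override quickly, here is a hedged check (the value format in the comment is only an illustration of how the property is written, not something you should expect to find):

```bash
# Absent by default; if present, entries look like "user1=group1,group2;user2=;user3=group2"
grep -A2 "hadoop.user.group.static.mapping.overrides" /etc/hadoop/conf/core-site.xml
```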
10-02-2018
09:31 AM
@roychan, Are you saying that if you restart the DataNodes, this issue happens right away? The DataNode, for RPC communication, gets a TGT (Kerberos Ticket-Granting Ticket) via UserGroupInformation.loginUserFromKeytab(). This means that there is no visible cache file you can view to see the expiration time. The exception in the stack trace means that there was a TGT acquired and stored in memory, but when there was an attempt to get a Service Ticket to connect to the Active NameNode, the KDC responded that it could not process the request since the TGT had expired. With "loginUserFromKeytab()", if the RPC connection fails, the code has built-in mechanisms that will attempt to handle the condition and re-login. What happens after this message? Does the DataNode eventually recover from this?
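If you want to see the ticket lifetimes the KDC is actually granting for the DataNode principal, you can log in from the same keytab by hand and inspect the result (the keytab path, process directory, principal, and realm below are placeholders for your environment):

```bash
# Obtain a TGT from the DataNode's keytab, then check its expiration and renew-until times
kinit -kt /var/run/cloudera-scm-agent/process/<latest-DATANODE-dir>/hdfs.keytab hdfs/$(hostname -f)@EXAMPLE.COM
klist
```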