About bgooley

bgooley · ‎07-19-2019

@mmmunafo, Thanks for the clear description. I did notice, when perusing Hue source, that KRB5CCNAME was set and was curious why. It was the only thing I could think of that would be influencing PAM behavior. Very cool... let me see if I can figure out why we are adding KRB5CCNAME and if we can get rid of it

bgooley · ‎07-18-2019

@Debashish, yup, stderr.log looks good. Please check your Region Server log file on that host (usually it is in /var/log/hbase with the name REGIONSERVER in it). I am guessing you will see an attempt to start up and then a failure of sorts.

bgooley · ‎07-18-2019

@mmmunafo, I upgraded to 6.2 and PAM using the default "login" module doesn't cause the credentials cache to be deleted. Any other ideas of how these two things could be tied in? Perhaps the PAM module is doing something special, but it is the PID of the Hue process that is in the audit...

bgooley · ‎07-18-2019

@mmmunafo, WOW! I see what you mean! How did you track this down to happening when a user logs out of Hue? I have to admit that PAM auth is used less than most other forms of authentication, so it is possible that no one has hit this sort of issue or bug before. If you have the runcpserver.log covering the time when a user is logging out, I'd be interested in seeing it in case it yields any clues. In the meantime, I have a cluster I haven't upgraded to CDH 6.2 yet, so I'll set up PAM auth in 6.1.1 and then upgrade just to see. Might be a couple days, but I'll let you know if I think of anything in the meantime.

bgooley · ‎07-18-2019

@mmmunafo, What "system" are your users logging out of? If it is your OS, it sounds like a kdestroy is being done, but your users' cache should not be tied to Hue's cache, which is intended only for the Hue process on that host. Hue does not execute a "kdestroy" command, so it is very unlikely that Hue is removing the credentials cache. Happy to help, but I think we need some more details of what you are observing to understand the problem. Cheers, Ben

bgooley · ‎07-17-2019

@Debashish, let's make sure that we can confirm the cause of your problem first, though, before pursuing the znode stuff.

bgooley · ‎07-17-2019

@Debashish, The "supervisor_status" file is not used by HBase, so this message should be non-fatal. I'd check the stderr.log file for the region server process to see if there were any other messages after that Permission Denied. My guess is that the region server log file may contain errors regarding znodes. If HBase created any ZooKeeper znodes while Kerberos was enabled, then those znodes will require auth and will need to be recreated. I think the solution I mentioned here should help: https://community.cloudera.com/t5/Cloudera-Manager-Installation/Disabling-Kerberos-on-Cloudera-EXpress-5-5-1-HBase-issue/m-p/42535/highlight/true#M7642

bgooley · ‎07-17-2019

@TCloud, The configuration we want is the one that got us the following:

WrongHost: Peer certificate subjectAltName does not match host, expected srv-c01.mws.mds.xyz, got DNS:cm-r01nn01.mws.mds.xyz

This error means that the Cloudera Manager certificate only contains a SAN or CN subject value of cm-r01nn01.mws.mds.xyz. Since the agent is configured to connect to srv-c01.mws.mds.xyz, it attempts to validate that the certificate is valid for srv-c01.mws.mds.xyz. This situation is addressed here: https://www.cloudera.com/documentation/enterprise/latest/topics/admin_cm_ha_tls.html#cloudera-manager-server-cert-requirements-for-HA

In order to make sure that clients can connect to CM by using both srv-c01.mws.mds.xyz and cm-r01nn01.mws.mds.xyz, we need to create a certificate that contains both in Subject Alternative Name. For a self-signed certificate, you could use:

keytool -keystore testkeystore.jks -storepass password -keypass password -alias cm-r01nn01.mws.mds.xyz -genkeypair -keysize 2048 -keyalg RSA -dname "CN=cm-r01nn01.mws.mds.xyz" -ext san=dns:cm-r01nn01.mws.mds.xyz,dns:srv-c01.mws.mds.xyz

If you do recreate the CM certificate like that, you will also need to replace the previous certificate with this one in any trust store you created, since a new key pair was generated. Although it might require a bit more doing, the above should address the error you get when using TLS pass-through in HAProxy.

Next, we need to make sure that HAProxy routes requests to your primary CM host every time and only routes to the other host in the event of the primary host's failure. I believe this can be achieved by removing "balance roundrobin" but I'm not sure. I feel like it may make sense to use "backup" directives in the server configuration for nn02 but I'm not sure... seems our example doesn't feel it is necessary.
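To make the hostname check concrete, here is a minimal shell sketch of the comparison the agent is effectively doing. The `san_covers` helper is hypothetical (it is not part of Cloudera Manager, keytool, or HAProxy), it only does exact, case-insensitive matching on `DNS:` entries, and it ignores wildcards and the CN fallback. The SAN strings are written in the same `DNS:name` form that the WrongHost message uses.

```shell
#!/bin/sh
# san_covers SAN_LIST HOSTNAME
# Hypothetical helper: returns 0 if HOSTNAME is listed as an exact DNS entry
# in the comma-separated SAN_LIST (e.g. "DNS:a.example, DNS:b.example").
# Wildcard entries and the CN fallback are deliberately not handled.
san_covers() {
    # Drop spaces and lowercase both sides, since DNS names are case-insensitive.
    san_list=$(printf '%s' "$1" | tr -d ' ' | tr '[:upper:]' '[:lower:]')
    host=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    # Wrap the list in commas so only whole entries can match.
    case ",$san_list," in
        *",dns:$host,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# SAN of the certificate that produced the WrongHost error.
old_sans="DNS:cm-r01nn01.mws.mds.xyz"
# SAN of a certificate generated with both names in -ext san=...
new_sans="DNS:cm-r01nn01.mws.mds.xyz, DNS:srv-c01.mws.mds.xyz"

for name in cm-r01nn01.mws.mds.xyz srv-c01.mws.mds.xyz; do
    if san_covers "$old_sans" "$name"; then
        echo "old certificate covers $name"
    else
        echo "old certificate does NOT cover $name"
    fi
    if san_covers "$new_sans" "$name"; then
        echo "new certificate covers $name"
    else
        echo "new certificate does NOT cover $name"
    fi
done
```

To see which names a real keystore actually carries, `keytool -list -v -keystore testkeystore.jks -storepass password` prints the certificate extensions, including the Subject Alternative Name entries.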

bgooley · ‎07-17-2019

@TCloud, Have you tried it and it failed? If so, what was the problem? You configure the agent with a hostname and a port, and it sends its heartbeats to that host and port. If you have TLS enabled, then the same rules apply: If the client (agent) is doing validation, then it must be able to trust the signer of the CM certificate and it must be able to validate that the hostname it connected to is included in the certificate (in Subject Alt Name or CN subject). If you are doing agent authentication to CM, then CM must trust the signer of the certificate presented by the agent. I don't know if TLS termination at the balancer will work unless the balancer can authenticate. I'd recommend against termination with heartbeats.

bgooley · ‎07-17-2019

@Roroka, Since the agent has not been able to heartbeat to Cloudera Manager, it does not know what parcels it needs, so the error you observe regarding "active_parcels.json" is occurring due to an earlier problem. Can you take a closer look at the cloudera-scm-agent.log to see what the first exception is (it probably mentions "heartbeat")? If you can include that and 10 or more lines before and after, that should help give us some context for the problem. If you can, also share with us the output of the following command when run on the host that is not able to heartbeat:

grep -v -e '^[[:space:]]*$' -e '^#' /etc/cloudera-scm-agent/config.ini

Thanks!
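For anyone unsure what that grep filter does: it drops blank (or whitespace-only) lines and lines that start with `#`, so only the effective settings remain. The sketch below runs the same filter against a temporary stand-in file; the file contents and the hostname `cm-host.example.com` are made up for illustration and are not taken from a real agent configuration.

```shell
#!/bin/sh
# Build a small stand-in for /etc/cloudera-scm-agent/config.ini.
# The contents are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[General]
# Hostname of the CM server.
server_host=cm-host.example.com

# Port that the CM server is listening on.
server_port=7182
EOF

# Same filter as in the post: remove empty/whitespace-only lines and
# lines beginning with '#', leaving just the active settings.
filtered=$(grep -v -e '^[[:space:]]*$' -e '^#' "$sample")
printf '%s\n' "$filtered"

rm -f "$sample"
```

Note that the `^#` pattern only matches comments that start in the first column; an indented comment would still be printed.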

Member Since	‎04-22-2014 02:47 PM
Last Visited	‎04-24-2020 01:13 PM
Posts	1,218
Kudos received	339
