Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 1360 | 11-26-2019 11:47 PM |
|  | 1309 | 11-25-2019 11:44 AM |
|  | 9517 | 08-07-2019 12:48 AM |
|  | 2193 | 04-17-2019 03:09 AM |
|  | 3520 | 02-18-2019 12:23 AM |
02-18-2019
12:23 AM
1 Kudo
After much searching I think I have found the solution. None of the Cloudera or Hortonworks blogs states it, I believe because in all of their examples the cluster hosts use custom DNS, so the krb5.conf domain-to-realm mapping lines up with the cluster's REALM out of the box (or a single line of configuration takes care of it). In my case all host names are managed by AWS DNS, so no custom domain names are used. That is why my client tried to look up the namenode in the local KDC: it fell back to the default_realm when requesting the service ticket. But after adding the following into krb5.conf on the DEV node:
[domain_realm]
ip-xx-xx-xx-xx.eu-west-1.compute.internal = PRODREALM
ip-xx-xx-xx-xx.eu-west-1.compute.internal = PRODREALM
i.e.:
<fully_qualified_host_name_of_remote_namenode1> = <REMOTE REALM>
<fully_qualified_host_name_of_remote_namenode2> = <REMOTE REALM>
I was able to ls the remote HDFS. Now to the high availability part: I had to add additional nameservice info into hdfs-site.xml (see the sketch after this list):
dfs.ha.namenodes.hanameservice <- ADD here the remote nameservice
dfs.namenode.rpc-address.* <- Add the remote nameservice FQDNs
dfs.namenode.https-address.* <- Add the remote nameservice FQDNs
dfs.namenode.http-address.* <- Add the remote nameservice FQDNs
dfs.namenode.servicerpc-address.* <- Add the remote nameservice FQDNs
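For illustration, a minimal sketch of these entries on the DEV side. The nameservice name (prodnameservice), the NameNode IDs (nn1, nn2), the host names and the ports are placeholders for my setup; the remote nameservice also has to be listed in dfs.nameservices and needs a failover proxy provider entry, as in the Cloudera distcp guide:
<property>
<name>dfs.nameservices</name>
<value>devnameservice,prodnameservice</value>
</property>
<property>
<name>dfs.ha.namenodes.prodnameservice</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.prodnameservice.nn1</name>
<value>ip-xx-xx-xx-xx.eu-west-1.compute.internal:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.prodnameservice.nn2</name>
<value>ip-xx-xx-xx-xx.eu-west-1.compute.internal:8020</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.prodnameservice</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
(and similarly for the http-address, https-address and servicerpc-address properties of nn1 and nn2).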
Then I was able to use the HA nameservice name to access HDFS. Also, when launching a distcp from DEV and copying data from PROD to DEV, I had to add -Dmapreduce.job.hdfs-servers.token-renewal.exclude=name_of_the_prod_nameservice. And the answer to the last question, regarding HADOOP_CONF: I am not sure here, but I think the hdfs scripts in the Cloudera bin directory override this environment variable, so regardless of what you set in HADOOP_CONF, it will not be applied. So when Cloudera's guide states export HADOOP_CONF_DIR=path_to_working_directory, you have to be sure that the script does not override this setting.
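Putting it together, the distcp invocation from the DEV side looked roughly like this (a sketch; the conf directory, the nameservice names and the paths are placeholders for my setup):
export HADOOP_CONF_DIR=/home/centos/distcpconf
hadoop distcp -Dmapreduce.job.hdfs-servers.token-renewal.exclude=prodnameservice \
    hdfs://prodnameservice/source/path hdfs://devnameservice/target/path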
... View more
02-13-2019
11:29 PM
1 Kudo
Are you sure that Kafka receives two messages, or is it just how the consumer displays the messages on the terminal? Kafka is key-value based, so check the Flume logs to see whether one or two events were actually submitted to the topic.
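One way to double-check from the Kafka side is to dump the topic with the keys printed and count the records yourself. A sketch, assuming a reasonably recent Kafka where the console consumer accepts --bootstrap-server (the script may be called kafka-console-consumer.sh in your Kafka bin directory; broker address and topic name are placeholders):
kafka-console-consumer --bootstrap-server broker-host:9092 --topic your-topic \
    --from-beginning --property print.key=true --property key.separator=" | "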
... View more
02-13-2019
05:06 AM
Tested the cross-realm auth based on the suggestion from HarshJ:
https://community.cloudera.com/t5/Cloudera-Manager-Installation/Test-cross-realm-kerberos/m-p/32422#M5612
kinit user@REMOTEREALM
kvno hdfs/namenode-host@LOCALREALM
Running on CLUSTERDEV:
kinit user@PRODREALM
kvno hdfs/dev_name_node_host@DEVREALM
klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: user@PRODREALM
Valid starting Expires Service principal
02/13/2019 13:59:41 02/14/2019 13:59:41 krbtgt/PRODREALM@PRODREALM
renew until 02/20/2019 13:59:41
02/13/2019 14:00:05 02/14/2019 13:59:41 krbtgt/DEVREALM@PRODREALM
renew until 02/20/2019 13:59:41
02/13/2019 14:00:55 02/14/2019 13:59:41 hdfs/<dev namenode fqdn>@DEVREALM
renew until 02/18/2019 14:00:55
-> OK.
... View more
02-13-2019
03:47 AM
The second question is not clear.
... View more
02-13-2019
01:40 AM
Hi,
I got an error while trying to access a remote cluster secured by Kerberos, and I don't know why the client is trying to look up the hdfs principal in the local KDC.
The setup is as follows (I intentionally omit full domain names and host names to keep it tidy):
- each cluster (CLUSTERDEV and CLUSTERPROD) has its own KDC (DEVREALM and PRODREALM)
- the KDCs trust each other (verified by kvno hdfs/<namenodehost>@REMOTE_REALM from both sides)
- both clusters are running NameNode in HA mode
I have configured the Trusted Kerberos Realms setting in Cloudera Manager for CLUSTERDEV, set to CLUSTERPROD (this triggered the RULE change in auth_to_local in core-site.xml). I have done the same for CLUSTERPROD, setting CLUSTERDEV as trusted.
- each krb5.conf in CLUSTERDEV also has PRODREALM in [realms] (I can kinit with a "remote" account)
- each krb5.conf has [capaths] DEVREALM = { PRODREALM = . }
- and vice versa: in CLUSTERPROD each krb5.conf has DEVREALM added to [realms], plus [capaths] PRODREALM = { DEVREALM = . } (see the sketch after this list)
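For clarity, a sketch of the relevant krb5.conf sections on the CLUSTERDEV side (the KDC host names are placeholders; the PROD side mirrors this with the realms swapped):
[realms]
  DEVREALM = {
    kdc = dev-kdc-host
    admin_server = dev-kdc-host
  }
  PRODREALM = {
    kdc = prod-kdc-host
    admin_server = prod-kdc-host
  }
[capaths]
  DEVREALM = {
    PRODREALM = .
  }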
Now on the DEV cluster I want to access the PROD cluster:
I have prepared a custom hdfs-site.xml, stored in distcpconf (a copy of the actual hadoop-conf directory on a gateway host), where I have added the PROD cluster's namenode info, following:
https://www.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_admin_distcp_data_cluster_migrate.html#concept_yjz_2wn_bbb
Tried to test the new configuration (on a gateway host in DEV cluster):
HADOOP_CONF=/home/centos/distcpconf hdfs dfs -ls hdfs://prodnameservice/tmp
export HADOOP_CONF=/home/centos/distcpconf
hdfs dfs -ls hdfs://prodnameservice/tmp
Neither of the above worked; the client does not know the prodnameservice:
-ls: java.net.UnknownHostException: prodnameservice
First question: why is the client not taking the modified environment variable into account?
I had to put this custom hdfs-site.xml into /etc/hadoop/conf/, and then it suddenly knew what "prodnameservice" is.
The hdfs ls returns (I am logged in as tomas2@PRODREALM on the DEV gateway):
PriviledgedActionException as:user@PRODREALM (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]
and at the same time the DEV KDC reports:
TGS_REQ (2 etypes {18 17}) 10.85.150.42: LOOKING_UP_SERVER: authtime 0, user@PRODREALM for hdfs/prod.namenode.fqn@DEVREALM, Server not found in Kerberos database
while at the same time the PROD KDC reports:
TGS_REQ (2 etypes {18 17}) 10.85.150.42: ISSUE: authtime 1550044165, etypes {rep=18 tkt=18 ses=18}, user@PRODREALM for krbtgt/DEVREALM@PRODREALM
So I don't understand why the client is trying to look for an hdfs/PRODUCTION_NAMENODE principal in the DEV KDC. As you can see, the PROD KDC correctly reports the ticket-granting service for the cross-realm trust using krbtgt/DEVREALM@PRODREALM.
So I went back to the modified hdfs-site.xml and changed the following items from DEV to PROD, so they now point to PROD:
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/_HOST@PRODREALM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/_HOST@PRODREALM</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/_HOST@PRODREALM</value>
</property>
Running the ls again gave the same error.
Then I reverted this change in hdfs-site.xml and changed the krb5.conf default_realm on the DEV gateway where I run the "ls".
After this I was able to "ls" the remote cluster, BUT I want to access the remote cluster without changing the default realm in the DEV gateway's krb5.conf file.
[centos@ip-10-85-150-42 ~]$ hdfs dfs -ls hdfs://prodnameservice/tmp
Found 5 items
...
[centos@ip-10-85-150-42 ~]$ kinit tomas2@DEVREALM
Password for user@DEVREALM:
[centos@ip-10-85-150-42 ~]$ hdfs dfs -ls hdfs://prodnameservice/tmp
19/02/13 09:44:28 INFO util.KerberosName: No auth_to_local rules applied to user@DEVREALM
Found 5 items
...
As the client reports in the second case, no auth_to_local rules are applied. But as I said before, the "Trusted Kerberos Realms" setting generated these rules in core-site.xml:
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[1:$1@$0](.*@\QDEVREALM\E$)s/@\QDEVREALM\E$//
RULE:[2:$1@$0](.*@\QDEVREALM\E$)s/@\QDEVREALM\E$//
DEFAULT
</value>
(and the same rules are in the DEV cluster, just with the opposite REALM).
Why is it not using the RULEs from core-site.xml?
And the most important question: why is the hdfs client trying to find the PROD namenode principal in the DEV KDC? How can I do "ls" from the DEV gateway without changing the default realm?
Thanks for any advice,
T.
... View more
Labels:
- Cloudera Manager
- HDFS
- Kerberos
02-05-2019
08:05 AM
Thanks for the answer. Yes, I have the original configuration file, but as I understand it, I cannot post this configuration to a new Altus Director and tell it "Hey, skip the CM creation, here is the existing CM instance, it is managing the KDC and the principals - talk to this instance instead of creating a new one". I was just curious, because through the Director Web UI I think it was possible to create a "copy" of a cluster under the same "deployment", i.e. under the same Cloudera Manager.
... View more
02-05-2019
01:27 AM
Hi,
is it supported to submit a new cluster template to Cloudera Altus Director and use an existing Cloudera Manager installation?
I had a Director template for Cluster A and CM A, and it was deployed successfully. As the Director server was not needed any more, it was deleted (so operations are managed by CM, and no changes have been made to the cluster config since).
Now I would like to create Cluster B, also with Director, but pointing to the existing Cloudera Manager A (and thus using the same KDC).
Is this supported?
Thanks
... View more
Labels:
- Cloudera Manager
01-22-2019
12:13 AM
1 Kudo
Annoying bug, but a very simple solution: I had made a mistake in KrbHostFQDN. It should be the impalad FQDN.
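In other words, with the parameters from my original post below, KrbHostFQDN has to point at the impalad host (ip-10-85-150-6...) rather than the KDC host (ip-10-85-150-11...). A sketch of the corrected connection URL, with the truststore password redacted:
jdbc:impala://ip-10-85-150-6.eu-west-1.compute.internal:21050;AuthMech=1;KrbRealm=HADOOP.DEV.REALM.LOCALL;KrbHostFQDN=ip-10-85-150-6.eu-west-1.compute.internal;KrbServiceName=impala;SSL=1;SSLTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxx;CAIssuedCertNamesMismatch=1;AllowSelfSignedCerts=1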
... View more
01-18-2019
05:00 AM
Hi,
I have the good old "GSS initiate failed" Kerberos error message when my application (Java) tries to connect to Impala via JDBC. I have tried to eliminate all the usual root causes, but this time I think I missed something, because it still does not connect:
Cannot connect: connection refused: Java::JavaSql::SQLException: [Cloudera][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Cloudera][ImpalaJDBCDriver](500169) Unable to connect to server: GSS initiate failed.
The setup is quite standard: MIT Kerberos, kerberized Impala with SSL. Here are the params for the JDBC connection:
-> KDC host: ip-10-85-150-11.eu-west-1.compute.internal
-> ImpaladD host: ip-10-85-150-6.eu-west-1.compute.internal
JDBC params:
-> Host: ip-10-85-150-6.eu-west-1.compute.internal
-> Port: 21050
-> Additional params in URL: KrbHostFQDN=ip-10-85-150-11.eu-west-1.compute.internal;KrbRealm=HADOOP.DEV.REALM.LOCALL;KrbServiceName=impala;SSL=1;CAIssuedCertNamesMismatch=1;AuthMech=1;LogLevel=6;AllowSelfSignedCerts=1;SSLTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxx;LogPath=/tmp/jdbc.log
The user running the app has a valid ticket:
-> Klist
[myapp@ip-10-85-150-42 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1004
Default principal: myapp/ip-10-85-150-42.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL
Valid starting Expires Service principal
01/18/2019 09:31:21 01/19/2019 09:31:21 krbtgt/HADOOP.DEV.REALM.LOCALL@HADOOP.DEV.REALM.LOCALL
renew until 01/25/2019 09:31:21
I verified that the params are correctly passed to the JDBC driver, as can be seen in the JDBC debug log:
Jan 18 13:33:35.428 TRACE 171 com.cloudera.impala.jdbc.common.CommonCoreUtils.logConnectionFunctionEntrance({
AllowSelfSignedCerts=Variant[type: TYPE_WSTRING, value: 1],
AuthMech=Variant[type: TYPE_WSTRING, value: 1],
CAIssuedCertNamesMismatch=Variant[type: TYPE_WSTRING, value: 1],
ConnSchema=Variant[type: TYPE_WSTRING, value: analytics],
DatabaseType=Variant[type: TYPE_WSTRING, value: Impala],
HiveServerType=Variant[type: TYPE_WSTRING, value: 2],
Host=Variant[type: TYPE_WSTRING, value: ip-10-85-150-6.eu-west-1.compute.internal],
KrbHostFQDN=Variant[type: TYPE_WSTRING, value: ip-10-85-150-11.eu-west-1.compute.internal],
KrbRealm=Variant[type: TYPE_WSTRING, value: HADOOP.DEV.REALM.LOCALL],
KrbServiceName=Variant[type: TYPE_WSTRING, value: impala],
LogLevel=Variant[type: TYPE_WSTRING, value: 6],
LogPath=Variant[type: TYPE_WSTRING, value: /tmp/jdbc.log],
Port=Variant[type: TYPE_WSTRING, value: 21050],
SSL=Variant[type: TYPE_WSTRING, value: 1],
SSLTrustStore=Variant[type: TYPE_WSTRING, value: /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks],
SSLTrustStorePwd=Variant[type: TYPE_WSTRING, value: xxxxxxxxxx],
UseNativeQuery=Variant[type: TYPE_WSTRING, value: 1]},
"Major Version: 2", "Minor Version: 6", "Hot Fix Version: 4", "Build Number: 1005", "java.vendor:Oracle Corporation", "java.version:1.8.0_191", "os.arch:amd64",
"os.name:Linux", "os.version:3.10.0-862.14.4.el7.x86_64", "Runtime.totalMemory:2097152000", "Runtime.maxMemory:2097152000", "Runtime.avaialableProcessors:4",
URLClassLoader.getURLs(): /home/myapp/myapp/myapp.jar): +++++ enter +++++
Jan 18 13:33:35.429 TRACE 171 com.cloudera.impala.dsi.core.impl.DSIConnection.getProperty(170): +++++ enter +++++
Jan 18 13:33:35.431 DEBUG 171 com.cloudera.impala.hivecommon.core.HiveJDBCCommonConnection.establishConnection: socketTimeout = 0, loginTimeout = 0
Jan 18 13:33:35.432 DEBUG 171 com.cloudera.impala.hivecommon.core.HiveJDBCCommonConnection.establishConnection: SocketTimeout is: 0 seconds for test
Jan 18 13:33:35.434 TRACE 171 com.cloudera.impala.jdbc.kerberos.Kerberos.getSubjectViaAccessControlContext(): +++++ enter +++++
Jan 18 13:33:35.436 TRACE 171 com.cloudera.impala.jdbc.kerberos.Kerberos.getSubjectViaJAASConfig(): +++++ enter +++++
Jan 18 13:33:35.437 DEBUG 171 com.cloudera.impala.jdbc.kerberos.Kerberos.getSubjectViaJAASConfig: System.getProperty(java.security.auth.login.config): /home/myapp/myapp/gss-jaas.conf
Jan 18 13:33:35.440 DEBUG 171 com.cloudera.impala.hivecommon.api.HiveServer2ClientFactory.createTransport: Kerberos subject retrieved via JAAS config
Jan 18 13:33:35.572 ERROR 171 com.cloudera.impala.exceptions.ExceptionConverter.toSQLException: [Cloudera][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Cloudera][ImpalaJDBCDriver](500169) Unable to connect to server: GSS initiate failed.
java.sql.SQLException: [Cloudera][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Cloudera][ImpalaJDBCDriver](500169) Unable to connect to server: GSS initiate failed.
The params for the Java app:
[myapp@ip-10-85-150-42 ~]$ cat myapp/gss-jaas.conf
Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
doNotPrompt=true
debug=true;
};
The JAVA args:
-Djava.security.auth.login.config=/home/myapp/myapp/gss-jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Dsun.security.jgss.debug=true -Dsun.security.krb5.debug=true
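For completeness, the application is launched roughly like this (a sketch; assuming the app is started directly from the jar shown in the classloader output above):
java -Djava.security.auth.login.config=/home/myapp/myapp/gss-jaas.conf \
     -Djavax.security.auth.useSubjectCredsOnly=false \
     -Djava.security.krb5.conf=/etc/krb5.conf \
     -Dsun.security.jgss.debug=true -Dsun.security.krb5.debug=true \
     -jar /home/myapp/myapp/myapp.jar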
And here is the issue - the KDC log:
Jan 18 13:33:34 ip-10-85-150-11.eu-west-1.compute.internal krb5kdc[13372](info): TGS_REQ (2 etypes {18 17}) 10.85.150.42: LOOKING_UP_SERVER: authtime 0, myapp/ip-10-85-150-42.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL for impala/ip-10-85-150-11.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL, Server not found in Kerberos database
Jan 18 13:33:34 ip-10-85-150-11.eu-west-1.compute.internal krb5kdc[13372](info): TGS_REQ (2 etypes {18 17}) 10.85.150.42: LOOKING_UP_SERVER: authtime 0, myapp/ip-10-85-150-42.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL for impala/ip-10-85-150-11.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL, Server not found in Kerberos database
Jan 18 13:33:35 ip-10-85-150-11.eu-west-1.compute.internal krb5kdc[13372](info): TGS_REQ (2 etypes {18 17}) 10.85.150.42: LOOKING_UP_SERVER: authtime 0, myapp/ip-10-85-150-42.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL for impala/ip-10-85-150-11.eu-west-1.compute.internal@HADOOP.DEV.REALM.LOCALL, Server not found in Kerberos database
During the TGS_REQ the KDC is looking up the impala/<KDCHOST> principal instead of impala/<IMPALADHOST>.
I double-checked everything and tried various versions of the JDBC driver, but the result is the same.
Any hints would be welcome.
Thanks
... View more
Labels:
- Apache Impala
- Kerberos