Explorer
Posts: 12
Registered: ‎03-12-2018

HBase replication between two secure clusters does not work?

I have two secure HBase clusters, and I have followed all the instructions provided at https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_hbase_secure_replication.html

but replication still does not work.

 

I have two Kerberos realms: the source HBase cluster uses QINRC_REALM.COM and the target uses e3base_kfapp.

I configured /etc/krb5.conf on both clusters:

 

[realms]
QINRC_REALM.COM = {
 kdc = xardc2:21732
 admin_server = xardc2:749
 supported_enctypes = arcfour-hmac:normal
 default_domain = qinrc_realm.com
}

e3base_kfapp = {
 kdc = kf-app82:88
 admin_server = KF-APP82:749
 supported_enctypes = arcfour-hmac:normal
 default_domain = e3base_kfapp
}

 

 

[domain_realm]
qinrc_realm.com = QINRC_REALM.COM
.qinrc_realm.com = QINRC_REALM.COM
e3base_kfapp = e3base_kfapp
.e3base_kfapp = e3base_kfapp

 

[capaths]
QINRC_REALM.COM = {
    e3base_kfapp = .
}
e3base_kfapp = {
    QINRC_REALM.COM = .
}

 

and added the cross-realm krbtgt principals on both KDCs:

 

kadmin.local  -q 'addprinc -e "arcfour-hmac:normal" krbtgt/QINRC_REALM.COM@e3base_kfapp'
kadmin.local  -q 'addprinc -e "arcfour-hmac:normal" krbtgt/e3base_kfapp@QINRC_REALM.COM'
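For a two-way trust, each direction needs its own krbtgt entry, and each pair must have identical keys (same password, kvno, and enctypes) in both KDCs. A tiny shell sketch of the naming pattern behind the two addprinc commands above (the helper name is mine, purely illustrative):

```shell
# Given two realm names, print the pair of cross-realm krbtgt principals
# needed for a bidirectional trust. cross_realm_tgt_principals is a
# hypothetical helper for illustration only.
cross_realm_tgt_principals() {
  a="$1"; b="$2"
  echo "krbtgt/$b@$a"   # lets clients of realm $a reach services in realm $b
  echo "krbtgt/$a@$b"   # lets clients of realm $b reach services in realm $a
}

cross_realm_tgt_principals QINRC_REALM.COM e3base_kfapp
# -> krbtgt/e3base_kfapp@QINRC_REALM.COM
# -> krbtgt/QINRC_REALM.COM@e3base_kfapp
```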

 

 

On the QINRC_REALM.COM cluster, in ZooKeeper's java.env, I configured:

-Dzookeeper.security.auth_to_local=RULE:[2:\$1@\$0](.*@\\Qe3base_kfapp\\E$)s/@\\Qe3base_kfapp\\E$//RULE:[1:\$1@\$0](.*@\\Qe3base_kfapp\\E$)s/@\\Qe3base_kfapp\\E$//DEFAULT

 

and on the e3base_kfapp cluster:

-Dzookeeper.security.auth_to_local=RULE:[2:\$1@\$0](.*@\\QQINRC_REALM.COM\\E$)s/@\\QQINRC_REALM.COM\\E$//RULE:[1:\$1@\$0](.*@\\QQINRC_REALM.COM\\E$)s/@\\QQINRC_REALM.COM\\E$//DEFAULT

 

and in core-site.xml I configured:

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@\Qe3base_kfapp\E$)s/@\Qe3base_kfapp\E$//
    RULE:[2:$1@$0](.*@\Qe3base_kfapp\E$)s/@\Qe3base_kfapp\E$//
    DEFAULT
  </value>
</property>

on the QINRC_REALM.COM cluster,

 

 

and

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@\QQINRC_REALM.COM\E$)s/@\QQINRC_REALM.COM\E$//
    RULE:[2:$1@$0](.*@\QQINRC_REALM.COM\E$)s/@\QQINRC_REALM.COM\E$//
    DEFAULT
  </value>
</property>

on the e3base_kfapp cluster.
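As a sanity check on those mappings: RULE:[2:$1@$0] builds the string "<first component>@<realm>", tests it against the parenthesized regex, and the trailing s/…// strips the realm, leaving the short name. A rough shell approximation (illustration only, not Hadoop's actual KerberosName engine):

```shell
# Approximate the effect of the auth_to_local rules above: a principal from
# the peer realm maps to its first component; anything else is left for
# DEFAULT to handle (simplified here to "unchanged").
auth_to_local() {
  principal="$1"; peer_realm="$2"
  name="${principal%@*}"    # drop @realm      -> e3base/kfapp74
  first="${name%%/*}"       # first component  -> e3base
  case "$principal" in
    *"@$peer_realm") echo "$first" ;;       # rule matched, realm stripped
    *)               echo "$principal" ;;   # falls through to DEFAULT
  esac
}

auth_to_local "e3base/kfapp74@e3base_kfapp" "e3base_kfapp"   # -> e3base
```

So on the QINRC_REALM.COM cluster, a principal arriving as e3base/…@e3base_kfapp is mapped to the local short name e3base, and the mirror rules do the same on the other cluster.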

 

 

Now, from the QINRC_REALM.COM cluster, I can kinit user@e3base_kfapp, or I can kinit user@QINRC_REALM.COM and then kvno a principal in e3base_kfapp.

So I think the cross-realm trust itself is OK.

 

 

After restarting the clusters, I ran this command in the HBase shell:

hbase shell > add_peer '1', "kfapp74,kfapp75,kf-app82:11001:/hbase"

 

or

hbase shell > add_peer '1', CLUSTER_KEY => 'kfapp74,kfapp75,kf-app82:11001:/hbase',
              CONFIG => { 'hbase.master.kerberos.principal'       => 'e3base/kfapp74@e3base_kfapp',
                          'hbase.regionserver.kerberos.principal' => 'e3base/kfapp74@e3base_kfapp',
                          'hbase.regionserver.keytab.file'        => '/e3base/qinrc/e3base.keytab',
                          'hbase.master.keytab.file'              => '/e3base/qinrc/e3base.keytab' }
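For reference, the cluster key passed to add_peer has the form hbase.zookeeper.quorum:clientPort:znodeParent. A small shell sketch (the helper name is hypothetical) of how the key above decomposes:

```shell
# Decompose an HBase replication cluster key of the form
#   <zk host list>:<zk client port>:<parent znode>
# cluster_key_parts is a hypothetical helper, for illustration only.
cluster_key_parts() {
  key="$1"
  znode="${key##*:}"        # /hbase
  rest="${key%:*}"          # kfapp74,kfapp75,kf-app82:11001
  port="${rest##*:}"        # 11001
  quorum="${rest%:*}"       # kfapp74,kfapp75,kf-app82
  echo "$quorum $port $znode"
}

cluster_key_parts "kfapp74,kfapp75,kf-app82:11001:/hbase"
# -> kfapp74,kfapp75,kf-app82 11001 /hbase
```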

 

 

The region servers' logs print:

2018-05-13 15:34:33,755 ERROR [main.replicationSource,1-SendThread(kfapp74:11001)] client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment. Zookeeper Client will go to AUTH_FAILED state.

2018-05-13 15:34:42,724 ERROR [main-EventThread.replicationSource,1] zookeeper.ZooKeeperWatcher: connection to cluster: 1-0x163570d2a080041, quorum=kfapp74:11001,kfapp75:11001,kf-app82:11001, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.getUUIDForCluster(ZKClusterId.java:96)
        at org.apache.hadoop.hbase.replication.HBaseReplicationEndpoint.getPeerUUID(HBaseReplicationEndpoint.java:105)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:274)

Why does cross-realm trust not work for HBase replication?

 

Can anyone help me? Thanks a lot.
Posts: 474
Topics: 14
Kudos: 77
Solutions: 41
Registered: ‎09-02-2016

Re: HBase replication between two secure clusters does not work?

@Mobula

 

Run kinit and enter the required password. Then run klist and make sure you have a valid ticket. Run both commands before you start the HBase shell, and try again.

Explorer
Posts: 12
Registered: ‎03-12-2018

Re: HBase replication between two secure clusters does not work?

That does not work; I tried it.

In realm QINRC_REALM.COM, I found that I had added principals with the same service/host names as principals in realm e3base_kfapp, such as:

in realm QINRC_REALM.COM: e3base/kfapp74@QINRC_REALM.COM
in realm e3base_kfapp: e3base/kfapp74@e3base_kfapp

so GSS authentication failed.

 

I deleted these principals in QINRC_REALM.COM and then tried add_peer again.

 

The region server log:

 

2018-05-14 08:54:54,191 INFO  [main-EventThread.replicationSource,1] zookeeper.RecoverableZooKeeper: Process identifier=connection to cluster: 1 connecting to ZooKeeper ensemble=kfapp74:11001,kfapp75:11001,kf-app82:11001
2018-05-14 08:54:54,191 INFO  [main-EventThread.replicationSource,1] zookeeper.ZooKeeper: Initiating client connection, connectString=kfapp74:11001,kfapp75:11001,kf-app82:11001 sessionTimeout=1200000 watcher=connection to cluster: 10x0, quorum=kfapp74:11001,kfapp75:11001,kf-app82:11001, baseZNode=/hbase
2018-05-14 08:54:54,193 INFO  [main.replicationSource,1-SendThread(kfapp74:11001)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2018-05-14 08:54:54,195 INFO  [main.replicationSource,1-SendThread(kfapp74:11001)] zookeeper.ClientCnxn: Opening socket connection to server kfapp74/172.21.3.74:11001. Will attempt to SASL-authenticate using Login Context section 'Client'
2018-05-14 08:54:54,199 INFO  [main.replicationSource,1-SendThread(kfapp74:11001)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.21.3.66:58509, server: kfapp74/172.21.3.74:11001
2018-05-14 08:54:54,206 INFO  [main.replicationSource,1-SendThread(kfapp74:11001)] zookeeper.ClientCnxn: Session establishment complete on server kfapp74/172.21.3.74:11001, sessionid = 0x163570d2a080050, negotiated timeout = 1200000
2018-05-14 08:54:54,213 ERROR [main.replicationSource,1-SendThread(kfapp74:11001)] client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment. Zookeeper Client will go to AUTH_FAILED state.
2018-05-14 08:54:54,213 ERROR [main.replicationSource,1-SendThread(kfapp74:11001)] zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment. Zookeeper Client will go to AUTH_FAILED state.
2018-05-14 08:54:55,194 WARN  [main-EventThread.replicationSource,1] zookeeper.ZKUtil: connection to cluster: 1-0x163570d2a080050, quorum=kfapp74:11001,kfapp75:11001,kf-app82:11001, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.getUUIDForCluster(ZKClusterId.java:96)
        at org.apache.hadoop.hbase.replication.HBaseReplicationEndpoint.getPeerUUID(HBaseReplicationEndpoint.java:105)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:274)
2018-05-14 08:54:55,196 ERROR [main-EventThread.replicationSource,1] zookeeper.ZooKeeperWatcher: connection to cluster: 1-0x163570d2a080050, quorum=kfapp74:11001,kfapp75:11001,kf-app82:11001, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.getUUIDForCluster(ZKClusterId.java:96)
        at org.apache.hadoop.hbase.replication.HBaseReplicationEndpoint.getPeerUUID(HBaseReplicationEndpoint.java:105)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:274)
2018-05-14 08:54:55,196 WARN  [main-EventThread.replicationSource,1] replication.HBaseReplicationEndpoint: Lost the ZooKeeper connection for peer kfapp74,kfapp75,kf-app82:11001:/hbase
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.getUUIDForCluster(ZKClusterId.java:96)
        at org.apache.hadoop.hbase.replication.HBaseReplicationEndpoint.getPeerUUID(HBaseReplicationEndpoint.java:105)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:274)
2018-05-14 08:54:55,196 INFO  [main-EventThread.replicationSource,1] zookeeper.RecoverableZooKeeper: Process identifier=connection to cluster: 1 connecting to ZooKeeper ensemble=kfapp74:11001,kfapp75:11001,kf-app82:11001

 

 

 

It seems that the region server creates a ZooKeeper client thread, and this ZK client attempts to connect to the target HBase cluster's ZooKeeper. It uses its QINRC_REALM.COM credentials to request a ZooKeeper server principal that actually belongs to realm e3base_kfapp, so the KDC cannot find the server.
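That diagnosis is consistent with how Kerberos resolves the realm of a service principal: the library consults [domain_realm], and a bare hostname like kfapp74 with no matching entry falls back to the client's default realm. A rough sketch (a hypothetical helper, not libkrb5 itself) against the [domain_realm] mappings shown earlier:

```shell
# Hypothetical helper mimicking the [domain_realm] lookup for the mappings
# shown earlier in this thread; an unmapped host falls back to the client's
# default realm.
realm_for_host() {
  host="$1"; default_realm="$2"
  case "$host" in
    qinrc_realm.com|*.qinrc_realm.com) echo "QINRC_REALM.COM" ;;
    e3base_kfapp|*.e3base_kfapp)       echo "e3base_kfapp" ;;
    *)                                 echo "$default_realm" ;;  # no mapping
  esac
}

realm_for_host kfapp74 QINRC_REALM.COM   # -> QINRC_REALM.COM, not e3base_kfapp
```

So the region server would ask its own KDC for zookeeper/kfapp74@QINRC_REALM.COM, which does not exist there, and that would explain the UNKNOWN_SERVER response in the log.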

 

I do not know why this happens.

 
