Created 12-16-2019 04:43 AM
Dear all,
I'm having an issue on a kerberized HDP 3.1 cluster (with Ranger on an Active Directory): Timeline Service V2 has never worked, whether embedded or external.
We are currently trying to configure an external HBase for Timeline Service 2.0 as described in the HDP manual (https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/data-operating-system/content/configure_hbase_...), but the following step fails: creating the required HBase tables using the following command
export HBASE_CLASSPATH_PREFIX=/usr/hdp/current/hadoop-yarn-client/timelineservice/*; /usr/hdp/current/hbase-client/bin/hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s
Before launching the command, we ran kinit with the hbase keytab (we also tried with the yarn-ats.hbase-client one), but we got the following errors:
[root@MON_SERVEUR_3 hbase]# kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-datalake_prod
[root@MON_SERVEUR_3 hbase]# /usr/hdp/current/hbase-client/bin/hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -create -s
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/phoenix/phoenix-5.0.0.3.1.0.0-78-server.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-12-16 16:10:50,372 INFO [main] storage.TimelineSchemaCreator: Starting the schema creation
2019-12-16 16:10:50,686 INFO [main] common.HBaseTimelineStorageUtils: Using hbase configuration at file:///usr/hdp/3.1.0.0-78/hadoop/conf/embedded-yarn-ats-hbase/hbase-site.xml
2019-12-16 16:10:50,811 INFO [main] storage.TimelineSchemaCreator: Will skip existing tables and continue on htable creation exceptions!
2019-12-16 16:10:51,127 INFO [main] zookeeper.ReadOnlyZKClient: Connect 0x13b6aecc to MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181 with session timeout=90000ms, retries 6, retry interval 1000ms, keepAlive=60000ms
2019-12-16 16:10:51,143 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-78--1, built on 12/06/2018 12:30 GMT
2019-12-16 16:10:51,143 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:host.name=MON_SERVEUR_3.MY_DOMAIN
2019-12-16 16:10:51,143 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_191
2019-12-16 16:10:51,143 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2019-12-16 16:10:51,143 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre
2019-12-16 16:10:51,144 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.class.path=...
2019-12-16 16:10:51,152 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.library.path=:/usr/hdp/3.1.0.0-78/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.1.0.0-78/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.1.0.0-78/hadoop/lib/native
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:os.name=Linux
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.el7.x86_64
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:user.name=root
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:user.home=/root
2019-12-16 16:10:51,153 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Client environment:user.dir=/mnt/hadoop/log/hbase
2019-12-16 16:10:51,156 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc] zookeeper.ZooKeeper: Initiating client connection, connectString=MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/688367151@5dddec88
2019-12-16 16:10:51,185 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc-SendThread(MON_SERVEUR_01.MY_DOMAIN:2181)] zookeeper.Login: successfully logged in.
2019-12-16 16:10:51,186 INFO [Thread-4] zookeeper.Login: TGT refresh thread started.
2019-12-16 16:10:51,190 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc-SendThread(MON_SERVEUR_01.MY_DOMAIN:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2019-12-16 16:10:51,202 INFO [Thread-4] zookeeper.Login: TGT valid starting at: Mon Dec 16 16:10:35 RET 2019
2019-12-16 16:10:51,202 INFO [Thread-4] zookeeper.Login: TGT expires: Tue Dec 17 02:10:35 RET 2019
2019-12-16 16:10:51,202 INFO [Thread-4] zookeeper.Login: TGT refresh sleeping until: Tue Dec 17 00:16:20 RET 2019
2019-12-16 16:10:51,222 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc-SendThread(MON_SERVEUR_01.MY_DOMAIN:2181)] zookeeper.ClientCnxn: Opening socket connection to server MON_SERVEUR_01.MY_DOMAIN/192.168.82.78:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2019-12-16 16:10:51,228 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc-SendThread(MON_SERVEUR_01.MY_DOMAIN:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.82.83:58714, server: MON_SERVEUR_01.MY_DOMAIN/192.168.82.78:2181
2019-12-16 16:10:51,241 INFO [ReadOnlyZKClient-MON_SERVEUR_02.MY_DOMAIN:2181,MON_SERVEUR_01.MY_DOMAIN:2181,MON_SERVEUR_3.MY_DOMAIN:2181@0x13b6aecc-SendThread(MON_SERVEUR_01.MY_DOMAIN:2181)] zookeeper.ClientCnxn: Session establishment complete on server MON_SERVEUR_01.MY_DOMAIN/192.168.82.78:2181, sessionid = 0x16f0d3e151100d1, negotiated timeout = 60000
2019-12-16 16:10:56,131 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=8, started=4611 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress., details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
2019-12-16 16:10:57,770 WARN [Relogin-pool4-t1] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1576498252658
2019-12-16 16:11:00,172 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=7, retries=8, started=8653 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
2019-12-16 16:11:02,657 WARN [Relogin-pool4-t1] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1576498252658
2019-12-16 16:11:04,415 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=8, started=4141 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
2019-12-16 16:11:08,018 WARN [Relogin-pool4-t1] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1576498252658
2019-12-16 16:11:08,446 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=7, retries=8, started=8172 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
2019-12-16 16:11:11,950 WARN [Relogin-pool4-t1] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1576498252658
2019-12-16 16:11:12,790 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=8, started=4142 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
2019-12-16 16:11:16,744 WARN [Relogin-pool4-t1] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1576498252658
2019-12-16 16:11:16,805 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=7, retries=8, started=8157 ms ago, cancelled=false, msg=Call to MY_DATANODE_07.MY_DOMAIN/192.168.82.71:16020 failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=MY_DATANODE_07.MY_DOMAIN,16020,1576496376743, seqNum=-1
We tried creating some ordinary test tables and that worked.
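For illustration, a quick test along these lines succeeded from the hbase shell (the table and column family names here are placeholders, not the exact ones we used):
hbase(main):001:0> create 'test_perms', 'cf'
hbase(main):002:0> put 'test_perms', 'r1', 'cf:c1', 'v1'
hbase(main):003:0> scan 'test_perms'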
We also tried making HBase public via Ranger, but the script still fails with the same error.
Any idea why it doesn't work?
BR,
Created 01-09-2020 09:03 AM
I was having the same issue in a similar environment.
I did the following steps to fix it:
- Rename /usr/hdp/current/hadoop-client/conf/embedded-yarn-ats-hbase/hbase-site.xml to hbase-site.xmlBkp
- Copy /usr/hdp/current/hbase-client/conf/hbase-site.xml to /usr/hdp/current/hadoop-client/conf/embedded-yarn-ats-hbase (a sketch of these two steps as shell commands follows this list)
- Execute the following command:
export HBASE_CLASSPATH_PREFIX=/usr/hdp/current/hadoop-yarn-client/timelineservice/*; /usr/hdp/current/hbase-client/bin/hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s
- Using hbase shell, run the following commands:
grant 'yarn', 'RWXCA'
grant 'yarn-ats', 'RWXCA'
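For reference, the rename and copy steps above would look something like this as shell commands (assuming the default HDP paths, run as a user with write access to the conf directory):
mv /usr/hdp/current/hadoop-client/conf/embedded-yarn-ats-hbase/hbase-site.xml /usr/hdp/current/hadoop-client/conf/embedded-yarn-ats-hbase/hbase-site.xmlBkp
cp /usr/hdp/current/hbase-client/conf/hbase-site.xml /usr/hdp/current/hadoop-client/conf/embedded-yarn-ats-hbase/
The point of the swap is that the schema creator reads the embedded-yarn-ats-hbase config (as the "Using hbase configuration at ..." log line in the question shows), so that file must carry the connection details of the external HBase.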
Hope it helps.
Regards
Greg
Created 01-10-2020 12:20 AM
Can you check and use the correct principal from the hbase keytab, then kinit with that one and see if it works? To get the principal, run the command below:
$ klist -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab
The output should look something like this:
Keytab name: FILE:/etc/security/keytabs/yarn-ats.hbase-client.headless.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 02/01/20 23:00:12 hbase/[FQDN]@[REALM]
1 02/01/20 23:00:12 hbase/[FQDN]@[REALM]
1 02/01/20 23:00:12 hbase/[FQDN]@[REALM]
1 02/01/20 23:00:12 hbase/[FQDN]@[REALM]
1 02/01/20 23:00:12 hbase/[FQDN]@[REALM]
Then you can kinit using the format kinit -kt $keytab $principal:
$ kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab hbase/[FQDN]@[REALM]
The klist command should show a valid ticket
$ klist
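The output should then show something along these lines (principal and realm here are placeholders matching the kinit above; cache path and times will differ):
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hbase/[FQDN]@[REALM]
Valid starting       Expires              Service principal
02/01/20 23:05:00    02/02/20 23:05:00    krbtgt/[REALM]@[REALM]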
Now proceed and report back.
Created 05-17-2022 11:57 PM
@obrobecker did you get this resolved? I ran into the same issue on kerberized HDP 3.1; a non-kerberized environment worked fine with the ZooKeeper parent znode = /hbase-unsecure.
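(As a quick way to see which parent znode the client configuration points to on the secure cluster, something like this should work, assuming the default client config path:
grep -A 1 'zookeeper.znode.parent' /usr/hdp/current/hbase-client/conf/hbase-site.xml)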
@Shelton I did try all of these keytabs and their principals: yarn-ats.hbase-client.headless.keytab, yarn-ats.hbase-master.service.keytab, yarn-ats.hbase-regionserver.service.keytab, hbase.headless.keytab, hbase.service.keytab.
None of them worked; all gave the same error:
failed on local exception: java.io.IOException: org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=myregionhost.
Any suggestions to fix this?
Created 05-18-2022 07:01 AM
@Shelton, I ran into the same issue as @obrobecker and tried your suggestion of kinit with yarn-ats.hbase-client.headless.keytab and hbase.headless.keytab and their principals, but I still hit the same issue and cannot create the HBase tables. It was working fine before kerberization. Is there anything else I can try?
Created 05-25-2022 11:28 PM
@TonyQiu
Sorry, I've been away for a while. Creating a table in a kerberized cluster should be possible as the hbase user. I just tried it out; see the flow below. I didn't need to explicitly kinit, since the hbase user already had a valid Kerberos ticket.
Switch to hbase user
[root@malindi keytabs]# su - hbase
Last login: Wed May 25 21:51:56 CEST 2022
Check if the Kerberos ticket is valid
[hbase@malindi ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1016
Default principal: hbase-jair@KENYA.KE
Valid starting Expires Service principal
05/25/2022 21:51:57 05/26/2022 21:51:57 krbtgt/KENYA.KE@KENYA.KE
Connect to hbase shell as hbase
[hbase@malindi ~]$ hbase shell
List existing tables
hbase(main):006:0> list
TABLE
ATLAS_ENTITY_AUDIT_EVENTS
atlas_titan
2 row(s) in 0.6660 seconds
=> ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_titan"]
Create table emp
hbase(main):007:0> create 'emp', 'personal data', 'professional data'
0 row(s) in 4.9500 seconds
=> Hbase::Table - emp
List tables to validate emp was created
hbase(main):009:0> list
TABLE
ATLAS_ENTITY_AUDIT_EVENTS
atlas_titan
emp
3 row(s) in 0.0130 seconds
=> ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_titan", "emp"]
Describe table emp
hbase(main):010:0> describe 'emp'
Table emp is ENABLED
emp
COLUMN FAMILIES DESCRIPTION
{NAME => 'personal data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'professional data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.8770 seconds
Update table
hbase(main):011:0> alter 'emp', {NAME => 'metrics', BLOCKSIZE => '16384', COMPRESSION => 'SNAPPY'}
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 4.9660 seconds
Can you share your snippet?