Created 03-14-2016 05:20 PM
Hi all,
I'm trying to migrate from the Hive CLI to beeline. This has become troublesome for many reasons, but the showstopper is the following:
beeline> !connect jdbc:hive2://somehost.company.com:10000/default;principal=user@COMPANY.COM
Kerberos principal should have 3 parts: user@COMPANY.COM
Our users are going to have only two-part principals when they log in to the Linux shell. Issuing tickets for every user for every system is not an option and would create a huge deployment nightmare.
Please let me know if anyone has found a solution to this issue.
Thanks,
Dallin
Created 03-14-2016 06:19 PM
Kerberos user principals have 2 parts (otherwise you'd be right... that would be a deployment nightmare!).
Only host-based service principals have 3 parts (the extra part being the host where the service is running). In the beeline connect string you should always use the hive service principal for the HiveServer2 instance to which you are connecting. Another option is to use _HOST instead of the specific hostname, which will be expanded to the correct host.
For example:
kinit myuser@COMPANY.COM
beeline> !connect jdbc:hive2://somehost.company.com:10000/default;principal=hive/_HOST@COMPANY.COM
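To make the two-part versus three-part distinction concrete, here is a minimal Python sketch. The helper names `split_principal` and `hive_jdbc_url` are hypothetical (they are not part of Hive or any Kerberos library), and the defaults for port, database, and realm are simply taken from the example above.

```python
def split_principal(principal: str) -> list[str]:
    """Split 'primary[/instance]@REALM' into its component parts."""
    name, _, realm = principal.rpartition("@")
    if not name or not realm:
        raise ValueError(f"not a full principal: {principal!r}")
    # A user principal yields [primary, realm];
    # a host-based service principal yields [service, host, realm].
    return name.split("/") + [realm]


def hive_jdbc_url(host: str, port: int = 10000, database: str = "default",
                  realm: str = "COMPANY.COM", service: str = "hive") -> str:
    """Build a beeline/JDBC URL that names the HiveServer2 service principal.

    Assumption: the service principal follows the hive/_HOST@REALM pattern
    shown in the example above.
    """
    service_principal = f"{service}/_HOST@{realm}"
    return f"jdbc:hive2://{host}:{port}/{database};principal={service_principal}"


print(split_principal("myuser@COMPANY.COM"))      # user principal: 2 parts
print(split_principal("hive/_HOST@COMPANY.COM"))  # service principal: 3 parts
print(hive_jdbc_url("somehost.company.com"))
```

The point of the sketch is that the user principal only ever goes to kinit, while the URL always carries the service principal.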
Created 03-14-2016 08:16 PM
Any idea why this parameter is required at all? HiveServer2 should know which principal it was started with. Why would you have to tell it again?
Created 03-15-2016 01:54 PM
It is confusing that the principal hive/_HOST@COMPANY.COM is required when HiveServer2 is already using that principal. I will test kinit with this combination and post back.
Created 03-15-2016 02:40 PM
Working. Confusing, but working. Thanks for the help.
$ klist
Ticket cache: FILE:/tmp/krb5cc_1234
Default principal: user1@COMPANY.COM

Valid starting     Expires            Service principal
03/15/16 14:08:08  03/22/16 14:08:08  krbtgt/COMPANY.COM@COMPANY.COM
        renew until 03/22/16 14:08:08
$ beeline -u 'jdbc:hive2://hiveserver.company.com:10000/default;principal=hive/_HOST@COMPANY.COM'
scan complete in 3ms
Connecting to jdbc:hive2://hiveserver.company.com:10000/default;principal=hive/_HOST@COMPANY.COM
Connected to: Apache Hive (version 0.13.1.2.1.15.0-946)
Driver: Hive JDBC (version 0.13.1.2.1.15.0-946)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1.2.1.15.0-946 by Apache Hive
0: jdbc:hive2://hiveserver.company.com:1> use database1;
No rows affected (0.019 seconds)
0: jdbc:hive2://hiveserver.company.com:1> describe table1;
+-------------------------------+------------+----------+
|           col_name            | data_type  | comment  |
+-------------------------------+------------+----------+
| column1                       | string     |          |
....
| column2                       | string     |          |
+-------------------------------+------------+----------+
40 rows selected (0.144 seconds)