Kerberos: Failure to initialize security context

Expert Contributor

Kerberos keeps getting stuck in a loop. It gets stuck here (debugging is turned on):

[spark_remote@ip-192.168.1.100 ~]$ kinit -f -p -kt  spark_remote.keytab spark_remote/this.server.fqdn@MYREALM.INTERNAL
[spark_remote@ip-192.168.1.100 ~]$ spark-submit --master yarn kerberostest_2.11-1.0.jar /etc/krb5.conf spark_remote/this.server.fqdn@MYREALM.INTERNAL spark_remote.keytab
18/01/30 20:50:43 DEBUG UserGroupInformation: hadoop login
18/01/30 20:50:43 DEBUG UserGroupInformation: hadoop login commit
18/01/30 20:50:43 DEBUG UserGroupInformation: using kerberos user:spark_remote/this.server.fqdn@MYREALM.INTERNAL
18/01/30 20:50:43 DEBUG UserGroupInformation: Using user: "spark_remote/this.server.fqdn@MYREALM.INTERNAL" with name spark_remote/this.server.fqdn@MYREALM.INTERNAL
18/01/30 20:50:43 DEBUG UserGroupInformation: User entry: "spark_remote/this.server.fqdn@MYREALM.INTERNAL"
18/01/30 20:50:43 INFO UserGroupInformation: Login successful for user spark_remote/this.server.fqdn@MYREALM.INTERNAL using keytab file spark_remote.keytab
18/01/30 20:50:44 INFO SparkContext: Running Spark version 2.2.1
18/01/30 20:50:44 INFO SparkContext: Submitted application: TestKerberos
18/01/30 20:50:44 INFO SecurityManager: Changing view acls to: spark_remote
18/01/30 20:50:44 INFO SecurityManager: Changing modify acls to: spark_remote
18/01/30 20:50:44 INFO SecurityManager: Changing view acls groups to:
18/01/30 20:50:44 INFO SecurityManager: Changing modify acls groups to:
18/01/30 20:50:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(spark_remote); groups with view permissions: Set(); users  with modify permissions: Set(spark_remote); groups with modify permissions: Set()
18/01/30 20:50:44 INFO Utils: Successfully started service 'sparkDriver' on port 45523.
18/01/30 20:50:44 INFO SparkEnv: Registering MapOutputTracker
18/01/30 20:50:44 INFO SparkEnv: Registering BlockManagerMaster
18/01/30 20:50:44 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/01/30 20:50:44 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/01/30 20:50:44 INFO DiskBlockManager: Created local directory at /mnt/tmp/blockmgr-6d9a3c56-e5bc-4f55-9a69-505f2bf6540d
18/01/30 20:50:44 INFO MemoryStore: MemoryStore started with capacity 414.4 MB
18/01/30 20:50:44 INFO SparkEnv: Registering OutputCommitCoordinator
18/01/30 20:50:45 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/01/30 20:50:45 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://this.server.fqdn:4040
18/01/30 20:50:45 INFO SparkContext: Added JAR file:/home/spark_remote/kerberostest_2.11-1.0.jar at spark://192.168.1.100:45523/jars/kerberostest_2.11-1.0.jar with timestamp 1517345445384
18/01/30 20:50:45 INFO Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
18/01/30 20:50:46 INFO Client: Attempting to login to the Kerberos using principal: spark_remote/this.server.fqdn@MYREALM.INTERNAL and keytab: spark_remote.keytab
18/01/30 20:50:46 INFO RMProxy: Connecting to ResourceManager at this.server.fqdn/192.168.1.100:8032
18/01/30 20:50:46 DEBUG UserGroupInformation: PrivilegedAction as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
18/01/30 20:50:47 DEBUG UserGroupInformation: PrivilegedAction as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
18/01/30 20:50:47 DEBUG UserGroupInformation: PrivilegedActionException as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): Failure to initialize security context
18/01/30 20:50:47 DEBUG UserGroupInformation: PrivilegedAction as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
18/01/30 20:50:47 DEBUG UserGroupInformation: Found tgt Ticket (hex) = 
0000: 61 82 01 93 30 82 01 8F   A0 03 02 01 05 A1 17 1B  a...0...........
0010: 15 44 41 54 41 50 41 53   53 50 4F 52 54 2E 49 4E  .MYREALM.IN
0020: 54 45 52 4E 41 4C A2 2A   30 28 A0 03 02 01 02 A1  TERNAL.*0(......
0030: 21 30 1F 1B 06 6B 72 62   74 67 74 1B 15 44 41 54  !0...krbtgt.aMYR
0040: 41 50 41 53 53 50 4F 52   54 2E 49 4E 54 45 52 4E  MYREALM>>.INTERNL
0050: 41 4C A3 82 01 41 30 82   01 3D A0 03 02 01 12 A1  AL...A0..=......
0060: 03 02 01 02 A2 82 01 2F   04 82 01 2B A2 05 25 CA  ......./...+..%.
0070: A1 82 EA 93 2B AF 43 86   9E A7 94 20 CA D9 B8 C0  ....+.C.... ....
0080: E0 1E 22 D5 4E 73 69 DB   8A 3A 39 08 71 8F 32 C2  ..".Nsi..:9.q.2.
0090: 68 18 DD F4 A0 B2 21 F7   A5 9A 6B 5B 1A E5 FA 1E  h.....!...k[....
00A0: C5 F6 13 E7 17 36 2F 74   EA 0C 12 76 82 63 09 62  .....6/t...v.c.b
00B0: 15 95 61 BF 1E 35 79 B5   82 CF 90 9A 57 B8 6F F7  ..a..5y.....W.o.
00C0: 7B EE 20 7E 87 F3 A9 10   ED 93 79 F2 D2 AE 6B 39  .. .......y...k9
00D0: D9 CD 9D 9D 51 2E BC 98   C0 4D 8F 2F C5 7F B3 2E  ....Q....M./....
00E0: 36 B6 A3 D9 E4 D5 B7 B6   FA AF 56 4A F0 9B 2D B1  6.........VJ..-.
00F0: 24 70 2A DF E9 88 0C F6   1C 9D 9A 66 42 77 42 95  $p*........fBwB.
0100: B2 0B B3 7C DE 95 93 56   E7 CB A0 67 FB 5E 45 4E  .......V...g.^EN
0110: 18 D8 75 91 94 10 23 42   9F BA 15 D3 23 B1 85 4D  ..u...#B....#..M
0120: 10 AF 1F 48 12 96 D9 06   EA 2C 34 5C DA F7 4C 1A  ...H.....,4\..L.
0130: DC 86 B4 23 57 45 34 BE   90 FE B8 33 84 15 94 70  ...#WE4....3...p
0140: 72 04 8E E7 F0 DD 90 DA   41 F6 30 73 CF 80 79 F8  r.......A.0s..y.
0150: E7 E4 D9 4C C3 AD 6A B3   F3 AD 85 01 B0 4E 65 EF  ...L..j......Ne.
0160: 4D EF 75 1B FA 0C D6 7C   01 CE 97 23 D5 FD 70 C0  M.u........#..p.
0170: 1F 8C B3 C6 1A 54 DD 13   3D 07 46 EC 83 D4 00 C4  .....T..=.F.....
0180: 57 EF 56 30 F7 AF 1B 08   98 C7 D9 85 12 32 00 8D  W.V0.........2..
0190: 21 B1 09 75 41 59 57                               !..uAYW


Client Principal = spark_remote/this.server.fqdn@MYREALM.INTERNAL
Server Principal = krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
0000: A8 C3 93 72 3A 9B C2 4E   4E 99 CA 84 70 F3 EB 36  ...r:..NN...p..6
0010: B5 15 7B BE 22 7F EB 30   E6 DD F4 22 D6 D1 82 38  ...."..0..."...8




Forwardable Ticket true
Forwarded Ticket false
Proxiable Ticket false
Proxy Ticket false
Postdated Ticket false
Renewable Ticket false
Initial Ticket false
Auth Time = Tue Jan 30 20:50:43 UTC 2018
Start Time = Tue Jan 30 20:50:43 UTC 2018
End Time = Tue Jan 30 21:05:43 UTC 2018
Renew Till = null
Client Addresses  Null
18/01/30 20:50:48 DEBUG UserGroupInformation: PrivilegedAction as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
18/01/30 20:50:48 DEBUG UserGroupInformation: PrivilegedActionException as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): Failure to initialize security context
18/01/30 20:50:48 DEBUG UserGroupInformation: PrivilegedAction as:spark_remote/this.server.fqdn@MYREALM.INTERNAL (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
18/01/30 20:50:48 DEBUG UserGroupInformation: Found tgt Ticket (hex) =
0000: 61 82 01 93 30 82 01 8F   A0 03 02 01 05 A1 17 1B  a...0...........
...(and rinse and repeat)

My krb5.conf:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


[libdefaults]
    default_realm = MYREALM.INTERNAL
    dns_lookup_realm = false
    dns_lookup_kdc = false
    rdns = true
    ticket_lifetime = 24h
    forwardable = true
    udp_preference_limit = 1000000
    default_tgs_enctypes = aes256-cts aes128-cts
    default_tkt_enctypes = aes256-cts aes128-cts
    permitted_enctypes = aes256-cts aes128-cts


    udp_preference_limit = 1


[realms]
    MYREALM.INTERNAL = {
        kdc = kdc.server.internal:88
        admin_server = kdc.server.internal:749
        default_domain = mydomain.internal
    }


[domain_realm]
    thismachine.fqdn = MYREALM.INTERNAL
    .us-west-2.compute.internal = MYREALM.INTERNAL
     us-west-2.compute.internal = MYREALM.INTERNAL
     .otherdomain.internal = MYREALM.INTERNAL
     otherdomain.internal = MYREALM.INTERNAL
    .mydomain.internal = MYREALM.INTERNAL
     mydomain.internal = MYREALM.INTERNAL
[logging]
    kdc = FILE:/var/log/kerberos/krb5kdc.log
    admin_server = FILE:/var/log/kerberos/kadmin.log
    default = FILE:/var/log/kerberos/krb5lib.log


You'll notice the error message is not very clear about what it's unhappy about; it just says:

Failure to initialize security context

From the KDC server, /var/log/krb5kdc.log:

Jan 30 15:42:32 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: NEEDED_PREAUTH: spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL, Additional pre-authentication required
Jan 30 15:42:32 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517344952, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Jan 30 15:42:37 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: NEEDED_PREAUTH: spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL, Additional pre-authentication required
Jan 30 15:42:37 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517344957, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Jan 30 15:42:41 kdc.server.internal krb5kdc[9279](info): TGS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517344957, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for yarn/this.server.fqdn@MYREALM.INTERNAL
Jan 30 15:50:38 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: NEEDED_PREAUTH: spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL, Additional pre-authentication required
Jan 30 15:50:38 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517345438, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Jan 30 15:50:43 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: NEEDED_PREAUTH: spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL, Additional pre-authentication required
Jan 30 15:50:43 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517345443, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Jan 30 15:50:47 kdc.server.internal krb5kdc[9279](info): TGS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517345443, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for yarn/this.server.fqdn@MYREALM.INTERNAL
Jan 30 15:52:18 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: NEEDED_PREAUTH: spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL, Additional pre-authentication required
Jan 30 15:52:18 kdc.server.internal krb5kdc[9279](info): AS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517345538, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for krbtgt/MYREALM.INTERNAL@MYREALM.INTERNAL
Jan 30 15:52:21 kdc.server.internal krb5kdc[9279](info): TGS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime 1517345538, etypes {rep=18 tkt=18 ses=18}, spark_remote/this.server.fqdn@MYREALM.INTERNAL for yarn/this.server.fqdn@MYREALM.INTERNAL

Any suggestions of steps to try would be appreciated.

1 ACCEPTED SOLUTION

Expert Contributor

I rebooted all services that had keytabs, and then I was able to connect. The error stopped. Thanks for the responses.


6 REPLIES

Master Mentor

@Matt Andruff

Ensure the entries in your /etc/hosts do not point only to short host names; they should use the FQDN.

Expert Contributor

Thanks, I saw your other post @Geoffrey Shelton Okot, and I did make sure the hosts file was empty.

Super Collaborator

@Matt Andruff your Kerberos log says that your application (spark_remote/this.server.fqdn@MYREALM.INTERNAL) was granted a service ticket for yarn/this.server.fqdn@MYREALM.INTERNAL. Looks like this ticket is not accepted by the resource manager. If your resource manager is otherwise working well with Kerberos, I really think @Geoffrey Shelton Okot is right that it is something with the names.
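To make that reading of the KDC log easier to reproduce, here is a minimal Python sketch that pulls out the TGS_REQ lines marked ISSUE and reports which client was granted a ticket for which service. The regular expression is written against the krb5kdc lines quoted in this thread only, and the names `KDC_LINE` and `issued_service_tickets` are made up for this illustration.

```python
import re

# Matches the krb5kdc line shape quoted in this thread, for example:
#   ... TGS_REQ (2 etypes {18 17}) 192.168.1.100: ISSUE: authtime ..., client for service
# It is an assumption that other krb5kdc versions log the same shape.
KDC_LINE = re.compile(
    r"(?P<req>AS_REQ|TGS_REQ) \(.*?\) (?P<addr>\S+): "
    r"(?P<status>[A-Z_]+): "
    r"(?:.*?, )?(?P<client>\S+) for (?P<service>[^\s,]+)"
)


def issued_service_tickets(lines):
    """Return (client, service) pairs for every TGS_REQ line with status ISSUE."""
    pairs = []
    for line in lines:
        match = KDC_LINE.search(line)
        if match and match.group("req") == "TGS_REQ" and match.group("status") == "ISSUE":
            pairs.append((match.group("client"), match.group("service")))
    return pairs
```

Fed the log excerpt from the question, this should list only the spark_remote to yarn tickets. That supports the point above: the KDC side succeeded, so the rejection happens later, when the ticket is presented to the ResourceManager.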

Can you check your name resolution with the commands below and verify that they all return the FQDN (this.server.fqdn) and the same IP (192.168.1.100)?

nslookup this.server.fqdn
nslookup this
nslookup 192.168.1.100
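The same three checks can be scripted. The sketch below is only an illustration: `check_name_resolution` is a hypothetical helper, and it uses Python's `socket` module, which goes through the system resolver (so /etc/hosts is consulted as well), whereas nslookup queries DNS directly. The two can therefore disagree.

```python
import socket


def check_name_resolution(names, expected_fqdn, expected_ip,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems found; an empty list means the names are consistent.

    `forward` and `reverse` default to the system resolver and can be
    replaced with stand-ins, which keeps the function easy to test offline.
    """
    problems = []
    # Forward lookups: every name should resolve to the expected IP.
    for name in names:
        try:
            ip = forward(name)
        except OSError as exc:
            problems.append(f"forward lookup of {name} failed: {exc}")
            continue
        if ip != expected_ip:
            problems.append(f"{name} resolves to {ip}, expected {expected_ip}")
    # Reverse lookup: the IP should map back to the expected FQDN.
    try:
        reverse_name = reverse(expected_ip)[0]
    except OSError as exc:
        problems.append(f"reverse lookup of {expected_ip} failed: {exc}")
    else:
        if reverse_name.rstrip(".").lower() != expected_fqdn.lower():
            problems.append(
                f"reverse lookup of {expected_ip} returns {reverse_name}, "
                f"expected {expected_fqdn}"
            )
    return problems
```

With the placeholder names from this thread, the call would be `check_name_resolution(["this.server.fqdn", "this"], "this.server.fqdn", "192.168.1.100")`. Any returned line points at a name that does not resolve the way the Kerberos principal expects.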

Expert Contributor

@Harald Berghoff

Here's the output:

[root@ec2-user]# nslookup this
Server:         192.168.1.100
Address:        192.168.1.100#53


Non-authoritative answer:
Name:   this.server.fqdn.compute.internal
Address: 172.31.10.196


[root@ec2-user]# nslookup this.server.fqdn 
Server:         192.168.1.100
Address:        192.168.1.100#53


** server can't find this.server.fqdn: NXDOMAIN


[root@ip-172-31-10-196 ec2-user]# nslookup this
Server:         192.168.1.100
Address:        192.168.1.100#53


Non-authoritative answer:
Name:   this.server.fqdn.compute.internal
Address: 172.31.10.196


[root@ec2-user]# nslookup 192.168.1.100
Server:         192.168.1.100
Address:        192.168.1.100#53


Non-authoritative answer:
100.1.168.192.in-addr.arpa      name = ip-192.168.1.100.us-west-2.compute.internal.


Authoritative answers can be found from:

Obviously this output is obfuscated... I'm happy to share the real output privately if that helps. I'm running on Amazon, on an EC2 cluster.

Super Collaborator

@Matt Andruff

Maybe you can verify the following point. In your initial log I can see this entry:

18/01/30 20:50:46 INFO RMProxy: Connecting to ResourceManager at this.server.fqdn/192.168.1.100:8032

So to me it looks like it tries to connect using the IP address. If that is true, and the reverse lookup by the IP doesn't return the name this.server.fqdn, the ticket that was granted for yarn/this.server.fqdn@MYREALM.INTERNAL can't be accepted.

And from what I can see in your output, this is the case (if this is a result of the obfuscation, just correct me):

  • this.server.fqdn => 192.168.1.100
    ok
  • this => 192.168.1.100
    ok
  • 192.168.1.100 => ip-192.168.1.100.us-west-2.compute.internal <> this.server.fqdn
    not ok; this is the potential root cause of your authentication issue.

What should help in that case:

  1. Make sure your client uses the server name instead of the IP, so that the reverse lookup will not be invoked
  2. Ensure that the reverse lookup results in the name this.server.fqdn (sometimes this is not possible due to network topology)
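The mismatch described in this reply can be pictured with a small Python sketch. It is a simplified model of the reasoning, not Hadoop's or the JDK's actual code: it assumes principals have the form primary/instance@REALM, and the helper names `split_principal` and `host_matches_ticket` are invented for the example.

```python
def split_principal(principal):
    """Split 'primary/instance@REALM' into (primary, instance, realm).

    Assumes the realm is present; principals without '@' are not handled here.
    """
    name, _, realm = principal.rpartition("@")
    primary, _, instance = name.partition("/")
    return primary, instance, realm


def host_matches_ticket(ticket_principal, resolved_host):
    """True if the host part of the ticket's principal equals the resolved host name.

    The comparison ignores case and a trailing dot, as DNS answers may carry one.
    """
    _, instance, _ = split_principal(ticket_principal)
    return instance.lower() == resolved_host.rstrip(".").lower()
```

Under this model, `host_matches_ticket("yarn/this.server.fqdn@MYREALM.INTERNAL", "this.server.fqdn")` is true, while passing the reverse-lookup result `ip-192.168.1.100.us-west-2.compute.internal.` gives false, which is the situation both suggestions above try to avoid.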

Expert Contributor

I rebooted all services that had keytabs, and then I was able to connect. The error stopped. Thanks for the responses.