It seems that in Docker the PySpark (2.3.0) shell in local client mode works and is able to connect to Hive. However, issuing spark-submit with all dependencies fails with the error below:
    20/08/24 14:03:01 INFO storage.BlockManagerMasterEndpoint: Registering block manager test.server.com:41697 with 6.2 GB RAM, BlockManagerId(3, test.server.com, 41697, None)
    20/08/24 14:03:02 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
    20/08/24 14:03:02 INFO hive.metastore: Trying to connect to metastore with URI thrift://metastore.server.com:9083
    20/08/24 14:03:02 ERROR transport.TSaslTransport: SASL negotiation failure
    javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
        at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
Running a simple Pi example via spark-submit in yarn-cluster mode through a PySpark script works fine with no Kerberos issues, but as soon as the job tries to access the Hive metastore I get the Kerberos error above.
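For context, the failing script talks to Hive roughly like this (a minimal sketch; the app name is arbitrary and the actual script does more than list databases):

    from pyspark.sql import SparkSession

    # Enabling Hive support makes the session connect to the Hive
    # metastore over thrift, which is where the GSS initiate error occurs.
    spark = (SparkSession.builder
             .appName("hive-metastore-test")
             .enableHiveSupport()
             .getOrCreate())

    # The SASL negotiation failure is raised on the first metastore call.
    spark.sql("SHOW DATABASES").show()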
Can anyone shed some light on how to fix this? The keytab is fine, because Hadoop can be accessed from the Docker terminal. Aren't Kerberos tickets managed automatically by YARN? I tried passing a keytab and principal (roughly as shown below), but it did not help. What seems to be the issue here?
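For reference, this is roughly how I passed the keytab and principal to spark-submit (a sketch; the keytab path, principal, and script name are placeholders for my actual values):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --principal user@EXAMPLE.COM \
      --keytab /etc/security/keytabs/user.keytab \
      hive_test.py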