HiveServer2 with LDAP Authentication works with beeline but fails in pyspark

I am trying to connect to a HiveServer2 instance that has LDAP authentication enabled. When logging in through beeline, I can connect successfully and query tables. However, when I try to connect via pyspark, it fails with a "Peer indicated failure: Error validating the login" error.

Interestingly, I am able to use third-party JDBC-based SQL clients such as SQLWorkbench/J or Aginity Workbench for Hadoop to connect to HiveServer2 successfully with my LDAP username and password. This suggests the problem lies in how Spark/PySpark, JDBC, and the Hive driver work together.

Note: I cannot use HiveContext to connect to Hive because it has Ranger Authorization enabled. Spark does not work with Ranger when using native Hive libraries. It has to go through Hive Server (Thrift Server) via JDBC.

Pyspark Code

from pyspark.sql import SQLContext

# sc is the SparkContext provided by the pyspark shell
sqlCtx = SQLContext(sc)
# Load the Hive table over JDBC through HiveServer2
df = (sqlCtx
    .load(source="jdbc",
          url="jdbc:hive2://thrift-server-url:10000/default?user=[ldap_user]&password=[ldap_user_password]",
          dbtable="table_name")
 )
sc.stop()

Error Stack

ERROR [HiveServer2-Handler-Pool: Thread-66]: transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: LDAP Authentication failed for user [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials]]]
        at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:109)
        at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.AuthenticationException: LDAP Authentication failed for user [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials]]
        at org.apache.hive.service.auth.LdapAuthenticationProviderImpl.Authenticate(LdapAuthenticationProviderImpl.java:185)
        at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:106)
        at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:102)
        ... 8 more
Caused by: javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials]
        at com.sun.jndi.ldap.LdapCtx.mapErrorCode(LdapCtx.java:3135)
        at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:3081)
        at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2883)
        at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2797)
        at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:319)
        at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:192)
        at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:210)
        at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:153)
        at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:83)
        at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
        at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:313)
        at javax.naming.InitialContext.init(InitialContext.java:244)
        at javax.naming.InitialContext.<init>(InitialContext.java:216)
        at javax.naming.directory.InitialDirContext.<init>(InitialDirContext.java:101)
        at org.apache.hive.service.auth.LdapAuthenticationProviderImpl.Authenticate(LdapAuthenticationProviderImpl.java:167)
        ... 10 more
2016-11-17 21:48:20,508 ERROR [HiveServer2-Handler-Pool: Thread-66]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Error validating the login
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Error validating the login
        at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        ... 4 more

Re: HiveServer2 with LDAP Authentication works with beeline but fails in pyspark

@Jayaraman Palaniappan

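Try this URL, which passes the credentials as semicolon-separated session variables instead of as a query string: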

url="jdbc:hive2://thrift-server-url:10000/default;user=[ldap_user];password=[ldap_user_password]"
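For context, the Hive JDBC URL is documented as jdbc:hive2://<host>:<port>/<db>;<session variables>?<hive conf>#<hive vars>, so anything after the "?" is read as Hive configuration rather than as login credentials. Here is a minimal sketch of assembling the URL in the semicolon form. The helper name build_hive2_url is hypothetical (it is not part of Spark or Hive), and rejecting delimiter characters is an assumption made here to avoid guessing at an escaping scheme.

```python
def build_hive2_url(host, port, database, user, password):
    """Return a HiveServer2 JDBC URL with credentials as session variables.

    Hypothetical helper for illustration; it is not part of Spark or Hive.
    """
    for name, value in (("user", user), ("password", password)):
        # ';', '?' and '#' separate sections of the URL, so reject them
        # here rather than guess at how they would need to be escaped.
        if any(ch in value for ch in ";?#"):
            raise ValueError("%s contains a URL delimiter character" % name)
    return "jdbc:hive2://%s:%d/%s;user=%s;password=%s" % (
        host, port, database, user, password)


# Placeholder values taken from the thread; substitute real ones.
url = build_hive2_url("thrift-server-url", 10000, "default",
                      "ldap_user", "ldap_user_password")
print(url)
```

The resulting string can be passed as the url argument in the pyspark snippet from the question.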