Created 08-17-2016 07:47 PM
My Sqlserver JDBC connection works perfectly if I use SQL authentication. However, if I use JDBC connection string for the Windows authentication, it works fine if I just use sqoop list-databases, but on executing the sqoop import, it failes after initializing the map reduce job. Below is the command which I am using along with the errors. Please suggest a solution. Thanks
sqoop import --connect "jdbc:sqlserver://FQDN;integratedSecurity=true;authenticationScheme=JavaKerberos;database=database_name" --username 'Domain\user' --query 'select * from table where $CONDITIONS' --split-by Abc_Number --target-dir /user/user/Sql
INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:NNHA, Ident: (HDFS_DELEGATION_TOKEN token 2496 for ) INFO impl.YarnClientImpl: Submitted application INFO mapreduce.Job: The url to track the job: INFO mapreduce.Job: Running job: job_1 INFO mapreduce.Job: Job job_1 running in uber mode : false INFO mapreduce.Job: map 0% reduce 0% INFO mapreduce.Job: Task Id : attempt_1470416754637_0238_m_000003_0, Status : FAILED Error: java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:366c38c7-c at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:366c38c7-c at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:220) at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:165) ... 9 more Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:366c38c7-c at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2399) at com.microsoft.sqlserver.jdbc.KerbAuthentication.intAuthInit(KerbAuthentication.java:176) at com.microsoft.sqlserver.jdbc.KerbAuthentication.GenerateClientContext(KerbAuthentication.java:296) at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4084) at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3159) at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:42) at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3122) at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505) at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2444) at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1980) at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627) at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458) at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772) at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168) at java.sql.DriverManager.getConnection(DriverManager.java:664) at java.sql.DriverManager.getConnection(DriverManager.java:247) at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302) at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:213) ... 10 more Caused by: java.security.PrivilegedActionException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at com.microsoft.sqlserver.jdbc.KerbAuthentication.getClientCredential(KerbAuthentication.java:199) at com.microsoft.sqlserver.jdbc.KerbAuthentication.intAuthInit(KerbAuthentication.java:150) ... 26 more Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193) at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427) at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62) at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154) at com.microsoft.sqlserver.jdbc.KerbAuthentication$1.run(KerbAuthentication.java:189) at com.microsoft.sqlserver.jdbc.KerbAuthentication$1.run(KerbAuthentication.java:187) ... 30 more
Created 08-17-2016 08:19 PM
Hi @Khera
What JDBC driver are you using to connect to SQL Server? The one provided by MS does not support WIndows authentication. That said, you can grab another driver that does support it. You have a couple options:
<code>jdbc:jtds:sqlserver://123.123.123;instance=server1;databaseName=students;integratedSecurity=true;authenticationScheme=JavaKerberos
Created 08-17-2016 08:12 PM
@Khera according to this HCC post, Windows Authentication is not yet supported by Sqoop, with the JDBC driver provided by Microsoft, and SQL Authentication must be used.
Created 08-17-2016 08:32 PM
Thanks @slachterman. But how can I connect with the SQL server and get the list of databases with the same JDBC. This is the command we used to get the list of databases. Its just with the sqoop import, which is failing.
sqoop list-databases --connect "jdbc:sqlserver://FQDN;integratedSecurity=true;authenticationScheme=JavaKerberos;" --username 'Domain\user'
Created 08-17-2016 08:19 PM
Hi @Khera
What JDBC driver are you using to connect to SQL Server? The one provided by MS does not support WIndows authentication. That said, you can grab another driver that does support it. You have a couple options:
<code>jdbc:jtds:sqlserver://123.123.123;instance=server1;databaseName=students;integratedSecurity=true;authenticationScheme=JavaKerberos
Created 08-17-2016 08:31 PM
@Brandon Wilson Thanks for answering. We are using Microsoft JDBC Driver 4.2 to connect. We'll try with the ones you mentioned. But how can I connect with the SQL server and get the list of databases with the same JDBC. This is the command we used to get the list of databases. Its just with the sqoop import, which is failing.
sqoop list-databases --connect "jdbc:sqlserver://FQDN;integratedSecurity=true;authenticationScheme=JavaKerberos;" --username 'Domain\user'
Created 08-18-2016 12:09 PM
Based on the error, it seems that you do not have a valid Kerberos ticket. Is the machine you are initiating the Sqoop job from integrated with your Windows AD via Kerberos?
Created 08-18-2016 01:15 PM
Yes, it is configured with our Windows AD via kerberos. @Brandon Wilson
Created 08-18-2016 02:40 PM
Like I said, I do not believe you can connect from Linux to SQLServer using IWA with the sqlsever jdbc driver. I recommend that you drop the jTDS driver in your Sqoop lib dir and try using that driver.
Created 09-07-2016 02:48 PM
@Brandon Wilson @Khera @slachterman
Disclaimer: I work with Khera.
We are using the MS SQL JDBC connector found here: https://www.microsoft.com/en-us/download/details.aspx?id=11774
Specifically the sqljdbc42.jar file which supports Java 1.8 (We use Oracle Java 1.8 on all our Centos Nodes).
Our entire cluster is "kerberized", so all access is setup with Kerberos. All nodes are setup within Active Directory to allow Kerberos Delegation. Additionally all user accounts used to access the cluster are AD accounts, and will have valid kerberos tickets at login using one of two methods:
1. If they use a SSH client that supports GSSAPI, and requests delegation of tokens, they will have a delegated Kerberos token at login. For example:
[xxxxxx@xxxxxxx ~]$ klist -f Ticket cache: FILE:/tmp/krb5cc_478262072 Default principal: xxxxxx@xxxx Valid starting Expires Service principal 08/29/2016 12:54:12 08/29/2016 22:54:12 krbtgt/xxxx@xxx renew until 09/05/2016 12:54:12, Flags: FfRA
2. If they use an SSH client and login using a password (no GSSAPI), they will automatically be granted a Kerberos ticket. For example:
[xxxxx@xxxxx ~]$ klist -f Ticket cache: FILE:/tmp/krb5cc_478262072 Default principal: xxxxx@CORP.FOOTLOCKER.NET Valid starting Expires Service principal 09/07/2016 10:28:36 09/07/2016 20:28:36 krbtgt/xxxxx@xxxx renew until 09/14/2016 10:28:36, Flags: FRIA
A manual kinit is always available and works, but by default will not generate forwardable tickets (the F flag seen above)
So when he calls scoop like so:
sqoop list-databases --connect "jdbc:sqlserver://hostname.domain.com;integratedSecurity=true;authenticationScheme=JavaKerberos" --username 'DOMAIN\USER'
We know that it is using his Kerberos ticket to connect to the SQL server, as there is no password prompt and authentication is still required to get a list of databases (so I've been told). I do not have access to the SQL server in question, so I can't verify using the Windows event logs that Kerberos is in use, but it certainly looks like it works 🙂
However when we then try to import the data, the mapreduce job either does not have any kerberos tickets available or just not one for khera. I'm not familiar with the chain of services that is involved in the import (Scoop, then Yarn, then MapR?). Additionally our cluster is using Knox/Ranger which seems to further complicate things, as well as the fact that I don't have access to the SQL server nor the Hadoop processes involved so I can't do any troubleshooting myself.
I have a few questions, hopefully someone can answer them.
1. What is the chain of services involved? (Scoop submits job to YARN, Yarn runs Mapreduce etc?)
2. Once we know #1, do all of those services involved
a) Support Kerberos
B) Support Kerberos DELEGATION as the user
C). Forwarding the Kerberos delegated ticket to the NEXT process in the chain (or conversely requesting delegation).
Any help is appreciated.
Created 09-07-2016 07:06 PM
@Jonathan Kaufman if you are a Hortonworks customer, this would be great detailed information to submit to Support to assist with this issue within your environment.