Created 03-02-2016 09:19 AM
I need to use sqoop on linux to pull data from SQL Server running with integrated security. Can anyone confirm that they have made this work with HDP 2.3.4 and share the steps?
Created 02-10-2017 07:38 AM
@Justin MillerI too faced the same problem, finally i resolved the issue with the below mention procedure,it may help you i hope.
Sqoop SQL Server data import to HDFS worked with manual parametric the authentication(using windows credential) with added parameter on the SQL Server JDBC driver, as integrated security is not supported by the SQL driver as of now due to the Kerberos authentication(Delegated tokens distributed over cluster while running MR job).
So we need to pass the windows authentication with password and with the integrated security disabled mode to import the data to the system. As normal SQL server driver does not support, so I had used the jtds.jar and the different driver class to pull the data to the Hadoop Lake.
Sample Command I tried on the server as follows,
sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
--connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
--verbose --target-dir /tmp/33 -m 1 -- --schema dbo
Created 03-02-2016 09:27 AM
See this https://alexeikh.wordpress.com/2012/05/03/using-sqoop-for-moving-data-between-hadoop-and-sql-server/
You can run sqoop in linux and read sql server to get data into Hadoop
Aslo,
https://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html
For example, to connect to a SQLServer database, first download the driver from microsoft.com and install it in your Sqoop lib path.
Then run Sqoop. For example:
$ sqoop import --driver com.microsoft.jdbc.sqlserver.SQLServerDriver \ --connect <connect-string> ...
When connecting to a database using JDBC, you can optionally specify extra JDBC parameters via a property file using the option --connection-param-file
. The contents of this file are parsed as standard Java properties and passed into the driver while creating a connection.
Note | |
---|---|
The parameters specified via the optional property file are only applicable to JDBC connections. Any fastpath connectors that use connections other than JDBC will ignore these parameters. |
Created 03-02-2016 09:33 AM
Thanks Neeraj - how does this help with integrated security?
Created 08-05-2016 08:48 AM
It does not address the integrated authentication problem. Has anyone tried this one instead?
https://datamountaineer.com/2016/01/15/spark-jdbc-sql-server-kerberos/
Created 03-02-2016 10:06 AM
Created 03-02-2016 09:05 PM
@awhitter as far as I know this is not possible. You will have to use SQL Authentication. If security is a concern, you can store the password in a encrypted file and call the file in the Sqoop -import command. If you are unable to change the authentication settings on SQL Server due to corporate policies, you may need to dump the tables as files first and import them directly into Hive.
Created 03-17-2016 03:22 PM
I am also waiting the same problem with jbdc42.jar and below is the error stack -
16/03/17 15:51:39 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: This driver is not configured for integrated authentication. java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: This driver is not configured for integrated authentication. at org.apache.sqoop.manager.CatalogQueryManager.listTables(CatalogQueryManager.java:118) at org.apache.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49) at org.apache.sqoop.Sqoop.run(Sqoop.java:148) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235) at org.apache.sqoop.Sqoop.main(Sqoop.java:244) Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: This driver is not configured for integrated authentication. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:170) at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:2338) at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:1929) at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$000(SQLServerConnection.java:41) at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:1917) at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4026) at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1416) at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1061) at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:833) at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:716) at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:841) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:233) at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:883) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.CatalogQueryManager.listTables(CatalogQueryManager.java:102) ... 7 more
Can some one please help ?
Created 03-29-2016 04:39 AM
Sqoop currently does not support SSO. I got an individual use created and now it works. Thanks
Created 08-12-2016 07:51 PM
I am running into the same problem. Tried by passing the driver --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" along with the sqoop command and get this error.
Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:
Can anyone help, if this is resolved.
Created 08-12-2016 07:52 PM
I am running into the same problem. Tried by passing the driver --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" along with the sqoop command and get this error.
Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:
Can anyone help, if this is resolved.
Created 09-02-2016 01:47 PM
Integrated authentication does not work with SQLServer even in a secure cluster with AD integration as the containers will not have the context (even in secure windows clusters, the impersonation level does not support this)
You can use the password alias (which makes use of the Hadoop credentials support) to use SQL server authentication without exposing the passwords.
Created 09-12-2016 01:28 PM
I am interested too, we have been advised to use either Progress DataDirect driver or the JTDC Driver
Created 02-09-2017 06:19 PM
I spent months on this and was told by Cloudera that it is just not possible. The kerberos tokens are lost when the mappers spawn (as the YARN transitions the job to its internal security subsystem).
I think this is something that would have to be added to Sqoop itself, much like HBase mapreduce jobs have to pass job config to mappers in the code Sqoop able to do the same.
I spent weeks trying different things, doing network traces (all my servers are tied to AD with Centrify, ext), watched the token drop, and finally found a paper written by Yahoo about YARN that explains the internal token subsystem it uses (after using Kerberos to externally verify someone it moves to different token based subsystem for performance).
My company would be willing to pay some $$ to get this working. All our data is being held hostage in SQL Server (from a company we are merging with).
Created 02-10-2017 07:38 AM
@Justin MillerI too faced the same problem, finally i resolved the issue with the below mention procedure,it may help you i hope.
Sqoop SQL Server data import to HDFS worked with manual parametric the authentication(using windows credential) with added parameter on the SQL Server JDBC driver, as integrated security is not supported by the SQL driver as of now due to the Kerberos authentication(Delegated tokens distributed over cluster while running MR job).
So we need to pass the windows authentication with password and with the integrated security disabled mode to import the data to the system. As normal SQL server driver does not support, so I had used the jtds.jar and the different driver class to pull the data to the Hadoop Lake.
Sample Command I tried on the server as follows,
sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
--connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
--verbose --target-dir /tmp/33 -m 1 -- --schema dbo
Created 03-13-2017 07:04 PM
Greetings Rajendra,
You have saved my life, your solution works!!! We have spent two years on this thinking there was no way to get Sqoop working, I want to buy you a drink!!
Created 03-14-2017 05:08 AM
@Justin MillerThank you..!!!
This Hortonwork community blog is for mutual knowledge exchange for solving problems, should appreciate all for their time in answering our question.
Created 06-01-2017 07:54 PM
It works, Thank you!!
Created 03-14-2017 03:54 AM
I have found a solution to this provided by another user here: https://community.hortonworks.com/questions/20719/sqoop-to-sql-server-with-integrated-security.html
Basically if you switch to the jtds driver which you can download here: http://jtds.sourceforge.net/
"
Sqoop SQL Server data import to HDFS worked with manual parametric the authentication(using windows credential) with added parameter on the SQL Server JDBC driver, as integrated security is not supported by the SQL driver as of now due to the Kerberos authentication(Delegated tokens distributed over cluster while running MR job).
So we need to pass the windows authentication with password and with the integrated security disabled mode to import the data to the system. As normal SQL server driver does not support, so I had used the jtds.jar and the different driver class to pull the data to the Hadoop Lake.
Sample Command I tried on the server as follows,
sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
--connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
--verbose --target-dir /tmp/33 -m 1 -- --schema dbo
"
Here are some examples that worked for me:
# List databases
sqoop list-databases --connect "jdbc:jtds:myactivedirectorydomain.com" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P
# List tables
sqoop list-tables --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username jmiller.admin -P
# Pull data example
sqoop import --table TABLENAMEHERE --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P --fields-terminated-by '\001' --target-dir /user/XXXXX/20170313 -m 1 -- --schema dbo
Note* In the above example you need to change the username to your username and database name in the list-tables or pull to the one you need (note the AD account you use will require access to the data).
Created 05-13-2020 11:56 PM
Yes this is possible.
You need to kinit with the username that has been granted access to the SQL server DB and tables. integrated security passes your credentials to the SQL server using kerberos
"jdbc:sqlserver://sername.domain.co.za:1433;integratedSecurity=true;databaseName=SCHEMA;authenticationScheme=JavaKerberos;"
This worked for me.
Created 05-29-2021 02:37 PM
@Onedile wrote:Yes this is possible.
You need to kinit with the username that has been granted access to the SQL server DB and tables. integrated security passes your credentials to the SQL server using kerberos
"jdbc:sqlserver://sername.domain.co.za:1433;integratedSecurity=true;databaseName=SCHEMA;authenticationScheme=JavaKerberos;"
This worked for me.
It doesn't work, it's still facing issue with the latest MSSQL JDBC driver as the Kerberos tokens are lost when the mappers spawn (as the YARN transitions the job to its internal security subsystem)
21/05/29 19:00:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1616335290043_2743822
21/05/29 19:00:40 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for c795701: HDFS_DELEGATION_TOKEN owner=c795701@XX.XXXX.XXXXXXX.COM, renewer=yarn, realUser=, issueDate=1622314832608, maxDate=1622919632608, sequenceNumber=29194128, masterKeyId=1856)]
21/05/29 19:01:15 INFO mapreduce.Job: Task Id : attempt_1616335290043_2743822_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:53879236-81e7-4fc6-88b9-c7118c02e7be
Caused by: java.security.PrivilegedActionException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Use the jtds driver as recommended here