
Sqoop job -exec from SQL Server into Kerberized cluster

Rising Star

Hi, good evening,

I have created a job to import data from SQL Server, and when I tried to execute the job using

sqoop job -exec job.my.Account

I am getting the following exception:

16/09/16 01:39:38 INFO hcat.SqoopHCatUtilities: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/09/16 01:39:47 INFO hcat.SqoopHCatUtilities: FAILED: SemanticException MetaException(message:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connection for super-user: hive/n02.myserver.com@MYSERVER.COM from IP xx.xx.xx.5)
16/09/16 01:39:48 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@33c7e1bb
16/09/16 01:39:48 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: HCat exited with status 64
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.executeExternalHCatProgram(SqoopHCatUtilities.java:1196)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.launchHCatCli(SqoopHCatUtilities.java:1145)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.createHCatTable(SqoopHCatUtilities.java:679)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:342)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:848)
    at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:102)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:263)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.manager.SQLServerManager.importTable(SQLServerManager.java:163)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:243)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:298)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)

The same job works fine without security turned on (i.e., without Kerberization).

I configured the following in core-site.xml:

<property>
  <name>hadoop.proxyuser.hcat.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hcat.groups</name>
  <value>*</value>
</property>
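To confirm those settings are actually being picked up, the effective client-side values can be checked with the stock hdfs getconf command (a quick sanity check, using the key names above):

# print the proxyuser values the client configuration resolves to
hdfs getconf -confKey hadoop.proxyuser.hcat.hosts
hdfs getconf -confKey hadoop.proxyuser.hcat.groups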

Can anyone help?

thanks

Ram

1 ACCEPTED SOLUTION

Rising Star

Hi, good morning. I added the following to core-site.xml and restarted HDFS, YARN, and MapReduce:

<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>

and now I am able to execute the Sqoop job.
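With that property in place, the saved job can be re-run from a Kerberos-authenticated session; a minimal sketch (the principal and keytab path are placeholders, not from the thread):

# obtain a Kerberos ticket first (principal/keytab are placeholders)
kinit -kt /etc/security/keytabs/myuser.keytab myuser@MYSERVER.COM

# then re-run the saved job from the question
sqoop job -exec job.my.Account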

thanks

ram


2 REPLIES


Contributor

I found a solution to this, provided by another user here: https://community.hortonworks.com/questions/20719/sqoop-to-sql-server-with-integrated-security.html

Basically, you can switch to the jTDS driver, which you can download here: http://jtds.sourceforge.net/

Per Rajendra Manjunath:

"

Sqoop import from SQL Server to HDFS worked by passing the authentication manually as parameters (using Windows credentials) on the SQL Server JDBC connection, as integrated security is not supported by the SQL Server driver at present due to Kerberos authentication (delegation tokens being distributed over the cluster while the MR job runs).

So we need to pass the Windows username and password, with integrated security disabled, to import the data. As the normal SQL Server driver does not support this, I used jtds.jar and a different driver class to pull the data into the Hadoop lake.

A sample command I tried on the server is as follows:

sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
 --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
 --verbose --target-dir /tmp/33 -m 1 -- --schema dbo

"

Here are some examples that worked for me:

# List databases

sqoop list-databases --connect "jdbc:jtds:sqlserver://myactivedirectorydomain.com" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P

# List tables

sqoop list-tables --connect "jdbc:jtds:sqlserver://myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username jmiller.admin -P

# Pull data example

sqoop import --table TABLENAMEHERE --connect "jdbc:jtds:sqlserver://myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P --fields-terminated-by '\001' --target-dir /user/XXXXX/20170313 -m 1 -- --schema dbo

Note: in the above examples you need to change the username to your own, and change the database name in the list-tables or import command to the one you need (the AD account you use will require access to the data).
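Since the original question used a saved Sqoop job, the jTDS import can be stored the same way; a rough sketch (the job name and password-file path are hypothetical, and --password-file reads from an HDFS path by default):

# store the DB password in HDFS with restrictive permissions (path is hypothetical)
echo -n 'XXXXXXX' | hdfs dfs -put - /user/XXXXX/.sqlserver.password
hdfs dfs -chmod 400 /user/XXXXX/.sqlserver.password

# save the import as a reusable job (job name is hypothetical)
sqoop job --create job.my.Table1 -- import \
 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
 --connection-manager org.apache.sqoop.manager.SQLServerManager \
 --driver net.sourceforge.jtds.jdbc.Driver \
 --username XXXXX --password-file /user/XXXXX/.sqlserver.password \
 --table Table1 --target-dir /tmp/33 -m 1 -- --schema dbo

# run it later by name
sqoop job -exec job.my.Table1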