Created 01-18-2016 06:46 AM
On HDP 2.3.2 with Sqoop 1.4.6, I'm trying to import tables from SQL Server 2008.
I'm able to successfully connect to the SQL Server because I can list databases and tables etc.
However, every single time during imports I run into the following error:
Error: java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host x.x.x.x, port 1433 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Again, I am actually able to successfully import from SQL Server, but only after a couple of retries. However, regardless of whether the import succeeded or failed, I always get the error mentioned above and I was wondering what could be causing the problem? It's rather cumbersome to have to keep repeating the imports whenever they fail.
I've already turned off the connection time-out on the SQL Server, and though the connection from the Hadoop cluster and the SQL Server passes through our corporate firewall, our admins tell me that the timeout on the firewall is 3600 seconds. The imports fail before getting anywhere near that mark.
Just an example of one of the sqoop commands I use:
sqoop import \ -D mapred.task.timeout=0 \ --connect "jdbc:sqlserver://x.x.x.x:1433;database=CEMHistorical" \ --table MsgCallRefusal \ --username hadoop \ --password-file hdfs:///user/sqoop/.adg.password \ --hive-import \ --hive-overwrite \ --create-hive-table \ --split-by TimeStamp \ --hive-table develop.oozie \ --map-column-hive Call_ID=STRING,Stream_ID=STRING,AgentGlobal_ID=STRING
Update:
After getting in touch with our network team, it seems this is most definitely a network issue. To add context, the Hadoop cluster is on a different VLAN as the SQL Server and it goes through a number of firewalls. To test, I tried importing from a different SQL Server within the same VLAN as the Hadoop cluster and I didn't encounter this exception at all.
Created 01-26-2016 08:00 AM
I never heard back from our network team with regards to firewall logs, but our NameNode's OS got corrupted and had to be reformatted and HDP reinstalled. For some reason we're not encountering this error any longer.
One difference between the original cluster and the new installation is that we had 4 nodes (1 name node and 3 data nodes) which were virtualized in a single server. Now, we're running a single node cluster (HDP 2.3.4) with no virtualization on the server.
Created 02-02-2016 03:21 PM
@Luis Antonio Torres has this been resolved? Can you post your solution or accept best answer?
Created 02-03-2016 12:05 AM
@Artem Ervits yes, this has been resolved but I didn't accept my own answer because we never found out what exactly was wrong. The issue merely stopped occurring after we reinstalled HDP on a server with no virtualization - I detail this in my answer. In any case, I've accepted my own answer as a reference, for the benefit of others.Thanks!
Created 11-10-2017 06:53 AM
@Luis Antonio Torres Listing databases (Sqoop eval) just needs the Namenode access to sql server. Whereas, for import/export (mapreduce tasks) Datanodes also need to have access to the sqlserver. I know your issue is resolved and this is a late reply but someone else might be looking for this in future.