Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Sqoop cannot load a driver class, SQL Server, when creating a connection

avatar
Explorer

Hi Foks, I am not that savy in Linux.

 

I am trying to creat a new connection to SQL Server on my laptop and Cloud. Niether of them working.

 

Here is what I have done:

- Using Quickstart VM CDH 4.6

- Usoing VMWare Player for quess session

- Installed/copied sqljdbc.jar in /usr/lib/sqoop/lib/

- export CLASSPATH=/usr/lib/sqoop/lib/sqljdbc4.jar in a terminal

- run this command on the same terminal... $ sqoop eval --connect jdbc:sqlserver://XXXX.cloudapp.net,1433\\XXXX(instance name) --username <myuser> --password <maypassword> --driver com.microsoft.sqlserver.jdbc.SQLServerDriver --query "SELECT * FROM [ContosoRetailDW].[dbo].[DimAccount]"

- I do the same for local machine, the host machine

 

Here is the error i get for both cases:

 WARN tool.EvalSqlTool: SQL exception executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: The connection to the host dssvmprod.cloudapp.net,1433, named instance DSSPRODDB failed. Error: "java.net.UnknownHostException: dssvmprod.cloudapp.net,1433". Verify the server and instance names and check that no firewall is blocking UDP traffic to port 1434.  For SQL Server 2005 or later, verify that the SQL Server Browser Service is running on the host.

 

Info...

I can connect to Cloud database from guess VM without any issue

 

Here are my questions:

1- What am I doing wrong connecting from a terminal?

2- When I open another terminal the $CLASSPATH won't be recognized. How can I make the CLASSPATH perminant.

2- What should I do or set that the SQL Server driver can be loaded by Sqoop2 user interface? Does sqoop2 interface uses $CLASSPATH?

 

Many thanks for your help

 

 

1 ACCEPTED SOLUTION

avatar
Explorer

Finally I got it working:

 

Lessons learned.....

 

Import table to HDFS using sqoop2:

  • Create      connection ….
    • Name:       Connection1
    • JDBC       Driver: com.microsoft.sqlserver.jdbc.SQLServerDriver
    • JDBC       Connection String:       jdbc:sqlserver://192.168.1.9\DSSSQL2012
    • User Name: yourDBUsername
    • Password: YourDBPassword
  • Create job, after clicking "Create Job" ….
    • Job Name: TableLoad
    • Connection:       Connection1
    • Schema       Name: Leave it blank/null
    • Table Name: DimAccount
    • OR Table       SQL Statement: SELECT+*+FROM+DimAccount+where+${CONDITIONS}
    • Table Column Name: AccountKey,AccountDescription,AccountType
    • Partition Column Name:       AccountKey
    • Null in Partition Column: No     
    • Boundary Query: Select min(AccountKey),max(AccountKey) from DimAccount
    • Storage Type: HDFS
    • Output Format: Text File
    • Output Directory:  /tmp/sqoop2-outpout/TableLoad1 ; This is a HDFS directory and you should not create it. The job process will create the folder and related files. The node folder should not exist
    • Extractors:  ?? I don't what this is, and is not a show stopper
    • Loaders: ??  I don't what this is, and is not a show stopper

Goodluck to the new explorers....

View solution in original post

9 REPLIES 9

avatar

Can you try using the ip address instead of the name?  See if that works...  I've noticed in our environment, it's more reliable to use the ip address rather than the name itself.

avatar
Explorer

Thanks Jason for your reply.

 

After lots of trial and error, we got all the jinx out or sqoop functionality from terminal. We can import/export from table, database and with SQL scripts. However, we cannot do the same thing from sqoop UI. In Hue/sqoop  interface, we can create connection without any error; create a job without any error; running a job without any error; however, no outputs. As I have mentioned, we have created many outputs in /usr/Cloudera/{one folder for each job} when running sqoop from terminal.

 

Any idea  where we should look for the issue?

 

Thanks.

 

avatar

First, I'd log into Cloudera Manager, and go to the Diagnostics link at the top, and select logs.  Then deselect "select/check all" and then specifically select Sqoop.  Do a search on those logs at either the warning or error level. It should tell you what is happening while you're running the sqoop job.

 

One thing to note, within the Sqoop UI (maybe there's a configuration setting somewhere) when you specify the output directory, it has to be a directory that doesn't already exist.  Every time the job runs it will try to create that directory, and if it exists it will err.

 

 

avatar
Explorer

FYI.... here is how one can get the logfile of the job; http://localhost.localdomain:8888/logs

 

I will let you how I resolve the issue, if and when I do.

 

Thanks

avatar
Explorer

Finally I got it working:

 

Lessons learned.....

 

Import table to HDFS using sqoop2:

  • Create      connection ….
    • Name:       Connection1
    • JDBC       Driver: com.microsoft.sqlserver.jdbc.SQLServerDriver
    • JDBC       Connection String:       jdbc:sqlserver://192.168.1.9\DSSSQL2012
    • User Name: yourDBUsername
    • Password: YourDBPassword
  • Create job, after clicking "Create Job" ….
    • Job Name: TableLoad
    • Connection:       Connection1
    • Schema       Name: Leave it blank/null
    • Table Name: DimAccount
    • OR Table       SQL Statement: SELECT+*+FROM+DimAccount+where+${CONDITIONS}
    • Table Column Name: AccountKey,AccountDescription,AccountType
    • Partition Column Name:       AccountKey
    • Null in Partition Column: No     
    • Boundary Query: Select min(AccountKey),max(AccountKey) from DimAccount
    • Storage Type: HDFS
    • Output Format: Text File
    • Output Directory:  /tmp/sqoop2-outpout/TableLoad1 ; This is a HDFS directory and you should not create it. The job process will create the folder and related files. The node folder should not exist
    • Extractors:  ?? I don't what this is, and is not a show stopper
    • Loaders: ??  I don't what this is, and is not a show stopper

Goodluck to the new explorers....

avatar
New Contributor

Which connector was used to solve this issue ? I am using the generic jdbc connector that came with my sqoop 1.99.5. I have no luck with the class name of the microsoft jdbc driver that you mentioned. It says the class is not found. I am assuming it has to do with the connector and not the driver. As I have placed my sqlserver jar file in sqoop library. However, in my cloudera quickstart terminal,when I list files in the sqoop library, I find that the "mysql-connector-java.jar" is color coded in blue and "sqljdbc41.jar" is color coded in red. Any light on what does this mean? 

 

Thank you for your time.

 

avatar
Champion

@Tachyons what database do you want to connect ? 

avatar
New Contributor

My database runs on Microsoft Sql Server.

avatar
New Contributor

Hello folks, I am facing knowledge gap while trying to work with Cloudera CHD with AWS services . The issues enlisted down, I am pretty sure these have been faced and resolved, given the popularity of cloud hosted ms-sql servers. Would be grateful to recieve help

  1. I was able to solve the "class not found"issue by downloading another jar file from this website : http://www.java2s.com/Code/Jar/s/Downloadsqljdbc420jar.htm  and added to the sqoop2 library. This is done with the Generic-jdbc- connector.
  2. However, I am not able to establish the connection , and I suspect the problem is with the JDBC connection string.
  3. I am supposed to connect  from the sqoop's Cloudera Quickstart VM to MS-SQL server that is running on the AWS RDS instance.
  4. So the syntax  of the Connection string I have given is as follows:   jdbc:sqlserver://<RDS end point DNS>:<Port Name>/<DataBase name> 
  5. I have configured the security group of my RDS instance to allow outbound connections, so this could rule out the firewall issues(I suppose)
  6. I have an understanding void with the  RDS end point DNS value being resolved into an IP address 
  7. The IP address the DNS lookup provides isn't static and tend to chane for failover reasons, could this be the issue to establish a connection?

Thank you for your time.