Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Sqoop Import Job from Oracle Database is not working as expected

Sqoop Import Job from Oracle Database is not working as expected

New Contributor

 

I have posted this in Hue community and I was told that Sqoop 2 wouldn't work with Oracle Database and I need to check here. Below is what I posted there.

 

We are trying to do an Import of Tables from Oracle Database into Hue using Sqoop 2 and the tables are coming in as blocks without column names.  I have tried lot of things, but not able to figure out on how to get on atleast like a sensible format, where we will be able to query this data in Hive.

 

There is also one more thing, In Table SQL Statement , I have given  " Select * from Table_Name; " and it returned an error saying  "SQL statement must contain placeholder for auto generated conditions - ${CONDITIONS}"

 

So, It would be helpful. If anyone could point me on how to get this done, by using SQL queries and Importing tables as needed. Also what does these below options do? Your help is appreciated.

 

Table column names

Partition column name
Nulls in partition column
Boundary query
 
 
Abhishek
1 REPLY 1
Highlighted

Re: Sqoop Import Job from Oracle Database is not working as expected

Expert Contributor
There is an open Jira for getting Sqoop2 to output data in a format that Hive will accept: https://issues.apache.org/jira/browse/SQOOP-1579. In reality, this shouldn't stop you, but the Jira does have an example that you might be able to use. This will affect you if you have content that needs to be escaped though. I'm on it!

Here's an example of "Table SQL Statement": "SELECT * FROM example WHERE ${CONDITIONS}".

- "Table column names" is used to provide a list of table columns from a table that should be extracted. This should not be used in conjunction with "Table SQL Statement". It should only be used if you provide a "Table name".
- "Partition column name" defines which column should be used to partition the data transfer operation. By default, the primary key of the table will be used. If your table doesn't have a primary key, then this must be set.
- "Null in partition column" is a boolean that defines whether the partition column should accept NULL values.

Hope this helps...