How to import multiple selected tables, not all tables, in Sqoop

Explorer

Hi, I have tried importing data from Oracle and MySQL into Hive and HDFS using Sqoop. I want to import a selected set of tables, and only selected columns from those tables.

 

I did not find anything about this. A post on Stack Overflow says it cannot be done: http://stackoverflow.com/questions/17194232/sqoop-import-multiple-tables

 

I want to confirm this. Please share any pointers.

I am not restricted to Sqoop; I am open to any technology that meets my requirement.

My requirement is to import data from multiple tables in an Oracle DB into a NoSQL DB like Hive, run queries in Hive, and generate reports fast. In Oracle, some reports take 30-40 hours.

 

 

1 ACCEPTED SOLUTION

Cloudera Employee

Hi Bas, this sounds like a good scenario for the import-all-tables tool (http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal) together with the --exclude-tables <tables> parameter, which takes a comma-separated list of tables to exclude from the import.

 

Regards, Kathleen
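
For readers following along, here is a minimal sketch of what such a command could look like. The host, service name, user name, and excluded table names are placeholders that do not come from this thread, and the script only prints the command so it can be reviewed before anything runs.

```shell
# Sketch: import every table except a chosen few.
# dbhost, ORCL, report_user, AUDIT_LOG and TEMP_STAGE are placeholders.

# Join the arguments with commas, the form --exclude-tables expects.
join_by_comma() {
  local IFS=,
  echo "$*"
}

# Print (not run) a sqoop import-all-tables command.
# Usage: build_import_all_cmd <jdbc-url> <user> <table-to-exclude>...
build_import_all_cmd() {
  local connect="$1" user="$2"
  shift 2
  echo sqoop import-all-tables \
    --connect "$connect" \
    --username "$user" -P \
    --exclude-tables "$(join_by_comma "$@")" \
    --hive-import -m 1
}

build_import_all_cmd "jdbc:oracle:thin:@//dbhost:1521/ORCL" "report_user" AUDIT_LOG TEMP_STAGE
```

Once the printed command looks right, it can be pasted into a shell. -P makes Sqoop prompt for the password instead of taking it on the command line.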


2 REPLIES

Explorer

Hi Kathleen, thanks for the reply. When importing a single table with the command below, I can import only the required columns.

 

sqoop-import --connect jdbc:oracle:thin:@****************:1521/** --username ** --password ** --table REGACCOUNT --columns ACCOUNTNUMBER,ISREGISTERED,COMPANYNAME,DOMAINNAME -m 1

 

As you suggested, using the --exclude-tables parameter I can import multiple tables, which is a very helpful pointer, thank you. Going further, is there any option to select only the required columns from those tables? Or would you suggest importing all the required tables into Hive and then adjusting our Hive queries to generate the reports (the insurance DB has large columns, such as id, that are not useful)?
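
One possible approach, offered as a sketch rather than a confirmed answer from this thread: the Sqoop user guide describes import-all-tables as importing all columns of each table, so per-table column selection would likely mean running the single-table import once per table. In the example below only REGACCOUNT and its columns come from the post above. The POLICY table, its columns, the host, and the user name are hypothetical. The loop prints each command instead of running it.

```shell
# Sketch: one `sqoop import` per table, each with its own --columns list.
# Only REGACCOUNT and its columns come from the thread; everything else
# (dbhost, ORCL, report_user, POLICY and its columns) is a placeholder.

CONNECT="jdbc:oracle:thin:@//dbhost:1521/ORCL"
DB_USER="report_user"

# One entry per line, in the form TABLE:COL1,COL2,...
TABLE_SPECS="REGACCOUNT:ACCOUNTNUMBER,ISREGISTERED,COMPANYNAME,DOMAINNAME
POLICY:POLICYNUMBER,STATUS"

# Print (not run) the import command for a single TABLE:COLUMNS entry.
build_import_cmd() {
  local spec="$1"
  local table="${spec%%:*}"    # text before the first colon
  local columns="${spec#*:}"   # text after the first colon
  echo sqoop import \
    --connect "$CONNECT" --username "$DB_USER" -P \
    --table "$table" --columns "$columns" \
    --hive-import -m 1
}

printf '%s\n' "$TABLE_SPECS" | while IFS= read -r spec; do
  build_import_cmd "$spec"
done
```

Because -P prompts for the password on every run, an unattended loop would need a different way of supplying credentials. The alternative raised in the question, importing whole tables and selecting columns in the Hive queries, avoids the loop at the cost of copying unused columns.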