- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
*Closed* : How to import mutilple tables (not all tables) in Sqoop
- Labels:
-
Apache Hive
-
Apache Sqoop
Created ‎06-04-2018 12:40 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I want to import specific tables from MsSQL to HIVE database. Although I have tried --exclude-tables arguments for excluding some table while importing.
But my scenario is a little bit different.
For Example, If I have 100 tables and I want to import only 98 tables then I could exclude those 2 tables using --exclude-tables argument. But what if I want to import only 2 tables among those 100 tables.
I tried to give multiple tables names in --table argument.
My observation:
import-all-tables --connect jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name> --username xxxxx --password xxxx --table mssql_table1,mssql_table2 --hive-import --hive-database <hive_database_name> --fields-terminated-by "," -m 1
Does anyone have an idea?
Regards,
Jay.
Created ‎06-04-2018 01:04 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
One way of doing is using --exclude-tables argument and mentioning all the tables with comma separated values.
(or)
Writing shell script to to have the required two tables and iterating sqoop job on these tables.
Created ‎06-04-2018 01:04 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
One way of doing is using --exclude-tables argument and mentioning all the tables with comma separated values.
(or)
Writing shell script to to have the required two tables and iterating sqoop job on these tables.
Created ‎06-04-2018 02:02 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hey @Shu, I have gone through both links which you have given.
This link this for "how to import data from multiple source thru Sqoop?"
In this link, there is no idea about import only specific tables using Sqoop. They created script for importing tables and aslo mentioning those 100 tables names. i.e., table1, table2,table3,table4,table5,table6,... show on.
But I don't want to mention those all tables names.
Regards,
Jay.
Created ‎06-05-2018 01:22 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Let me explain how to do your case using Shell script
1.Using whileloop:
Create a input file with all the required tables i.e
bash$ vi required_tables.txt
mssql_table1
mssql_table2
Then the shell script will read the above file data line by line and executes sqoop import for each table
bash$ vi sqoop_import_while.sh
while read line; do tableName=`echo $line | cut -d'.' -f2` sqoop import --connect jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name> --username xxxxx - -password xxxx --table $tableName --hive-import --hive-database <hive_database_name> --fields-terminated-by ',' -m 1 done </home/required_table.txt(give your required_tables.txt file path)
Now the script reads each line from required_table.txt file and assigns the value to tableName variable then we are using tableName variable in our sqoop import staetment.
Once the import is finished for the first table then script reads the next line and performs import again for the second table.
2.Using Forloop:
bash$ vi sqoop_import_for.sh declare -a req_tables=("mssql_table1" "mssql_table2") for t in "${req_tables[@]}" do sqoop import --connect jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name> --username xxxxx --password xxxx --table $t --hive-import --hive-database <hive_database_name> --fields-terminated-by ',' -m 1 done
By using this script we are not reading the table names from the file instead we are defining an array variable and iterating through all the array elements and performing sqoop import.
We have defined req_tables array and given all the required tables then performing for loop each table in req_tables array and passing the name to --table argument in sqoop import.
give the permissions to the shell script file then execute the script by using ./sqoop_import_for.sh (or) ./sqoop_import_while.sh
You can choose either of the above ways to import only the required tables using sqoop.
Let us know if you are facing issues ..!!
Created ‎07-06-2018 07:59 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎06-04-2018 01:11 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
When you want to import ONLY a subset of the database you don't use "import-all-tables"
$ sqoop import --connect jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name> --username foo --password foo --table mssql_table1,mssql_table2 --hive-import
Please try the above method and revert
Created ‎06-04-2018 01:44 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, @Geoffrey Shelton Okot, thanks
Yes , I tried --exclude-tables argument to exclude some of the tables. But what if I want to import only 2 tables out of 100 tables from MsSQL to Hive database. Do I need to mention all 98 tables names in --exclude-tables argument?
Created ‎06-22-2018 11:56 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I tried above solution, but it is also throwing below error;
18/06/22 11:53:59 ERROR manager.SqlManager: Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name '<ms_SQL_tbalename>,<ms_SQl_Tablename>'.
Regards,
Jay.
Created ‎06-04-2018 01:48 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Why go the lengthy tedious way to exclude 98 tables when you can just mention the 2 tables you want imported?
Created ‎06-04-2018 01:55 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Geoffrey Shelton Okot, thank you,
But 98 is just a number. Let's think about bigger number and change scenario.
What if MsSQL database has 500 tables and I want to import only 100 tables among them? Do I need to mention those all 100 tables in --table argument?
Regards,
Jay.
