Sqoop import all tables
Labels: Apache Sqoop
Created ‎03-09-2017 09:42 PM
Hello,
I would like to run a command like:
sqoop import-all-tables --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://localhost/testdb --username username --password password -- --schema the_schema --hive-database TestDB --hive-import
However, the "-- --schema the_schema" argument is not being picked up. Instead, the import assumes that "testdb" is the schema for every table instead of "the_schema".
Is there some way to correct this? We would like import-all-tables to work, as we have many tables that may change over the life of the database and Hadoop.
Thanks,
Marcia
Created ‎03-13-2017 06:16 PM
Try adding -D mapreduce.map.memory.mb=2048 -D mapreduce.map.java.opts=-Xmx1024m. Change the memory accordingly based on your cluster and try it.
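For reference, Sqoop's generic Hadoop options such as -D have to come immediately after the tool name, before the tool-specific arguments. A rough sketch of how those flags might be combined with the original command, assuming the same connection settings as in the question:
sqoop import-all-tables -D mapreduce.map.memory.mb=2048 -D mapreduce.map.java.opts=-Xmx1024m --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://localhost/testdb --username username --password password --hive-database TestDB --hive-import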
Created ‎03-13-2017 06:44 PM
Can you post a few lines of the output prior to this?
Created ‎03-13-2017 06:49 PM
Can you check what the contents of PROJECT_ID are? It looks like the issue is with that column. What type of field is it?
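If it helps, the column definition can be checked from the DB2 catalog. A sketch, where MY_TABLE and THE_SCHEMA are placeholders for the actual table and schema names:
SELECT COLNAME, TYPENAME, LENGTH, NULLS FROM SYSCAT.COLUMNS WHERE TABSCHEMA = 'THE_SCHEMA' AND TABNAME = 'MY_TABLE' AND COLNAME = 'PROJECT_ID';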
Created ‎03-13-2017 06:56 PM
project_id is INTEGER
Created ‎03-13-2017 07:01 PM
That is something related to DB2 you need to resolve.
-206 object-name IS NOT VALID IN THE CONTEXT WHERE IT IS USED
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/codes/src/tpc/n206.html
To isolate the problem, just import that single table and see how it goes.
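A sketch of a single-table import for isolating the issue, where MY_TABLE is a placeholder and the connection settings are assumed from the original question:
sqoop import --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://localhost/testdb --username username --password password --table MY_TABLE --hive-database TestDB --hive-import -- --schema the_schema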
Created ‎03-13-2017 07:03 PM
I think the problem is regarding import-all-tables and schema...
I wish to import all the tables within one schema... Is this possible using import-all-tables? Or must I go table by table?
Created ‎03-13-2017 07:06 PM
As I mentioned, --schema is not supported by import-all-tables. It looks like it takes the database name as the schema name by default. Alternatively, if you have write access to DB2, create a schema with the same name and create aliases/synonyms for all tables in that schema. That should work.
Created ‎03-13-2017 07:11 PM
Hello, this sounds like an excellent idea!
Please let me know how I should do this...
Created ‎03-13-2017 07:19 PM
So my new schema should be called: TESTDB .
And all the tables in SCHEMA should have aliases within the new schema, TESTDB?
Also, which do I use? Alias / Synonym?
Does the username, SCHEMA, do anything with the settings as well?
I'm thinking of doing the following in DB2:
- create schema TESTDB;
- create alias TESTDB.RC_SUM for originalSchema.RC_SUM;
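Since there are many tables, one option is to generate the alias statements from the DB2 catalog instead of writing them by hand. A sketch, assuming the source schema is named THE_SCHEMA and using the SYSCAT.TABLES catalog view:
SELECT 'CREATE ALIAS TESTDB.' || TABNAME || ' FOR THE_SCHEMA.' || TABNAME || ';' FROM SYSCAT.TABLES WHERE TABSCHEMA = 'THE_SCHEMA' AND TYPE = 'T';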
Created ‎03-14-2017 03:55 AM
I have found a solution to this, provided by another user here: https://community.hortonworks.com/questions/20719/sqoop-to-sql-server-with-integrated-security.html
Basically, you switch to the jTDS driver, which you can download here: http://jtds.sourceforge.net/
"
Sqoop SQL Server data import to HDFS worked with manual parametric authentication (using Windows credentials) and an added parameter on the SQL Server JDBC driver, as integrated security is not supported by the SQL driver as of now due to Kerberos authentication (delegated tokens distributed over the cluster while running the MR job).
So we need to pass the Windows authentication with a password, with integrated security disabled, to import the data into the system. As the normal SQL Server driver does not support this, I used jtds.jar and a different driver class to pull the data into the Hadoop lake.
Sample command I tried on the server is as follows:
sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
--connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
--verbose --target-dir /tmp/33 -m 1 -- --schema dbo
"
Here are some examples that worked for me:
# List databases
sqoop list-databases --connect "jdbc:jtds:myactivedirectorydomain.com" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P
# List tables
sqoop list-tables --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username jmiller.admin -P
# Pull data example
sqoop import --table TABLENAMEHERE --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P --fields-terminated-by '\001' --target-dir /user/XXXXX/20170313 -m 1 -- --schema dbo
Note: In the above examples you need to change the username to your username, and the database name in the list-tables or import commands to the one you need (note that the AD account you use will require access to the data).
Created ‎03-14-2017 01:05 PM
Thanks, but I'm using DB2.
Also, I can import one table just fine...
I just cannot import all tables...
