Created 03-09-2017 09:42 PM
Hello,
I would like to run a command like:
sqoop import-all-tables --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://localhost/testdb --username username --password password -- --schema the_schema --hive-database TestDB --hive-import
However, the "-- --schema the_schema" is not being picked up. Instead, all the tables are assuming that "testdb" is the schema instead of "the_schema".
Is there some way to correct this? we would like to the import-all-tables to work as we have many tables that maybe changing throughout the life of the database and hadoop.
Thanks,
Marcia
Created 03-13-2017 06:16 PM
-D mapreduce.map.memory.mb=2048-D mapreduce.map.java.opts=-Xmx1024m . Change the memory acccordingly based on your cluster and try it.
Created 03-13-2017 06:44 PM
few lines prior to this?
Created 03-13-2017 06:49 PM
Can you check what it he contents of PROJECT_ID, looks like the issue is on that. What type of field is that
Created 03-13-2017 06:56 PM
project_id is INTEGER
Created 03-13-2017 07:01 PM
That is something related to DB2 you need to resolve.
-206 object-name IS NOT VALID IN THE CONTEXT WHERE IT IS USED
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/codes/src/tpc/n206.html
To isolate the problem just get that single table and see how it goes.
Created 03-13-2017 07:03 PM
I think the problem is regarding importalltables and schema...
I wish to import all the tables within one schema... is this possible using import-all-tables? Or must I go table by table??
Created 03-13-2017 07:06 PM
As I mentioned --schema is not supported by import-all-tables. Looks like it takes the db name as the schema name by default. or if you have write access to db2, create a schema with same name and create aliases/synonyms or all tables in that schema. should work.
Created 03-13-2017 07:11 PM
Hello, this sounds like an excellent idea!
Please let me know how I should do this...
Created 03-13-2017 07:19 PM
So my new schema should be called: TESTDB .
And all the tables in SCHEMA should have aliases within the new schema, TESTDB?
Also, which do I use? Alias / Synonym?
Does the username, SCHEMA, do anything with the settings as well?
I'm thinking of doing the following in DB2:
Created 03-14-2017 03:55 AM
I have found a solution to this provided by another user here: https://community.hortonworks.com/questions/20719/sqoop-to-sql-server-with-integrated-security.html
Basically if you switch to the jtds driver which you can download here: http://jtds.sourceforge.net/
"
Sqoop SQL Server data import to HDFS worked with manual parametric the authentication(using windows credential) with added parameter on the SQL Server JDBC driver, as integrated security is not supported by the SQL driver as of now due to the Kerberos authentication(Delegated tokens distributed over cluster while running MR job).
So we need to pass the windows authentication with password and with the integrated security disabled mode to import the data to the system. As normal SQL server driver does not support, so I had used the jtds.jar and the different driver class to pull the data to the Hadoop Lake.
Sample Command I tried on the server as follows,
sqoop import --table Table1 --connect "jdbc:jtds:sqlserver://<Hostname>:<Port>;useNTLMv2=true;domain=<WindowsDomainName>;databaseName=XXXXXXXXXXXXX" \
--connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX --password 'XXXXXXX' \
--verbose --target-dir /tmp/33 -m 1 -- --schema dbo
"
Here are some examples that worked for me:
# List databases
sqoop list-databases --connect "jdbc:jtds:myactivedirectorydomain.com" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P
# List tables
sqoop list-tables --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username jmiller.admin -P
# Pull data example
sqoop import --table TABLENAMEHERE --connect "jdbc:jtds:myactivedirectorydomain.com;databaseName=DATABASENAMEHERE" --connection-manager org.apache.sqoop.manager.SQLServerManager --driver net.sourceforge.jtds.jdbc.Driver --username XXXXX -P --fields-terminated-by '\001' --target-dir /user/XXXXX/20170313 -m 1 -- --schema dbo
Note* In the above example you need to change the username to your username and database name in the list-tables or pull to the one you need (note the AD account you use will require access to the data).
Created 03-14-2017 01:05 PM
Thanks, but I'm using DB2.
Also, I can import one table just fine...
I just cannot import all tables...