Created 06-22-2016 04:22 AM
Hi,
I'm trying to run sqoop incremental imports from Teradata to HDFS but running into issues with time stamp.
my sqoop command:
sqoop import \
--connect jdbc:teradata://PH/DATABASE=hadoop,CHARSET=UTF8 \
--driver "com.teradata.jdbc.TeraDriver" \
--username hadoop \
--password Hadoop \
--table item \
--target-dir /diva/item_text \
--m 1 \
--fields-terminated-by '|' \
--incremental append -check-column dw_modify_ts --last-value '2013-01-01 02:00:00'
Error Message:
Error: java.io.IOException: SQLException in nextKeyValue
Caused by: java.sql.SQLException: [Teradata Database] [TeraJDBC 15.00.00.20] [Error 6760] [SQLState HY000] Invalid timestamp.
Log Message:
-----------------------------------
16/06/22 04:13:43 INFO tool.ImportTool: Incremental import based on column dw_modify_ts
16/06/22 04:13:43 INFO tool.ImportTool: Lower bound value: '2013-01-01 02:00:00'
16/06/22 04:13:43 INFO tool.ImportTool: Upper bound value: '2013-07-10 22:56:01.0'
16/06/22 04:13:43 INFO mapreduce.ImportJobBase: Beginning import of e2_item_category_text
Here as per above log sqoop giving milliseconds as well in Upper bound Value "'2013-07-10 22:56:01.0'?
I'm suspecting that this milliseconds format having an issue while importing data from Teradata.
Can you suggest me if you have any idea?
Created 06-22-2016 05:22 AM
There appears to be a bug in Sqoop for lastmodified import from Teradata. Regarding "append" import can you try to use an INT column as "check-column".
Created 06-22-2016 04:29 AM
I'm thinking that the value of '0000-00-00 00:00:00' date is stored in our db but can't be handled by the driver.
Created 06-22-2016 05:07 AM
below script is running fine, then no issues with timestamp running scripts in Teradata, it seems issue with incremental or Upper value milliseconds format.
sqoop import \
--connect jdbc:teradata://abc/DATABASE=hadoop,CHARSET=UTF8 \
--driver "com.teradata.jdbc.TeraDriver" \
--username hadoop \
--password 00000 \
--query "select * from text where dw_load_ts >='2016-01-01 00:00:00' AND \$CONDITIONS" \
--target-dir /diva/text \
--m 1
Created 06-22-2016 04:37 AM
Changed incremental to lastmodified "--incremental lastmodified" in sqoop script but still getting an error.
16/06/22 04:35:16 ERROR manager.SqlManager: SQL exception accessing current timestamp: java.sql.SQLException: [Teradata Database] [TeraJDBC 15.00.00.20] [Error 3706] [SQLState 42000] Syntax error: expected something between '(' and ')'. java.sql.SQLException: [Teradata Database] [TeraJDBC 15.00.00.20] [Error 3706] [SQLState 42000] Syntax error: expected something between '(' and ')'.
ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Could not get current time from database
Created 06-22-2016 05:22 AM
There appears to be a bug in Sqoop for lastmodified import from Teradata. Regarding "append" import can you try to use an INT column as "check-column".
Created 06-22-2016 06:04 AM
Thanks Predrag.
it's working for "append" and INT column as "check-column"
another quick question:
Do we need to pass "--last -value" every time based on previous "Upper bound value:####"?
Created 06-22-2016 07:28 AM
Glad to hear it worked! Regarding "last-value" it's the best to create a Sqoop job, for example
sqoop job --create myjob -- import --connect jdbc:teradata ...
and then just execute "sqoop job --exec myjob" every time. Sqoop will memorize last-value in its internal storage. For a "lastmodified" example see the answer to this post.