Created 07-05-2016 11:09 AM
Hi All
I have a table in oracle with only 4 columns...
Memberid --- bigint
uuid --- String
insertdate --- date
updatedate --- date
I want to import those data in HIVE table using sqoop. I create corresponding HIVE table with
create EXTERNAL TABLE memberimport(memberid BIGINT,uuid varchar(36),insertdate timestamp,updatedate timestamp)LOCATION '/user/import/memberimport';
and sqoop command
sqoop import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username ** --password *** --hive-import --table MEMBER --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1
Its working properly and able to import data in HIVE table.
Now I want to update this table with incremental update with updatedate (last value today's date) so that I can get day to day update for that OLTP table into my HIVE table using sqoop.
For Incremental import I am using following sqoop command
sqoop import --hive-import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username *** --password *** --table MEMBER --check-column UPDATEDATE --incremental append --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1
But I am getting exception
"Append mode for hive imports is not yet supported. Please remove the parameter --append-mode" and as per documentation in sqoop "--hive-import" does not support with "--incremental append"
When I remove the "--hive-import" it run properly and file in getting created in sqoop location in HDFS but I did not found those new update in HIVE table that I have in OLTP table.
When I run....
"select * from hive-table "
It is showing the old data only no incremental data.
Am I doing anything wrong ? Please suggest me how can I run incremental update with Oracle - Hive using sqoop.
Any help will be appropriated..
Thanks in Advance ...
Created 07-05-2016 11:55 AM
One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.
Created 07-05-2016 11:20 AM
After the initial import, subsequent imports can leverage SQOOP’s native support for “Incremental Import” by using the “check-column”, “incremental” and “last-value” parameters.
http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
Created 07-05-2016 02:08 PM
Thanks a lot .... You save my day .... 🙂
I missed the --target-dir.
Created 07-05-2016 02:14 PM
Cool, but you accepted the wrong answer 🙂
Created 07-05-2016 11:55 AM
One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.