Support Questions

Find answers, ask questions, and share your expertise

Incrimental update in HIVE table using sqoop

avatar
Contributor

Hi All

I have a table in oracle with only 4 columns...

Memberid --- bigint

uuid --- String

insertdate --- date

updatedate --- date

I want to import those data in HIVE table using sqoop. I create corresponding HIVE table with

create  EXTERNAL TABLE memberimport(memberid BIGINT,uuid varchar(36),insertdate timestamp,updatedate timestamp)LOCATION '/user/import/memberimport';

and sqoop command

sqoop import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username ** --password *** --hive-import --table MEMBER --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1   

Its working properly and able to import data in HIVE table.

Now I want to update this table with incremental update with updatedate (last value today's date) so that I can get day to day update for that OLTP table into my HIVE table using sqoop.

For Incremental import I am using following sqoop command

sqoop import --hive-import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username *** --password *** --table MEMBER --check-column UPDATEDATE --incremental append --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1

But I am getting exception

"Append mode for hive imports is not yet supported. Please remove the parameter --append-mode" and as per documentation in sqoop "--hive-import" does not support with "--incremental append"

When I remove the "--hive-import" it run properly and file in getting created in sqoop location in HDFS but I did not found those new update in HIVE table that I have in OLTP table.

When I run....

"select * from hive-table "

It is showing the old data only no incremental data.

Am I doing anything wrong ? Please suggest me how can I run incremental update with Oracle - Hive using sqoop.

Any help will be appropriated..

Thanks in Advance ...

1 ACCEPTED SOLUTION

avatar
Master Guru

One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.

View solution in original post

4 REPLIES 4

avatar

After the initial import, subsequent imports can leverage SQOOP’s native support for “Incremental Import” by using the “check-column”, “incremental” and “last-value” parameters.

http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/

avatar
Contributor

Thanks a lot .... You save my day .... 🙂

I missed the --target-dir.

avatar
Master Guru

Cool, but you accepted the wrong answer 🙂

avatar
Master Guru

One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.