Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Incrimental update in HIVE table using sqoop

avatar
Contributor

Hi All

I have a table in oracle with only 4 columns...

Memberid --- bigint

uuid --- String

insertdate --- date

updatedate --- date

I want to import those data in HIVE table using sqoop. I create corresponding HIVE table with

create  EXTERNAL TABLE memberimport(memberid BIGINT,uuid varchar(36),insertdate timestamp,updatedate timestamp)LOCATION '/user/import/memberimport';

and sqoop command

sqoop import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username ** --password *** --hive-import --table MEMBER --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1   

Its working properly and able to import data in HIVE table.

Now I want to update this table with incremental update with updatedate (last value today's date) so that I can get day to day update for that OLTP table into my HIVE table using sqoop.

For Incremental import I am using following sqoop command

sqoop import --hive-import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username *** --password *** --table MEMBER --check-column UPDATEDATE --incremental append --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1

But I am getting exception

"Append mode for hive imports is not yet supported. Please remove the parameter --append-mode" and as per documentation in sqoop "--hive-import" does not support with "--incremental append"

When I remove the "--hive-import" it run properly and file in getting created in sqoop location in HDFS but I did not found those new update in HIVE table that I have in OLTP table.

When I run....

"select * from hive-table "

It is showing the old data only no incremental data.

Am I doing anything wrong ? Please suggest me how can I run incremental update with Oracle - Hive using sqoop.

Any help will be appropriated..

Thanks in Advance ...

1 ACCEPTED SOLUTION

avatar
Master Guru

One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.

View solution in original post

4 REPLIES 4

avatar

After the initial import, subsequent imports can leverage SQOOP’s native support for “Incremental Import” by using the “check-column”, “incremental” and “last-value” parameters.

http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/

avatar
Contributor

Thanks a lot .... You save my day .... 🙂

I missed the --target-dir.

avatar
Master Guru

Cool, but you accepted the wrong answer 🙂

avatar
Master Guru

One solution is to remove "--hive-import" for incremental imports (after the first one -- what you already tried) and also to set "--target-dir /user/import/memberimport". In this way you are adding new entries to the Hive table's directory and therefore the Hive table will be updated as expected.