
sqoop incremental import in hive i get error message hive not support append mode how to solve that


Contributor

sqoop incremental import

I got the error below. How can I solve it? I am using Hortonworks HDP 2.3.

Append mode for hive imports is not yet supported. Please remove the parameter --append-mode

1 ACCEPTED SOLUTION


Re: sqoop incremental import in hive i get error message hive not support append mode how to solve that

Hi @sivasaravanakumar k, yes, you are right, Sqoop does indeed say that "Append mode for hive imports is not yet supported". However, you can achieve the same result with an incremental import to HDFS, mapping your Hive table to Sqoop's target-dir. A full example is attached; here are the highlights:

  • Define your Hive table as an external table:
CREATE EXTERNAL TABLE h2 (id int, name STRING, ts TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE location '/user/it1/sqin5';
  • Initially it's empty. Do the first Sqoop import, bringing in 5000 rows from the MySQL table st1 (ids 1-5000), with target-dir set to the location of our external table, /user/it1/sqin5:
sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --target-dir /user/it1/sqin5 -m 1 --incremental append --check-column id
16/01/25 13:36:07 INFO tool.ImportTool: Upper bound value: 5000
16/01/25 13:36:27 INFO mapreduce.ImportJobBase: Retrieved 5000 records.
  • If you check now in Hive, table h2 has 5000 rows. Now append 900 rows to the MySQL table st1 (5100 <= id < 6000) and do an incremental append import, setting last-value to 5000:
sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --target-dir /user/it1/sqin5 -m 1 --incremental append --check-column id --last-value 5000
16/01/25 13:38:23 INFO tool.ImportTool: Lower bound value: 5000
16/01/25 13:38:23 INFO tool.ImportTool: Upper bound value: 5999
16/01/25 13:38:47 INFO mapreduce.ImportJobBase: Retrieved 900 records.
  • If you check Hive now, table h2 has 5900 rows:
hive> select count(*) from h2;
5900
  • In the same way you can also handle Sqoop incremental imports into Hive based on "lastmodified" and a "merge-key".
  • You can also create a Sqoop job, as in your other question, and let Sqoop keep track of last-value for you.
  • You can import into regular (non-external) Hive tables by setting Sqoop's target-dir to /apps/hive/warehouse/<table-name>. [That's what Sqoop does when you use "--hive-import".]
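To expand on the saved-job point above, here is a sketch of what that could look like. The job name incr_st1 is a placeholder, and the connection details are just the ones from the example; with a saved job, Sqoop stores the last imported id in its metastore and updates it after each successful run, so you never pass --last-value by hand.

```shell
# Create a saved incremental-append job. Sqoop records the high-water mark
# (last-value of the check column) in its metastore after every run.
sqoop job --create incr_st1 -- import \
  --connect jdbc:mysql://localhost:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username it1 --password hadoop \
  --table st1 \
  --target-dir /user/it1/sqin5 \
  -m 1 \
  --incremental append \
  --check-column id

# Each execution imports only rows with id greater than the stored last-value.
sqoop job --exec incr_st1
```

Since the target-dir is the external table's location, every run of the job lands new rows directly under the Hive table.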


Re: sqoop incremental import in hive i get error message hive not support append mode how to solve that

@sivasaravanakumar k

This is covered in the Sqoop docs. For a plain HDFS import you can append new rows with --append:

  • sqoop import --connect jdbc:mysql://db.foo.com/bar --table EMPLOYEES --where "id > 100000" --target-dir /incremental_dataset --append

Re: sqoop incremental import in hive i get error message hive not support append mode how to solve that

Contributor

I got this warning message:

Append mode for hive imports is not yet supported. Please remove the parameter --append-mode



Re: sqoop incremental import in hive i get error message hive not support append mode how to solve that

Mentor

@sivasaravanakumar k, are you still having issues with this? Can you accept the best answer or share your workaround?


Re: sqoop incremental import in hive i get error message hive not support append mode how to solve that

@sivasaravanakumar k Another way to do incremental imports is to load each increment into a separate partition of the Hive table.

First, create an external partitioned table:

CREATE EXTERNAL TABLE h2 (id int, name STRING, ts TIMESTAMP) PARTITIONED BY (pt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE location '/user/it1/sqin5';

Then sqoop the data into an external partition by pointing target-dir at the partition's HDFS location:

sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --where "id <= 5000" --target-dir /user/it1/sqin5/pt=0 -m 1

Add the partition to the Hive table:

alter table h2 add PARTITION(pt=0) LOCATION '/user/it1/sqin5/pt=0';

Verify the table count; it should return 5000 rows. Now run an incremental Sqoop import with an appropriate where clause:

sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --where "id > 5000 and id <= 10000" --target-dir /user/it1/sqin5/pt=1 -m 1

Add the second partition to the Hive table:

alter table h2 add PARTITION(pt=1) LOCATION '/user/it1/sqin5/pt=1';

Verify the table count; it should now return 10,000 rows.
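The per-increment steps above can be sketched as a small wrapper script. The partition number and id bounds would come from your own scheduling or bookkeeping; the values below are just the example's second increment, and the paths and credentials are the example's placeholders:

```shell
#!/bin/sh
# Import one increment into its own partition directory, then register
# that directory as a Hive partition. PT/LO/HI are supplied per run.
PT=1
LO=5000
HI=10000

sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username it1 --password hadoop \
  --table st1 \
  --where "id > ${LO} and id <= ${HI}" \
  --target-dir /user/it1/sqin5/pt=${PT} \
  -m 1

# Register the new partition with Hive
hive -e "alter table h2 add PARTITION(pt=${PT}) LOCATION '/user/it1/sqin5/pt=${PT}';"
```

Because the directories follow Hive's pt=<value> naming convention under the table's location, you can also run MSCK REPAIR TABLE h2; in Hive to discover new partitions in bulk instead of adding them one at a time.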
