Created on 03-18-2017 03:53 AM - edited 08-18-2019 04:34 AM
Importing from MySQL into an HDFS directory:
sqoop import --connect jdbc:mysql://localhost/hadoopdb --username smas --password MyNewPass --table emp1 -m 1 --target-dir /data_new7 --incremental append --check-column id --last-value 2
I already have /date_new7/part-m-00000 from the first import, but the append did not seem to work. How can I make sure that part-m-00000 is updated with the 3rd row (id = 3)? The appended data seems to end up as if it were a separate table. Any suggestions?
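For reference, this is how I am checking the directory contents (target dir taken from the sqoop command above):

hdfs dfs -ls /data_new7
hdfs dfs -cat /data_new7/part-m-*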
Created 03-18-2017 08:22 AM
It worked. part-m-00001 is not a separate table, it's just another file in your import directory. If you create an external table on /date_new7, Hive will see a single table with 3 rows. The same goes for MapReduce jobs that take /date_new7 as their input. If you end up with many small files, you can merge them into one from time to time, for example with hadoop-streaming: see this example and set "mapreduce.job.reduces=1".
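As an illustration, here is a minimal sketch of such an external table, wrapped in a hive -e call; the column list (id, name) and the comma delimiter are assumptions based on Sqoop's text-file defaults, so adjust them to the real schema of emp1:

# hypothetical schema -- replace (id INT, name STRING) with emp1's real columns
hive -e "CREATE EXTERNAL TABLE emp1_ext (id INT, name STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         LOCATION '/date_new7';"

And a sketch of the hadoop-streaming merge with a single reducer; the jar path is distribution-specific (this one assumes an HDP-style layout), and using cat as both mapper and reducer simply concatenates all part files into one output file (note that the shuffle sorts the lines, so row order can change):

hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar \
    -D mapreduce.job.reduces=1 \
    -input /date_new7 \
    -output /date_new7_merged \
    -mapper cat \
    -reducer cat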