Created on 03-18-2017 03:53 AM - edited 08-18-2019 04:34 AM
Importing from MySQL into an HDFS directory:
sqoop import --connect jdbc:mysql://localhost/hadoopdb --username smas --password MyNewPass --table emp1 -m 1 --target-dir /data_new7 --incremental append --check-column id --last-value 2
I already have /date_new7/part-m-00000 from the first import, but the append did not seem to work. How can I make sure that part-m-00000 is updated with the 3rd row (id = 3)? The appended data seems to end up as if it were a separate table. Any suggestions?
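For reference, this is how I am checking the directory contents (target dir taken from the sqoop command above):

hdfs dfs -ls /data_new7
hdfs dfs -cat /data_new7/part-m-*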
Created 03-18-2017 08:22 AM
It worked. part-m-00001 is not a separate table, it's just another file in your import directory. If you create an external table on /date_new7, Hive will see a single table with 3 rows. The same goes for MapReduce jobs that take /date_new7 as their input. If you end up with many small files, you can merge them into one from time to time, for example with hadoop-streaming: see this example and set "mapreduce.job.reduces=1".
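As an illustration, here is a minimal sketch of such an external table, wrapped in a hive -e call; the column list (id, name) and the comma delimiter are assumptions based on Sqoop's text-file defaults, so adjust them to the real schema of emp1:

# hypothetical schema -- replace (id INT, name STRING) with emp1's real columns
hive -e "CREATE EXTERNAL TABLE emp1_ext (id INT, name STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         LOCATION '/date_new7';"

And a sketch of the hadoop-streaming merge with a single reducer; the jar path is distribution-specific (this one assumes an HDP-style layout), and using cat as both mapper and reducer simply concatenates all part files into one output file (note that the shuffle sorts the lines, so row order can change):

hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar \
    -D mapreduce.job.reduces=1 \
    -input /date_new7 \
    -output /date_new7_merged \
    -mapper cat \
    -reducer cat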