09-26-2018 02:44 AM - last edited on 09-26-2018 05:41 AM by cjervis
In a Sqoop import, how does MapReduce work with key/value pairs when the source is an RDBMS table with structured data?
09-27-2018 02:20 AM
10-07-2018 08:48 PM
In an RDBMS the block size is 8 KB, and in Hadoop the block size is 64 MB. In my Sqoop import example the RDBMS table is 300 MB. So will it be split across 5 mappers? Please confirm.
10-22-2018 05:49 PM
I think the default block size is 128 MB (64 MB was the default in Hadoop 1.x). Either way, block size is not what determines the number of mappers in a Sqoop import.
The number of mappers depends on the --num-mappers parameter you pass to sqoop import (the default is 4). Sqoop splits on the table's primary key by default; if the table has no primary key, you also need to specify --split-by <column-name>. Sqoop finds the min and max values of the split column and divides that range by --num-mappers. It is best to split on the primary key or another high-cardinality column so that the work is balanced across your mappers.
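To make the min/max splitting concrete, here is a rough Python sketch of how an integer split column's range could be divided among mappers. This is only an illustration, not Sqoop's actual code: the function name `split_ranges` and the table/column names `orders` and `id` are made up, and Sqoop's real boundary arithmetic may differ slightly.

```python
def split_ranges(min_val, max_val, num_mappers):
    """Divide the inclusive range [min_val, max_val] into contiguous,
    near-equal sub-ranges, one per mapper (illustrative only)."""
    total = max_val - min_val + 1
    # Never create more ranges than there are distinct integer values.
    num = min(num_mappers, total)
    base, extra = divmod(total, num)
    ranges = []
    lo = min_val
    for i in range(num):
        # The first `extra` ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


if __name__ == "__main__":
    # Hypothetical table "orders" with split column "id" ranging 1..1000.
    for lo, hi in split_ranges(1, 1000, 4):
        print(f"SELECT * FROM orders WHERE id >= {lo} AND id <= {hi}")
```

Each mapper then reads only the rows in its own range. Notice that the table size (300 MB) and the HDFS block size do not appear anywhere in the calculation. It also shows why a skewed split column hurts: if most rows have ids in one range, that one mapper does most of the work.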