Hi,
I run a Sqoop import to fetch data from a table in SQL Server. The import uses a query that runs every 6 minutes and fetches the data from 2 hours ago until now. The weird thing is that Sqoop doesn't always fetch all the data; how many rows it fetches seems somewhat random. For example, during this 2-hour scanning I fetched a record 28 times: 19 times all the rows were aligned with SQL Server, but 9 times only about half of them were there. My sqoop command is below.
sqoop import \
--connect 'jdbc:sqlserver\
--username \
--password-alias \
--num-mappers 10 \
--split-by an_int \
--fields-terminated-by '|' \
--query "select * from table where timestamp > '${offset}' and \$CONDITIONS" \
--delete-target-dir \
--target-dir
The amount of data for the 2-hour window is ~800k.
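
The script that sets `${offset}` is not shown here. For anyone trying to reproduce this, a minimal sketch of how such a 2-hour lower bound might be computed follows. It assumes GNU `date` (the `-d` option) and a `YYYY-MM-DD HH:MM:SS` literal; the `window_hours` variable and the format are illustrative assumptions, not taken from the real job.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the lower bound of the 2-hour window.
# Assumes GNU date (-d option); the timestamp format is an assumption
# and may need adjusting for the actual SQL Server column type.
window_hours=2
offset="$(date -d "${window_hours} hours ago" '+%Y-%m-%d %H:%M:%S')"

# Print the value so it can be compared with what the query actually received.
echo "offset=${offset}"
```

Logging the offset on every run makes it easier to re-run the same query directly against SQL Server and compare the row count with what Sqoop wrote.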