sqoop import \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username root \
--password cloudera \
--table mytest \
--boundary-query 'select min(id), max(id) from mytest' \ --warehouse-dir /user/cloudera/sqoop_import \ --m 1.
My table mytest has only one column, "id", and holds 10 records with ids 1 through 10. But after executing the script I get zero records in my output HDFS file. I also receive the error below:
ERROR tool.ImportTool: Error during import: Error validating row counts
Please help.
I guess that, as a result, Sqoop splits this one column into two different columns, hence the exception you are getting. I would suggest cleaning up your data and trying the --enclosed-by or --escaped-by parameters to work around this issue.
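As a rough illustration of the delimiter parameters mentioned above, here is a sketch that reuses the connection details from the question. The comma, double-quote, and backslash characters are assumptions chosen only for the example, and CMD is just a helper variable for printing the command before it runs.

```shell
# Sketch only: the delimiter characters below are assumed, not taken from the question.
set -- import \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username root \
  --password cloudera \
  --table mytest \
  --warehouse-dir /user/cloudera/sqoop_import \
  --fields-terminated-by ',' \
  --enclosed-by '"' \
  --escaped-by \\ \
  -m 1

# Print the command for review, then run it only if sqoop is on the PATH.
CMD="sqoop $*"
printf '%s\n' "$CMD"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```

With --enclosed-by, every field is wrapped in the given character, so a delimiter that appears inside a value should no longer be read as a column break.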
The error message just tells you that not all the records you requested arrived at the destination. Does the Sqoop command run correctly if you remove the --boundary-query option? I don't see any benefit to --boundary-query at the moment, since you are using only one mapper. If the command then gives the correct result, you can test --boundary-query further after that.
I assume that in the last line of your sqoop command you just dropped the newlines when posting, not in the actual statement. There is also an error in the statement: "--m" is not a valid argument for the sqoop command. You can use either "-m" (single dash) or "--num-mappers". Maybe that's what is causing the error you are getting. Also, as Frank mentioned, if you provide boundaries for the split, allocate at least two mappers; otherwise, leave out the "boundary-query" and "split-by" arguments.
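Putting these suggestions together, the corrected command might look like the sketch below. It assumes the same host, credentials, and table as in the question, puts each option on its own continuation line, replaces "--m" with "--num-mappers", and leaves out --boundary-query because a single mapper does no splitting. IMPORT_CMD is only a helper variable for printing the command before it runs.

```shell
# Sketch of the corrected single-mapper import (no --boundary-query).
set -- import \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username root \
  --password cloudera \
  --table mytest \
  --warehouse-dir /user/cloudera/sqoop_import \
  --num-mappers 1

# Print the command for review, then run it only if sqoop is on the PATH.
IMPORT_CMD="sqoop $*"
printf '%s\n' "$IMPORT_CMD"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```

To exercise --boundary-query afterwards, one option would be to add it back together with --split-by id and raise --num-mappers to 2, so that there is actually a range to split.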