Created 12-03-2015 08:58 PM
I am attempting to use Sqoop to import a HANA table of 180 TB (compressed; 800 TB on disk) into a Hive table. When I pass a LIMIT in the query argument, the number of rows I get is 4 times the value passed as the LIMIT: LIMIT 250 fetched 1000 rows, and they are not duplicates.
Another issue I am facing is with fetch-size. When I pass the fetch size, the process errors out with the message "Search Limit exceeded".
Created 12-04-2015 01:03 AM
Sqoop uses 4 mappers by default. Try running with the option -m 1, or any other number, to see whether it makes a difference.
Copying the following lines from this, as they make sense:
Using the "top x" or "limit x" clauses does not make much sense with Sqoop, as the query can return different values on each execution (there is no "order by"). In addition, the clause will very likely confuse split generation, leading to output that is not easily deterministic. Having said that, I would recommend using only 1 mapper (-m 1 or --num-mappers 1) if you need to import a predefined number of rows.
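To illustrate why the row count scales with the number of mappers, here is a rough simulation in Python. It is only a sketch, not Sqoop itself: it assumes each mapper runs the same query with its own split range substituted into the WHERE clause, so the LIMIT applies per mapper rather than to the whole import. The table name `src`, the column `id`, and the function `simulate_import` are made up for the example, and an in-memory SQLite database stands in for HANA.

```python
import sqlite3


def simulate_import(num_mappers, limit, total_rows=10_000):
    """Run one LIMIT query per simulated mapper and collect the fetched ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO src (id) VALUES (?)", ((i,) for i in range(total_rows))
    )

    # Hypothetical free-form query: the split condition is filled in per mapper.
    query = "SELECT id FROM src WHERE {conditions} LIMIT {limit}"

    step = total_rows // num_mappers
    fetched = []
    for mapper in range(num_mappers):
        lo = mapper * step
        # The last mapper takes whatever remains of the id range.
        hi = total_rows if mapper == num_mappers - 1 else lo + step
        conditions = f"id >= {lo} AND id < {hi}"
        rows = conn.execute(
            query.format(conditions=conditions, limit=limit)
        ).fetchall()
        fetched.extend(row[0] for row in rows)

    conn.close()
    return fetched


four_mappers = simulate_import(num_mappers=4, limit=250)
one_mapper = simulate_import(num_mappers=1, limit=250)

print(len(four_mappers), len(set(four_mappers)))  # 1000 1000
print(len(one_mapper))                            # 250
```

With four simulated mappers, each range contributes its own 250 rows, which gives 1000 distinct rows and matches the behaviour described in the question. With a single mapper the LIMIT is applied once, so only 250 rows come back.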
Created 12-04-2015 04:40 PM
Yes, that solved the problem. Thanks!