SQOOP HANA to HIVE ORC

Guru

I am attempting to use Sqoop to import a HANA table of size 180 TB (compressed; 800 TB on disk) into a Hive table. When I pass LIMIT in the query argument, the number of rows I get is 4 times the value passed as LIMIT: LIMIT 250 fetched 1,000 rows, and they are not duplicates.

Another issue I am facing is with fetch-size. When I pass the fetch size, the process errors out with the message "Search Limit exceeded".

1 ACCEPTED SOLUTION

Master Mentor
@Vedant Jain

Sqoop uses 4 mappers by default. Try running with the option -m 1, or any other number, to see if it makes a difference.
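To illustrate why 4 mappers turn LIMIT 250 into 1,000 distinct rows, here is a minimal sketch that simulates the behaviour with an in-memory SQLite table rather than HANA. It assumes each mapper runs the same LIMIT-ed query restricted to its own range of the split column; the table name `src`, the column `id`, and the function `simulate_import` are made up for the illustration and are not part of Sqoop.

```python
import sqlite3


def simulate_import(num_mappers, limit, total_rows=10_000):
    """Run one LIMIT-ed range query per simulated mapper; return all fetched ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO src (id) VALUES (?)", ((i,) for i in range(total_rows))
    )

    step = total_rows // num_mappers
    fetched = []
    for mapper in range(num_mappers):
        # Each mapper gets a disjoint [lo, hi) range of the split column.
        lo = mapper * step
        hi = total_rows if mapper == num_mappers - 1 else lo + step
        # The LIMIT applies to each mapper's query separately, not to the job.
        rows = conn.execute(
            "SELECT id FROM src WHERE id >= ? AND id < ? LIMIT ?",
            (lo, hi, limit),
        ).fetchall()
        fetched.extend(row[0] for row in rows)

    conn.close()
    return fetched


four_mappers = simulate_import(num_mappers=4, limit=250)
one_mapper = simulate_import(num_mappers=1, limit=250)
print(len(four_mappers), len(set(four_mappers)))  # 1000 1000
print(len(one_mapper))  # 250
```

Because the ranges do not overlap, the 4 x 250 rows are all different, which matches the "not duplicates" observation in the question.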

Quoting the following, as it makes sense here:

Using the "top x" or "limit x" clauses does not make much sense with Sqoop, as the query can return different rows on each execution (there is no "order by"). In addition, the clause will very likely confuse split generation, leading to output that is not easily deterministic. Having said that, I would recommend using only 1 mapper (-m 1 or --num-mappers 1) if you need to import a predefined number of rows.
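As a rough sketch of what a single-mapper import could look like, the snippet below assembles a `sqoop import` command line and prints it. The host, port, user, schema, table, and target directory are hypothetical placeholders, the helper `build_sqoop_import` is made up for this example, and credentials are left out; only the Sqoop flags themselves (`--connect`, `--driver`, `--username`, `--query`, `--target-dir`, `--num-mappers`) are taken from Sqoop's documented options. Sqoop expects the `$CONDITIONS` token in a free-form query.

```python
import shlex


def build_sqoop_import(query, target_dir, num_mappers=1):
    """Return the argument list for a Sqoop free-form query import."""
    return [
        "sqoop", "import",
        # Hypothetical HANA endpoint and user; replace with real values.
        "--connect", "jdbc:sap://hana-host.example.com:30015",
        "--driver", "com.sap.db.jdbc.Driver",
        "--username", "SQOOP_USER",
        "--query", query,
        "--target-dir", target_dir,
        # A single mapper means the LIMIT is applied exactly once.
        "--num-mappers", str(num_mappers),
    ]


args = build_sqoop_import(
    query="SELECT * FROM MY_SCHEMA.MY_TABLE WHERE $CONDITIONS LIMIT 250",
    target_dir="/tmp/hana_sample",
)
# shlex.join quotes the query so the shell does not expand $CONDITIONS.
print(shlex.join(args))
```

With one mapper there is no split generation, so the import should return the 250 rows requested, although which 250 rows come back is still not guaranteed without an ORDER BY.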

Guru

Yes, that solved the problem. Thanks!