Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Sqoop export to Oracle | How to change default fetch size value run time

Highlighted

Sqoop export to Oracle | How to change default fetch size value run time

New Contributor

Hi Experts, I am sqooping out a Hive table with 20-30 columns and apprx. 100 millions records to Oracle.

sqoop export -D mapred.job.queue.name=<yarn queue name> -Dsqoop.export.records.per.statement=100000 --connect jdbc:oracle:thin:<db server> --username xxxx --P --table employee --hcatalog-database default --hcatalog-table employee --num-mappers 10 --batch
Source Hive table is unpartitioned and has 200 part files, total 1.1 GB of data. And destination oracle table has identity defined on pk column which has cache of 100000.

It's taking approx 5-6 minutes to sqoop out data completely with default fetch size 1000. Any suggestion how can I improve more to get better performance. Is there anyway I can change fetch size?

Thanks in advance.

2 REPLIES 2

Re: Sqoop export to Oracle | How to change default fetch size value run time

Expert Contributor
@Aniruddha Ghosh

1. The table in question is it a text-based table to is it of a different format.

2. Have you tried increasing the number of mappers?

Also as you are trying to load the data into oracle, have you tried enabling --direct -> used for Data Connector for Oracle which does a fast import and export

If you have not tried these, I would suggest you to try with these options.

Re: Sqoop export to Oracle | How to change default fetch size value run time

New Contributor

Hi @Mahesh

Thanks for your reply...

1. Table is in orc format

2. Yes I tried increasing number of mappers. Too many mappers causing Oracle side locking, too many insert operation on the same block.

I tried using -Dsqoop.export.records.per.statement and -Dsqoop.export.statements.per.transaction with --batch and --direct but seems it doesn't help that much, --fetch-size didn't work with export

Thanks!