
Sqoop export to Oracle | How to change default fetch size value run time

New Contributor

Hi Experts, I am sqooping out a Hive table with 20-30 columns and approx. 100 million records to Oracle.

sqoop export -D<yarn queue name> -Dsqoop.export.records.per.statement=100000 --connect jdbc:oracle:thin:<db server> --username xxxx -P --table employee --hcatalog-database default --hcatalog-table employee --num-mappers 10 --batch
The source Hive table is unpartitioned and has 200 part files, totaling 1.1 GB of data. The destination Oracle table has an identity defined on the PK column, with a cache of 100,000.

It's taking approx. 5-6 minutes to sqoop out the data completely with the default fetch size of 1000. Any suggestions on how I can improve this to get better performance? Is there any way I can change the fetch size?

Thanks in advance.


Expert Contributor
@Aniruddha Ghosh

1. The table in question: is it a text-based table, or is it of a different format?

2. Have you tried increasing the number of mappers?

Also, as you are loading the data into Oracle, have you tried enabling --direct? It uses the Data Connector for Oracle, which does fast imports and exports.

If you have not tried these, I would suggest trying these options.
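For reference, a rough sketch of the export with these options added. This is illustrative only: the queue, connection, and credential placeholders are carried over from the original post, the HDFS path is hypothetical, and the mapper count is an arbitrary example. Note that in stock Sqoop the --hcatalog-* options are not supported together with --direct, so this sketch reads from an export directory instead; verify against your Sqoop version.

```shell
# Hedged sketch, not a tuned recommendation.
# --direct enables the Data Connector for Oracle and Hadoop for faster export.
# /path/to/employee is a hypothetical HDFS directory holding the table data,
# since --direct cannot be combined with the --hcatalog-* options.
sqoop export \
  -D<yarn queue name> \
  --connect jdbc:oracle:thin:<db server> \
  --username xxxx -P \
  --table EMPLOYEE \
  --export-dir /path/to/employee \
  --num-mappers 20 \
  --direct
```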

New Contributor

Hi @Mahesh

Thanks for your reply...

1. The table is in ORC format.

2. Yes, I tried increasing the number of mappers. Too many mappers cause locking on the Oracle side, with too many insert operations hitting the same block.

I tried using -Dsqoop.export.records.per.statement and -Dsqoop.export.statements.per.transaction with --batch and --direct, but they didn't seem to help much, and --fetch-size didn't work with export.
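For anyone landing here later: in Sqoop, --fetch-size is an import-side argument, which is why it has no effect on export. A sketch of how the export-side batching knobs combine is below; the numeric values are illustrative assumptions, not tuned recommendations, and the connection/credential placeholders are carried over from the original post.

```shell
# Illustrative only. With --batch, each prepared statement carries
# records.per.statement rows, and each transaction commits
# statements.per.transaction statements before a commit to Oracle.
sqoop export \
  -Dsqoop.export.records.per.statement=10000 \
  -Dsqoop.export.statements.per.transaction=10 \
  --connect jdbc:oracle:thin:<db server> \
  --username xxxx -P \
  --table employee \
  --hcatalog-database default \
  --hcatalog-table employee \
  --num-mappers 10 \
  --batch
```

Lowering records.per.statement from 100,000 while committing several statements per transaction may reduce contention on the identity column's cache without giving up batching entirely.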

