Is there a maximum volume of data that can be imported using SQOOP?

New Contributor

Hi,

I'm looking to migrate 15 terabytes of data into Hadoop and am considering FTP or Sqoop. Can anyone advise on the maximum volume that Sqoop can handle? I've been told that it's not normally used above 10 GB.

Thanks

Leigh

7 REPLIES

Re: Is there a maximum volume of data that can be imported using SQOOP?

@Leigh Perkins

The main question is "What is the source of data?"

If it's an RDBMS, then Sqoop is the right tool and the answer is yes: you can use Sqoop to load 15 TB of data.

If it's not an RDBMS, then you should look into NiFi or Flume, or, if you just want to load data into HDFS, WebHDFS.

Re: Is there a maximum volume of data that can be imported using SQOOP?

New Contributor

Thanks. The migration is from Oracle, so it sounds like Sqoop will work fine.

Re: Is there a maximum volume of data that can be imported using SQOOP?

@Leigh Perkins Yes. Now, the most critical step is to let your DBA know about this.

You're golden with the Sqoop and Oracle pairing in this use case :)

Re: Is there a maximum volume of data that can be imported using SQOOP?

@Leigh Perkins Also, make sure that your Hadoop cluster is sized accordingly, for both storage and memory.

http://www.slideshare.net/alxslva/effective-sqoop-best-practices-pitfalls-and-lessons-40370936
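
For the storage side, a rough back-of-the-envelope estimate might look like the sketch below. It assumes the HDFS default replication factor of 3; the 25% headroom for temporary and intermediate data is purely an illustrative assumption, and compression (which could lower the figure) is ignored. The function name is hypothetical.

```python
def raw_hdfs_capacity_tb(data_tb, replication=3, headroom=0.25):
    """Estimate the raw disk capacity (in TB) needed to hold `data_tb` of data.

    replication: HDFS block replication factor (3 is the HDFS default).
    headroom:    extra fraction reserved for temp/intermediate data
                 (0.25 is an assumed value, not a recommendation).
    """
    return data_tb * replication * (1 + headroom)


if __name__ == "__main__":
    # 15 TB of source data, as in the question above.
    print(raw_hdfs_capacity_tb(15))
```

With these assumptions, 15 TB of source data needs about 45 TB of raw disk for replication alone, before any headroom.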

Re: Is there a maximum volume of data that can be imported using SQOOP?

Mentor
@Leigh Perkins

The limitation is on the database side, not on Sqoop.

Re: Is there a maximum volume of data that can be imported using SQOOP?

New Contributor

Perfect, thanks.

Re: Is there a maximum volume of data that can be imported using SQOOP?

Mentor

@Leigh Perkins Make sure to limit your batches; otherwise you will kill your DB.
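
One possible way to do that is to split the import into several smaller `sqoop import` runs, each restricted by a `--where` range and capped with `--num-mappers` so that only a few connections hit Oracle at a time. The sketch below only builds the command lines; it assumes the table has a numeric key column to range over, and the JDBC URL, user, password file, table, column, and paths are all hypothetical placeholders. The mapper count and fetch size shown are illustrative, so agree on real values with your DBA.

```python
import shlex


def sqoop_import_batches(jdbc_url, username, password_file, table,
                         split_column, lower, upper, batch_size,
                         target_root, num_mappers=4, fetch_size=1000):
    """Build one `sqoop import` argument list per key range [start, end)."""
    commands = []
    start = lower
    while start < upper:
        end = min(start + batch_size, upper)
        commands.append([
            "sqoop", "import",
            "--connect", jdbc_url,
            "--username", username,
            "--password-file", password_file,
            "--table", table,
            # Restrict this run to one slice of the table.
            "--where", f"{split_column} >= {start} AND {split_column} < {end}",
            "--split-by", split_column,
            # Each mapper opens its own database connection.
            "--num-mappers", str(num_mappers),
            "--fetch-size", str(fetch_size),
            # Separate directory per batch so runs do not collide.
            "--target-dir", f"{target_root}/{table}/{start}_{end}",
        ])
        start = end
    return commands


if __name__ == "__main__":
    # All values below are made-up examples.
    batches = sqoop_import_batches(
        "jdbc:oracle:thin:@//dbhost:1521/ORCL", "etl_user",
        "/user/etl_user/.oracle_pw", "SALES", "SALE_ID",
        lower=0, upper=3_000_000, batch_size=1_000_000,
        target_root="/data/raw",
    )
    for cmd in batches:
        print(shlex.join(cmd))
```

Running the batches one after another (rather than all at once) keeps the load on the database bounded by `--num-mappers` at any given moment.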