
Sqoop GC overhead limit exceeded after CDH5.2 update



Hi,

 

We updated Sqoop from CDH 5.0.1 to CDH 5.2, and it now fails every time with a "GC overhead limit exceeded" error.

The old version could import over 14 GB of data through a single mapper; now the import fails whenever a mapper receives too many rows. I checked a heap dump, and the heap (-Xmx1700m) was completely filled by more than 3.5 million rows of data.

The connector is the MySQL JDBC driver, version 5.1.33, and the job imports the data as a text file into a Hive table.

 

Can I avoid this with a setting, or is this a bug that should be filed in JIRA?

 

Thank you,

Jürgen

ACCEPTED SOLUTION

Re: Sqoop GC overhead limit exceeded after CDH5.2 update


This appears to be a regression caused by the fix for SQOOP-1400: instead of fetching results from MySQL row by row, Sqoop now attempts to load the entire result set into memory.

 

We worked around it by upgrading to MySQL Connector/J 5.1.33 (which you're already on) and then adding "--fetch-size -2147483648" to our Sqoop command-line options. This restores the old row-by-row behaviour; the odd fetch size is Integer.MIN_VALUE, a sentinel value recognised by the MySQL JDBC driver.
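
For reference, a full invocation with this workaround might look like the following sketch; the host, database, table, and target directory are placeholder values, not details from the original post:

sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username myuser \
    -P \
    --table my_table \
    --as-textfile \
    --target-dir /user/me/my_table \
    --fetch-size -2147483648

When Connector/J sees Integer.MIN_VALUE (-2147483648) as the fetch size on a forward-only, read-only statement, it streams rows from the server one at a time instead of buffering the complete result set on the client.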

REPLIES


Re: Sqoop GC overhead limit exceeded after CDH5.2 update


Thanks for the answer!

 

I also found the workaround after some time, but you posted it first. I'll open a JIRA ticket so it gets fixed in future versions.

Re: Sqoop GC overhead limit exceeded after CDH5.2 update


Use the ?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true properties in the MySQL connection string. This works without changing any JVM parameters.
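
For example (a sketch; host, database, and table names are placeholders), the properties go into the JDBC URL passed through --connect, quoted so the shell does not interpret the & characters:

sqoop import \
    --connect "jdbc:mysql://dbhost/mydb?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true" \
    --username myuser \
    -P \
    --table my_table

With useCursorFetch=true, Connector/J opens a server-side cursor and retrieves rows in batches of defaultFetchSize (1000 here), so a mapper never holds the entire result set in memory.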


Re: Sqoop GC overhead limit exceeded after CDH5.2 update


@Snd wrote:

Use the ?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true properties in the MySQL connection string. This works without changing any JVM parameters.


 

Thank you! It worked!

Re: Sqoop GC overhead limit exceeded after CDH5.2 update

Thanks a lot! It worked for me as well.