
Sqoop GC overhead limit exceeded after CDH5.2 update


Hi,

 

We updated Sqoop from CDH 5.0.1 to CDH 5.2, and now it fails every time with a "GC overhead limit exceeded" error.

The old version was able to import over 14 GB of data through a single mapper; now the import fails whenever a mapper receives too many rows. I checked a heap dump, and the memory was completely filled by over 3.5 million rows of data (-Xmx 1700M).

The connector is MySQL JDBC (Connector/J) version 5.1.33, and the job imports the data as a text file into a Hive table.

 

Can I avoid this with a setting, or is this a bug that should go to JIRA?

 

Thank you,

Jürgen

1 ACCEPTED SOLUTION

New Contributor

This appears to be a regression caused by the fix in SQOOP-1400. Instead of fetching results from MySQL row by row, Sqoop now attempts to load the entire result set into memory.

 

We worked around it by upgrading to MySQL Connector/J 5.1.33 (which you're already on) and then including "--fetch-size -2147483648" in our Sqoop command-line options. This restores the old row-by-row behaviour (the weird fetch size is a sentinel value recognised by the MySQL JDBC driver).
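
For reference, this is roughly what the full command looks like. The host, database, table, and user below are placeholders; the relevant part is the --fetch-size flag:

sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username myuser \
  --table mytable \
  --fetch-size -2147483648 \
  --hive-import \
  --as-textfile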


REPLIES



Thanks for the answer!

 

I also found the workaround after some time, but you were faster to post it. I'll open a JIRA for it so that it gets fixed in future versions.

New Contributor

Use the ?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true properties in the MySQL connection string. It works without changing any JVM parameters.
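
For anyone else trying this, here is a sketch of how those properties attach to the JDBC URL (host, database, table, and user are placeholders; note the quotes around the URL so the shell does not interpret the ampersands):

sqoop import \
  --connect "jdbc:mysql://dbhost/mydb?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true" \
  --username myuser \
  --table mytable

With useCursorFetch=true and a defaultFetchSize set, Connector/J fetches rows in batches through a server-side cursor instead of buffering the entire result set in memory.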

New Contributor

@Snd wrote:

Use the ?dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true properties in the MySQL connection string. It works without changing any JVM parameters.


 

Thank you! It worked!

New Contributor
Thanks a lot! It worked for me as well.