SQOOP EXPORT error - Larger than maximum split size error
Labels: Apache Sqoop
Created 11-15-2018 08:12 PM
I am getting the error below while doing a Sqoop export:
java.io.IOException: Minimum split size pernode 536870912 cannot be larger than maximum split size 41
    at org.apache.sqoop.mapreduce.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:200)
    at org.apache.sqoop.mapreduce.ExportInputFormat.getSplits(ExportInputFormat.java:73)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:597)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:614)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1306)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1303)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1303)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1324)
    at org.apache.sqoop.mapreduce.ExportJobBase.doSubmitJob(ExportJobBase.java:324)
    at org.apache.sqoop.mapreduce.ExportJobBase.runJob(ExportJobBase.java:301)
    at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:442)
    at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931)
    at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
    at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
I removed most of the fields and renamed the table due to project compliance. Below is the table I used for the Sqoop export:
create table sup_api_bidder_test (
    id int,
    name string,
    vendor_id bigint
)
Created 11-16-2018 12:00 AM
You want to check the MapReduce minimum and maximum split sizes. From the message, the configured minimum split size per node (536870912 bytes, i.e. 512 MB) is larger than the maximum split size the job computed (41), and CombineFileInputFormat rejects that combination.
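To see what those values are on a given cluster, here is a minimal sketch (my addition, not from the original thread; it assumes the standard MapReduce property names that appear in the stack trace) using hdfs getconf to read the effective configuration:

# Print the configured split-size limits for CombineFileInputFormat.
# If a property is not set anywhere, getconf reports it as missing,
# meaning the job falls back to the framework default.
hdfs getconf -confKey mapreduce.input.fileinputformat.split.minsize.per.node
hdfs getconf -confKey mapreduce.input.fileinputformat.split.minsize.per.rack
hdfs getconf -confKey mapreduce.input.fileinputformat.split.maxsize

Any per-node or per-rack minimum reported here has to stay at or below the maximum split size the job ends up using, otherwise you get exactly the IOException above.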
Created 11-18-2018 10:56 PM
After trying multiple options, I fixed the issue by passing the options below to the Sqoop export: I lowered the per-node and per-rack minimum split sizes, and the job then ran successfully.
sqoop export \
    -Dmapreduce.input.fileinputformat.split.minsize.per.rack=749983 \
    -Dmapreduce.input.fileinputformat.split.minsize.per.node=749983 \
    --connect jdbc:mysql://01-mysql-test232855.envnxs.net:3306/retail_export \
    --username autoenv_root \
    --export-dir /user/hive/warehouse/retail_db.db/orders \
    --table orders \
    -P
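For context (my note, not part of the original reply): Sqoop's ExportInputFormat appears to derive the maximum split size from the input size and the number of mappers, so a small export directory can yield a maximum well below the cluster's configured per-node/per-rack minimums; lowering those minimums, as above, removes the conflict. As an optional sanity check that reuses the same connection details from the command above, sqoop eval can count the rows that landed in MySQL:

# Count rows in the target table after the export completes.
sqoop eval \
    --connect jdbc:mysql://01-mysql-test232855.envnxs.net:3306/retail_export \
    --username autoenv_root \
    -P \
    --query "SELECT COUNT(*) FROM orders"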