Created 10-27-2015 08:55 AM
What is the purpose of the following two configuration parameters in mapred-site.xml? What are the recommended values?
mapreduce.input.fileinputformat.split.minsize
mapreduce.input.fileinputformat.split.maxsize
Thanks 🙂
Created 10-27-2015 12:41 PM
I found this really useful
Also, from Apache doc
Deprecated property name: mapred.min.split.size
New property name: mapreduce.input.fileinputformat.split.minsize
Created 10-28-2015 11:16 AM
Thanks @Neeraj
I also found these two books:
Both basically say that mapreduce.input.fileinputformat.split.minsize < dfs.blocksize < ...maxsize
SmartSense recommended: 105MB (minsize) and 270MB (maxsize)
Our current block size is 64MB, although SmartSense recommended a 128MB block size. With 128MB the block size sits between the recommended min and max (105MB < 128MB < 270MB), which matches the descriptions from the books.
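To see how the three values interact: as far as I know, the new-API FileInputFormat picks the split size as max(minsize, min(maxsize, blocksize)). Here is a minimal Python sketch of that formula using the numbers above. The helper name compute_split_size is just mine for illustration, not a Hadoop API, and the script only mimics the arithmetic.

```python
MB = 1024 * 1024

def compute_split_size(block_size, min_size, max_size):
    # Hypothetical helper mirroring the formula
    # max(minSize, min(maxSize, blockSize)); all sizes are in bytes.
    return max(min_size, min(max_size, block_size))

# Values recommended by SmartSense in this thread.
min_size = 105 * MB
max_size = 270 * MB

# Compare the current 64MB block size with the recommended 128MB.
for block_mb in (64, 128):
    split = compute_split_size(block_mb * MB, min_size, max_size)
    print(f"block={block_mb}MB -> split={split // MB}MB")
```

If that formula holds, then with our current 64MB blocks the minsize wins and splits come out at 105MB (spanning more than one block), while with 128MB blocks the split size simply equals the block size.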
Created 10-28-2015 11:28 AM
Thanks for sharing 🙂 @Jonas Straub