Created 02-12-2017 12:38 PM
I am getting a Java heap space error in the reducer phase. My application uses 41 reducers and a custom Partitioner class. Below is my reducer code, which throws the error shown further down.
Here is my reducer code..
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.log4j.Logger;

public class MyReducer extends Reducer<NullWritable, Text, NullWritable, Text> {

    private Logger logger = Logger.getLogger(MyReducer.class);
    StringBuilder sb = new StringBuilder();
    private MultipleOutputs<NullWritable, Text> multipleOutputs;

    public void setup(Context context) {
        logger.info("Inside Reducer.");
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    public void reduce(NullWritable Key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            final String valueStr = value.toString();
            if (valueStr.contains("Japan")) {
                sb.append(valueStr.substring(0, valueStr.length() - 20));
            } else if (valueStr.contains("SelfSourcedPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("SelfSourcedPublic")) {
                sb.append(valueStr.substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("ThirdPartyPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 25));
            }
        }
        multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), "MyFileName");
    }

    public void cleanup(Context context) throws IOException, InterruptedException {
        multipleOutputs.close();
    }
}
17/02/12 06:21:45 INFO mapreduce.Job: map 78% reduce 0%
17/02/12 06:21:46 INFO mapreduce.Job: map 82% reduce 0%
17/02/12 06:21:47 INFO mapreduce.Job: map 85% reduce 0%
17/02/12 06:21:48 INFO mapreduce.Job: map 87% reduce 0%
17/02/12 06:21:49 INFO mapreduce.Job: map 88% reduce 0%
17/02/12 06:21:50 INFO mapreduce.Job: map 93% reduce 0%
17/02/12 06:21:51 INFO mapreduce.Job: map 94% reduce 0%
17/02/12 06:21:53 INFO mapreduce.Job: map 95% reduce 0%
17/02/12 06:21:58 INFO mapreduce.Job: map 96% reduce 0%
17/02/12 06:21:59 INFO mapreduce.Job: map 97% reduce 0%
17/02/12 06:22:02 INFO mapreduce.Job: map 98% reduce 0%
17/02/12 06:23:46 INFO mapreduce.Job: map 99% reduce 0%
17/02/12 06:23:50 INFO mapreduce.Job: map 100% reduce 0%
17/02/12 06:23:54 INFO mapreduce.Job: map 100% reduce 12%
17/02/12 06:23:55 INFO mapreduce.Job: map 100% reduce 32%
17/02/12 06:23:56 INFO mapreduce.Job: map 100% reduce 46%
17/02/12 06:23:57 INFO mapreduce.Job: map 100% reduce 51%
17/02/12 06:23:58 INFO mapreduce.Job: map 100% reduce 54%
17/02/12 06:23:59 INFO mapreduce.Job: map 100% reduce 59%
17/02/12 06:24:00 INFO mapreduce.Job: map 100% reduce 88%
17/02/12 06:24:01 INFO mapreduce.Job: map 100% reduce 90%
17/02/12 06:24:03 INFO mapreduce.Job: map 100% reduce 93%
17/02/12 06:24:06 INFO mapreduce.Job: map 100% reduce 95%
17/02/12 06:24:06 INFO mapreduce.Job: Task Id : attempt_1486663266028_2715_r_000020_0, Status : FAILED
Error: Java heap space
17/02/12 06:24:07 INFO mapreduce.Job: map 100% reduce 93%
17/02/12 06:24:10 INFO mapreduce.Job: Task Id : attempt_1486663266028_2715_r_000021_0, Status : FAILED
Error: Java heap space
17/02/12 06:24:11 INFO mapreduce.Job: map 100% reduce 91%
17/02/12 06:24:11 INFO mapreduce.Job: Task Id : attempt_1486663266028_2715_r_000027_0, Status : FAILED
Error: Java heap space
17/02/12 06:24:12 INFO mapreduce.Job: map 100% reduce 90%
Created 02-25-2017 03:18 AM
Finally I managed to resolve it.
I just used multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), strName);
inside the for loop, and that solved my problem. I have tested it with a very large data set (a 19 GB file) and it worked fine for me. This is my final solution. Initially I thought it might create many objects, but it is working fine for me, and the MapReduce job also completes very quickly.
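The fix above can be sketched in plain Java, outside Hadoop, to show why it bounds memory: each trimmed value is written as soon as it is produced, instead of being appended to one ever-growing StringBuilder and written once after the loop. The Writer interface below is a hypothetical stand-in for multipleOutputs.write(), and the "Japan" suffix length mirrors the original code; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerValueWrite {

    // Hypothetical stand-in for multipleOutputs.write(...).
    interface Writer {
        void write(String line);
    }

    // Mirrors the fixed reduce() loop: trim each value, then emit it
    // immediately. Nothing accumulates across iterations, so heap usage
    // stays bounded by the size of a single record.
    static void reduce(Iterable<String> values, Writer out) {
        StringBuilder sb = new StringBuilder();
        for (String valueStr : values) {
            sb.setLength(0); // reuse the builder, dropping the last record
            if (valueStr.contains("Japan")) {
                sb.append(valueStr.substring(0, valueStr.length() - 20));
            } else {
                sb.append(valueStr);
            }
            out.write(sb.toString()); // write per value, as in the fix
        }
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>();
        // 26-char value containing "Japan": trimming 20 keeps "KEEPME".
        reduce(Arrays.asList("KEEPMEJapanXXXXXXXXXXXXXXX", "no-match"),
               collected::add);
        System.out.println(collected); // [KEEPME, no-match]
    }
}
```

In the real reducer the same move is simply placing multipleOutputs.write(...) inside the for loop; note this emits one output record per input value rather than one concatenated record per key.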
Created 02-12-2017 03:28 PM
Do you have a combiner? Can you try adding a combiner and see if that helps?
Created 02-12-2017 05:52 PM
You should avoid creating a StringBuilder in every reducer call, especially with 80000 as the initial-capacity argument. If you do need one, move it into the setup() method and initialize it there; that forces object reuse. As written, you create one StringBuilder per reduce() call with an initial capacity of 80000. How large do you expect each row to be? Surely not 80000 characters? My suggestion, again, is to move the StringBuilder out of reduce() and let Java figure out the capacity instead of passing such a large size; it will resize as needed.
StringBuilder sb = new StringBuilder();
You're also calling the toString() method far too often. I recommend, at the start of the for loop:
final String valueStr = value.toString();
then referencing valueStr instead of calling toString() repeatedly.
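Both suggestions can be shown in a minimal, Hadoop-free sketch: keep one StringBuilder as a field (allocated once, cleared with setLength(0) on each call) and convert each value to a String exactly once. The class and method names are illustrative, not from the original code.

```java
import java.util.Arrays;

public class BuilderReuse {

    // One builder per task, not one per reduce() call; Java grows its
    // capacity as needed, so no oversized initial capacity is required.
    private final StringBuilder sb = new StringBuilder();

    String concatenate(Iterable<?> values) {
        sb.setLength(0); // clear whatever the previous call left behind
        for (Object value : values) {
            final String valueStr = value.toString(); // convert once, reuse below
            sb.append(valueStr);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BuilderReuse r = new BuilderReuse();
        System.out.println(r.concatenate(Arrays.asList("a", "b"))); // ab
        System.out.println(r.concatenate(Arrays.asList("c")));      // c, not abc
    }
}
```

setLength(0) keeps the builder's internal array, so repeated calls reuse the same allocation instead of churning new objects.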
Created 02-12-2017 06:01 PM
How can I change that? If I define the StringBuilder in the setup() method, will it be accessible in the reduce() method? Do I have to pass it as a parameter?
Created 02-12-2017 06:06 PM
@sudarshan kumar you can also just move it out of the method but leave it in the class
Created 02-13-2017 05:24 AM
Created 02-13-2017 11:47 AM
Is it still giving you the problem? It looks fine to me.
Created 02-13-2017 03:11 PM
@sudarshan kumar can you paste sample data, I'll try to produce code to help you.