My program runs fine for smaller inputs, but when I increase the input size, line 210 (context.nextKeyValue();) throws an IndexOutOfBoundsException. Below is the setup method of my mapper. I call nextKeyValue() there once because the first line of each file is a header. Splitting of the files is disabled because of those headers. Could this be memory related? How can I solve it?
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        DupleSplit fileSplit = (DupleSplit) context.getInputSplit();

        // First line is a header. It indicates the first digit of the solution.
        context.nextKeyValue();    // <---- LINE 210

        URI[] uris = context.getCacheFiles();
        int num_of_colors = Integer.parseInt(conf.get("num_of_colors"));
        int order = fileSplit.get_order();
        int first_digit = Integer.parseInt(context.getCurrentValue().toString());

        //perm_path = conf.get(Integer.toString(num_of_colors - order - 1));
        int offset = Integer.parseInt(conf.get(Integer.toString(num_of_colors - order - 1)));
        uri = uris[offset];
        Path perm_path = new Path(uri.getPath());
        perm_name = perm_path.getName().toString();

        String pair_variables = "";
        for (int i = 1; i <= num_of_colors; i++)
            pair_variables += "X_" + i + "_" + (num_of_colors - order) + "\t";
        for (int i = 1; i < num_of_colors; i++)
            pair_variables += "X_" + i + "_" + (num_of_colors - order - first_digit) + "\t";
        pair_variables += "X_" + num_of_colors + "_" + (num_of_colors - order - first_digit);

        context.write(new Text(pair_variables), null);
    }
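In case it's relevant: splitting is disabled by overriding isSplitable() in the input format. My actual input format is custom (it produces the DupleSplit instances cast above), but as far as split suppression goes it is equivalent to this minimal sketch (the class name here is only illustrative):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Illustrative sketch: forces each input file into exactly one split,
    // so the header line is always the first record the mapper's setup()
    // consumes via nextKeyValue().
    public class NonSplittableTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            return false; // never split a file across mappers
        }
    }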
Here's the error log:
Error: java.lang.IndexOutOfBoundsException
	at java.nio.Buffer.checkBounds(Buffer.java:559)
	at java.nio.ByteBuffer.get(ByteBuffer.java:668)
	at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:279)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:168)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:59)
	at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
	at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:144)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:184)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at produce_data_hdfs$input_mapper.setup(produce_data_hdfs.java:210)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143