
Hadoop map-reduce indexoutofbounds

New Contributor

My program was running fine for smaller inputs, but when I increase the size of the input, line 210 (context.nextKeyValue();) throws an IndexOutOfBoundsException. Below is the setup method of the mapper. I call nextKeyValue() there once because the first line of each file is a header. Splitting files is set to false because of the headers. Does this have to do with memory? How can I solve it?

    @Override
    protected void setup(Context context) throws IOException, InterruptedException
    {
        Configuration conf = context.getConfiguration();
        DupleSplit fileSplit = (DupleSplit) context.getInputSplit();

        // First line is the header. It indicates the first digit of the solution.
        context.nextKeyValue(); // <---- LINE 210
        URI[] uris = context.getCacheFiles();

        int num_of_colors = Integer.parseInt(conf.get("num_of_colors"));
        int order = fileSplit.get_order();
        int first_digit = Integer.parseInt(context.getCurrentValue().toString());

        // Look up which cache file this split should use.
        int offset = Integer.parseInt(conf.get(Integer.toString(num_of_colors - order - 1)));
        uri = uris[offset];                // uri is an instance field
        Path perm_path = new Path(uri.getPath());
        perm_name = perm_path.getName();   // perm_name is an instance field

        // Emit the row of pair variables for this order.
        String pair_variables = "";
        for (int i = 1; i <= num_of_colors; i++)
            pair_variables += "X_" + i + "_" + (num_of_colors - order) + "\t";
        for (int i = 1; i < num_of_colors; i++)
            pair_variables += "X_" + i + "_" + (num_of_colors - order - first_digit) + "\t";
        pair_variables += "X_" + num_of_colors + "_" + (num_of_colors - order - first_digit);
        context.write(new Text(pair_variables), null);
    }
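
For context, splitting is disabled in my custom input format, roughly like the sketch below (DupleInputFormat is an illustrative name, not my exact code):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class DupleInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Keep each file in a single split so the header line stays
        // with its records; a 4+ GB file therefore becomes a 4+ GB split.
        return false;
    }
}

So every mapper sees a whole file, header included.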

Here's the error log:

Error: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Buffer.java:559)
at java.nio.ByteBuffer.get(ByteBuffer.java:668)
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:279)
at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:168)
at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:59)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:144)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:184)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at produce_data_hdfs$input_mapper.setup(produce_data_hdfs.java:210)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
3 Replies

Re: Hadoop map-reduce indexoutofbounds

Rising Star

Which version are you running? There is a known bug in CDH 5.7 that can cause this issue if the split is big enough (over 4 GB, I think).
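
If it is the bug I have in mind, it matches your stack trace: the read fails under UncompressedSplitLineReader.fillBuffer, where the bytes remaining in the split get narrowed from long to int. Once the split is larger than the int range, the cast wraps. A rough illustration of the wrap (not the actual Hadoop source):

// Illustrative only - shows the overflow, not Hadoop's exact code.
long splitLength = 4_200_000_000L;  // a ~4.2 GB unsplittable file
long totalBytesRead = 0L;
int bytesLeft = (int) (splitLength - totalBytesRead);
System.out.println(bytesLeft);      // -94967296: a bogus read length,
                                    // which fails ByteBuffer bounds checks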

Re: Hadoop map-reduce indexoutofbounds

New Contributor

I'm using Hadoop 2.7.1.

Just checked the files. One of them is 4.2 GB, and another comes close at 3.6 GB. The rest are below 3 GB.

If it is this bug, maybe I could upgrade to 2.7.2? Would that solve it?

Re: Hadoop map-reduce indexoutofbounds

Rising Star

I guess it is the 4.2 GB file that has triggered the bug. The fix is in 2.7.3 and 2.8.0.
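
If you cannot upgrade right away, one possible workaround (a sketch, untested against your job) is to re-enable splitting so no single split grows past the block size, and skip the header in map() instead of setup(). With LineRecordReader the key is the byte offset within the file, so the header is the record whose key is 0:

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    if (key.get() == 0) {
        // Byte offset 0 = first line of the file, i.e. the header.
        // Skip it (or stash its value) instead of reading it in setup().
        return;
    }
    // ... normal record processing ...
}

Note this changes how first_digit reaches the mappers (you would need to pass it some other way, e.g. through the job configuration), so it may not drop into your job as-is.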