
Exception while writing to S3N in Hadoop Job

Explorer

Hello,

Reducers are failing in a Hadoop job while writing output to S3N.

Each reducer is producing ~1 GB of data to write to S3.

Error: java.io.IOException: Exception occured in one of the callables
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3MultiPartFsOutputStream.handleMultipartExceptions(NativeS3FileSystem.java:709)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3MultiPartFsOutputStream.multipartWrite(NativeS3FileSystem.java:677)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3MultiPartFsOutputStream.write(NativeS3FileSystem.java:623)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.avro.file.DataFileWriter$BufferedFileOutputStream$PositionFilter.write(DataFileWriter.java:446)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:121)
    at org.apache.avro.io.BufferedBinaryEncoder$OutputStreamSink.innerWrite(BufferedBinaryEncoder.java:216)
    at org.apache.avro.io.BufferedBinaryEncoder.writeFixed(BufferedBinaryEncoder.java:150)
    at org.apache.avro.file.DataFileStream$DataBlock.writeBlockTo(DataFileStream.java:366)
    at org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:383)
    at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:401)
    at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:410)
    at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:433)
    at org.apache.avro.mapreduce.AvroKeyRecordWriter.close(AvroKeyRecordWriter.java:83)
    at org.apache.avro.mapreduce.AvroMultipleOutputs.close(AvroMultipleOutputs.java:595)
    at com.audiencescience.adprofilebuilder.mapreduce.AdProfileBuilderReducer.cleanup(AdProfileBuilderReducer.java:156)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1635)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

Thanks

Shubham

3 REPLIES

Re: Exception while writing to S3N in Hadoop Job

Mentor

Have you tried using S3A? It is more capable than S3N, though I'm not familiar with writing directly to S3 from MR. You could also validate your application by writing to HDFS first and then using the HDFS API to copy the output to S3. Finally, if you can provide a code snippet, we can help better.
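If you do try S3A, note that it uses its own set of configuration properties; a minimal sketch of the credential setup in core-site.xml, assuming plain key-based authentication (the key values below are placeholders):

```xml
<!-- core-site.xml: minimal S3A credential setup (values are placeholders) -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

The job's output path would then use the s3a:// scheme in place of s3n:// (e.g. s3a://bucket/path).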


Re: Exception while writing to S3N in Hadoop Job

Explorer

@Artem Ervits

Thanks for the suggestion, Artem! I have not tried S3A yet. Writing to S3 from an MR job works in some other jobs I tested (~1 TB). Here the output data size is large (~40 TB). I checked that a few reducers succeeded for this job as well; the others failed with the error above. The code snippet is below.

    if (jobName.isEmpty()) {
      jobName = AD_PROFILE_BUILDER_JOBNAME;
    }
    Job job = Job.getInstance(configuration, jobName);
    job.setJarByClass(AdProfileBuilderJob.class);
    job.setNumReduceTasks(numReducers);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setMapperClass(AdProfileBuilderMapper.class);
    job.setPartitionerClass(AdProfileBuilderPartitioner.class);
    job.setReducerClass(AdProfileBuilderReducer.class);
    job.setMapOutputKeyClass(AdProfileBuilderKey.class);
    job.setGroupingComparatorClass(AdProfileBuilderKeyGroupingComparator.class);
    job.setSortComparatorClass(AdProfileBuilderSortComparator.class);
    AvroJob.setMapOutputValueSchema(job, AdProfile.SCHEMA$);
    AvroJob.setInputKeySchema(job, AdProfile.SCHEMA$);
    AvroMultipleOutputs.addNamedOutput(job, "adprofile", AvroKeyOutputFormat.class,
        AdProfile.SCHEMA$, null);
    FileInputFormat.setInputPaths(job, inputFilePaths.toArray(new Path[0]));
    FileOutputFormat.setOutputPath(job, this.getOutputPath());
    FileOutputFormat.setCompressOutput(job,
        AdProfileBuilderConfig.COMPRESS_OUTPUT.getBoolean(configuration));

Re: Exception while writing to S3N in Hadoop Job

Explorer

It seems there is some problem with multipart upload on the Qubole servers. When I disabled multipart upload, the job worked fine. I will post more details when I have full information.
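For anyone hitting the same issue: S3N multipart upload can be toggled via configuration. A sketch, assuming the standard Hadoop 2.x property name (verify against your Hadoop version's core-default.xml):

```xml
<!-- core-site.xml: disable S3N multipart upload (sketch; check the property
     name against your Hadoop version before relying on it) -->
<property>
  <name>fs.s3n.multipart.uploads.enabled</name>
  <value>false</value>
</property>
```

Note that with multipart disabled, each output file is uploaded in a single PUT, which is subject to S3's single-object upload size limit.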
