New Contributor
Posts: 1
Registered: 04-15-2014

Blocks are getting marked as corrupt with append operation under high load

We are getting the error below during high-load append operations. Please let me know whether this issue has been resolved, and if so, in which CDH version.

After investigation we found the JIRA below, but we are not sure whether it is related to this issue. A quick answer would be highly appreciated.

JIRA: HDFS-3584

Here is the event:

I just ran another job, in debug mode. If you look at the output, we print a line there for every 1 MB of data. Looking at that data you'll see that it fails at different points in the job: after 148 MB, 395 MB, 466 MB, etc. Those aren't even on split boundaries (128 MB). As I said on the call the other day, I could understand it if it were an input-split transition issue, but this is clearly not that. It's just getting killed in the middle of processing, which makes me think there's some resource/load monitor killing it, a time-to-live on the job, or something like that.

Log:

java.lang.RuntimeException: com.allstontrading.lib.common.historical.HistoricalParseError: java.io.IOException: Filesystem closed
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:252)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:1)
        at com.allstontrading.lib.common.ds.collection.impl.ArrayBasedLinkedList.forEach(ArrayBasedLinkedList.java:241)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant.processNext(TimeBlockParallelRecordReader.java:244)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader.nextKeyValue(TimeBlockParallelRecordReader.java:87)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: com.allstontrading.lib.common.historical.HistoricalParseError: java.io.IOException: Filesystem closed
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.processNext(TimeBlockReaderImpl.java:208)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:248)
        ... 14 more
Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:272)
        at org.apache.hadoop.hdfs.DFSClient.access$900(DFSClient.java:71)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2159)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.readNextBytes(TimeBlockReaderImpl.java:170)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.readNextTime(TimeBlockReaderImpl.java:146)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.processNext(TimeBlockReaderImpl.java:197)

Posts: 1,826
Kudos: 406
Solutions: 292
Registered: 07-31-2013

Re: Blocks are getting marked as corrupt with append operation under high load

Make sure you aren't calling FileSystem.close() anywhere in your code, as the framework expects to call that itself.
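
For illustration, here is a minimal sketch of the pattern being warned against; the class and method names are hypothetical and not taken from your code. The point is that FileSystem.get() returns a cached instance shared within the JVM, so closing it also closes the handle the framework is still using, which produces exactly the "java.io.IOException: Filesystem closed" seen in the trace above.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReaderSketch {
    public void readBlock(Configuration conf, Path path) throws IOException {
        // FileSystem.get() returns a cached instance that is shared with the
        // framework (keyed by URI scheme/authority and user).
        FileSystem fs = FileSystem.get(conf);
        FSDataInputStream in = fs.open(path);
        try {
            // ... read records from the stream ...
        } finally {
            in.close();    // closing the stream you opened is fine
            // fs.close(); // DON'T: this closes the shared DFSClient, and any
                           // later read by the framework then fails with
                           // "java.io.IOException: Filesystem closed"
        }
    }
}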

Alternatively, pass a job configuration value of "fs.hdfs.impl.disable.cache" set to "true", so the FileSystem instance your program closes is not shared with the framework's own FileSystem instance.
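
A minimal driver sketch of that alternative, assuming the newer org.apache.hadoop.mapreduce Job API is in use; the class name and job name are illustrative only:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DriverSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With caching disabled, each FileSystem.get(conf) call creates a new,
        // private instance, so closing yours cannot break the framework's one.
        conf.set("fs.hdfs.impl.disable.cache", "true");

        Job job = Job.getInstance(conf, "append-under-load");
        // ... set mapper, input/output formats and paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The same value can also be passed on the command line with -D fs.hdfs.impl.disable.cache=true if the driver goes through ToolRunner.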
Posts: 1,826
Kudos: 406
Solutions: 292
Registered: 07-31-2013

Re: Blocks are getting marked as corrupt with append operation under high load

Additionally, on the topic of replicas being marked corrupt: that can be caused by appends changing the generation stamp (genstamp) combined with pipeline failures, so that some DataNodes end up reporting the old genstamp and get themselves marked as holders of a corrupt replica. This should not be a concern, though.

See also https://issues.apache.org/jira/browse/HDFS-5189