
Blocks are getting marked as corrupt with append operation under high load

New Contributor

We are getting the error below during appends under high load. Please let me know whether this issue has been resolved, and if so, in which CDH version.

After investigating, we found the JIRA below, but we are not sure whether it is related to this issue. Your quick answer would be highly appreciated.

JIRA: HDFS-3584

Here is the event:

I just ran another job in debug mode. If you look at the output, we print a line there for every 1 MB of data. Looking at that data you'll see that it fails at different points into the job: after 148 MB, 395 MB, 466 MB, etc. Those aren't even on split boundaries (128 MB). As I said on the call the other day, I could understand it if this were an input-split transition issue, but it is clearly not that. The job is simply getting killed in the middle of processing, which makes me think there is some resource or load monitor killing it, some time-to-live on a job, or something like that.

 

Log:

java.lang.RuntimeException: com.allstontrading.lib.common.historical.HistoricalParseError: java.io.IOException: Filesystem closed
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:252)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:1)
        at com.allstontrading.lib.common.ds.collection.impl.ArrayBasedLinkedList.forEach(ArrayBasedLinkedList.java:241)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant.processNext(TimeBlockParallelRecordReader.java:244)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader.nextKeyValue(TimeBlockParallelRecordReader.java:87)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: com.allstontrading.lib.common.historical.HistoricalParseError: java.io.IOException: Filesystem closed
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.processNext(TimeBlockReaderImpl.java:208)
        at com.allstontrading.hadoop.input.TimeBlockParallelRecordReader$NextTimeInstant$1.entry(TimeBlockParallelRecordReader.java:248)
        ... 14 more
Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:272)
        at org.apache.hadoop.hdfs.DFSClient.access$900(DFSClient.java:71)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2159)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.readNextBytes(TimeBlockReaderImpl.java:170)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.readNextTime(TimeBlockReaderImpl.java:146)
        at com.allstontrading.hadoop.input.TimeBlockReaderImpl.processNext(TimeBlockReaderImpl.java:197)

2 Replies

Re: Blocks are getting marked as corrupt with append operation under high load

Master Guru
Make sure you aren't calling FileSystem.close() anywhere in your own code; the framework expects to close that shared instance itself.
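
For illustration, here is a minimal sketch of the pattern to avoid (the mapper class and field names are hypothetical, not taken from the original post): FileSystem.get() returns a cached instance shared with the framework, so closing it in task code also closes it for the framework's record reader.

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SideFileMapper extends Mapper<LongWritable, Text, Text, Text> {

    private FileSystem fs;

    @Override
    protected void setup(Context context) throws IOException {
        // FileSystem.get() returns the cached, shared instance that the
        // MapReduce framework is also using to read the input split.
        fs = FileSystem.get(context.getConfiguration());
    }

    @Override
    protected void cleanup(Context context) {
        // Anti-pattern: closing the shared instance here is what produces
        // "java.io.IOException: Filesystem closed" in the framework's reader.
        // Leave the shared FileSystem open and let the framework close it.
        // fs.close();   // <-- do not do this
    }
}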

Alternatively, set the job configuration value "fs.hdfs.impl.disable.cache" to "true", so that the FileSystem instance your program closes is not shared with the framework's own FileSystem instance.
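
A minimal sketch of setting that flag before submitting the job (the job name and variable names are illustrative, assuming the new-API Job class that appears in the stack trace above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Disable the client-side FileSystem cache for hdfs:// URIs so that each
// FileSystem.get() call returns a private instance; closing one of them no
// longer affects the instance the framework reads the input split with.
conf.setBoolean("fs.hdfs.impl.disable.cache", true);
Job job = new Job(conf, "append-heavy-job");
// ... set mapper, input/output formats, paths, etc., then submit as usual.

If your driver uses ToolRunner, the same setting can also be passed on the command line as -Dfs.hdfs.impl.disable.cache=true. The trade-off is that with the cache disabled every FileSystem.get() opens its own instance, so your code becomes responsible for closing each instance it obtains.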

Re: Blocks are getting marked as corrupt with append operation under high load

Master Guru
Additionally, on the topic of replicas being marked corrupt: this can be caused by appends triggering generation-stamp (genstamp) changes combined with pipeline failures, which leave some DataNodes reporting the old genstamp and therefore marked as holders of corrupt replicas. This should not be a concern, though.

See also https://issues.apache.org/jira/browse/HDFS-5189