How to identify a corrupted edits_inprogress_xxxxxxx file

We noticed that a corrupted edits_inprogress_xxxxxxxx file (under /hadoop/hdfs/journal/hdfsha/current) could be the reason why the NameNode does not start correctly, or starts as standby rather than active.

I am sharing this issue because I want a simple check that identifies a corrupted edits_inprogress_xxxxxxxx file.

I found the following approach, but I would like Hortonworks to confirm that it is valid.

My idea is to use the file command: when file returns "data", the file is OK; otherwise, the edits_inprogress_xxxxxxxx file is corrupted.

Can we trust this check?

# file edits_inprogress_0000000000075117774
edits_inprogress_0000000000075117774: ISO-8859 text, with very long lines, with no line terminators

# file edits_inprogress_0000000000075149670
edits_inprogress_0000000000075149670: data
Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

There is no standard way to detect corruption in edits_inprogress_xxxx files. Usually, the indication that a file is corrupted comes from the JournalNode / NameNode logs.

In brief, edits_inprogress_<start transaction ID> is the current edit log in progress. All transactions starting from <start transaction ID> are in this file, and all new incoming transactions are appended to it. HDFS pre-allocates space in this file in 1 MB chunks for efficiency, and then fills it with incoming transactions. You’ll probably see this file’s size as a multiple of 1 MB. When HDFS finalizes the log segment, it truncates the unused portion of the space that doesn’t contain any transactions, so the finalized file shrinks.
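
To see the pre-allocation described above on a real journal directory, a small Python sketch like the following could list the in-progress segments and report whether each size is a whole number of 1 MB chunks. It assumes "1 MB" means 1,048,576 bytes, and the function names are made up for this illustration. A size that is not a multiple of 1 MB is only an observation; it is not a corruption test.

```python
import os

# Assumption: "1 MB" in the explanation above means 1 MiB (1,048,576 bytes).
ONE_MB = 1024 * 1024


def is_chunk_multiple(size_bytes):
    """Return True when the size is a whole number of 1 MB chunks."""
    return size_bytes % ONE_MB == 0


def preallocation_report(directory):
    """Return (name, size, is_multiple) for each edits_inprogress_* file.

    Hypothetical helper for illustration only; it reads file sizes and
    does not open or modify the edit logs.
    """
    report = []
    for name in sorted(os.listdir(directory)):
        if name.startswith("edits_inprogress_"):
            size = os.path.getsize(os.path.join(directory, name))
            report.append((name, size, is_chunk_multiple(size)))
    return report
```

For example, `preallocation_report("/hadoop/hdfs/journal/hdfsha/current")` would return one tuple per in-progress segment in the directory mentioned in the question.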

More details about these files and their function can be found at: https://hortonworks.com/blog/hdfs-metadata-directories-explained/

4 REPLIES

What should we look for in the logs to tell that the file is corrupted?

Michael-Bronson

Master Mentor

@Michael Bronson

Usually we see edits_inprogress scanning warning messages in the log, which indicate some kind of corruption in the "edits_inprogress" file.

Example (the most common one, seen in roughly 90% of corruption cases):

WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(364)) - After resync, position is 552231
WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(359)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/haclusterName/current/edits_inprogress_0000000000512299614 while determining its valid length. Position was 552231  java.io.IOException: Can't scan a pre-transactional edit log. 
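
If you want to check the logs automatically, a rough Python sketch like this one could search NameNode or JournalNode log lines for the warning shown above and pull out the affected file path. The regular expression is based only on the sample line in this thread, so the wording may differ in other Hadoop versions, and the function name is made up for this illustration.

```python
import re

# Assumption: the pattern is derived from the single WARN line quoted in
# this thread; other Hadoop versions may word the message differently.
SCAN_WARNING = re.compile(
    r"Caught exception after scanning through (\d+) ops from (\S+) "
    r"while determining its valid length\. Position was (\d+)"
)


def find_suspect_edit_logs(log_lines):
    """Return (path, ops_scanned, position) for each matching warning line."""
    hits = []
    for line in log_lines:
        match = SCAN_WARNING.search(line)
        if match:
            ops, path, position = match.groups()
            hits.append((path, int(ops), int(position)))
    return hits


# The two sample lines from the reply above.
sample = [
    "WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(364)) "
    "- After resync, position is 552231",
    "WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(359)) "
    "- Caught exception after scanning through 0 ops from "
    "/hadoop/hdfs/journal/haclusterName/current/"
    "edits_inprogress_0000000000512299614 while determining its valid "
    "length. Position was 552231  java.io.IOException: Can't scan a "
    "pre-transactional edit log.",
]

for path, ops, position in find_suspect_edit_logs(sample):
    print(f"suspect file: {path} (ops scanned: {ops}, position: {position})")
```

To run it against a real log, pass an open file object instead of `sample`, since the function accepts any iterable of lines. A match only tells you which file the log complained about; it does not replace reading the surrounding log entries.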

@Jay excellent answer.

Michael-Bronson