Explorer
Posts: 7
Registered: 04-15-2015

Missing blocks

I'm running 2.6.0-cdh5.4.5 with a Flume sink writing to HDFS periodically.  This works well, but on rare occasions I see missing blocks.  I've gone through the FAQ and the likely culprits without any light being shed on it.  Can someone give me pointers on what else I might look into?

 

In this latest situation I have 1 missing block with replication=3.  I've given the highlights from two pertinent logs below - the other datanodes' logs look pretty much like this one's.  It's as if the block was never written to disk and the DirectoryScanner then cleans it from memory, wiping away its existence.  But then, that would be pretty odd, since it works 99.9% of the time.  I've checked /var/log/messages around the same time on all nodes and see no anomalies.  I'll also note that another Flume agent was writing successfully at the same time.
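For reference, the missing-block count also shows up in the dfsadmin summary, so something along these lines should confirm it from the command line (assuming the hdfs client is configured on the node):

hdfs dfsadmin -report | grep -i "missing blocks"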

 

Any ideas are greatly appreciated. 

 

Thanks,

--tim

 

NN:

Day 1: 06:02 - the file completes

Day 1: 06:02 - the blockMap updated with the three replicas as UNDER_CONSTRUCTION

Day 1: 06:02 - the file is closed

Day 1: 06:02 - the FSNamesystem claiming there are no corrupt file blocks

Day 2: 07:25 - BlockStateChanged processReport...

Day 2: 07:25 - ask node 1 to replicate blk123 to node 2

Day 2: 07:25 - Error report DataNodeRegistration(node1 can't send invalid block 123)

 

Node 1:

Day 1: 06:02 - PacketResponder blk123 type=HAS_DOWNSTREAM_IN_PIPELINE is terminating

Day 2: 04:51 - FSDatasetImpl: Removed block 123 from memory with missing block file on the disk.

Day 2: 04:51 - FSDatasetImpl: Deleted a metadata file for the deleted block ...

Day 2: 07:25 - Can't send invalid block blk123
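For anyone who wants to reproduce the on-disk check: something like the following should show whether the block and meta files ever made it to the datanode's disk.  Here /data/dfs/dn is only a placeholder for whatever dfs.datanode.data.dir points to, and blk_123 stands for the real block ID:

find /data/dfs/dn -name 'blk_123*'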

Champion Alumni
Posts: 196
Registered: 11-18-2014

Re: Missing blocks

Hi,

 

1. I think you should try to identify which file is missing (a couple of fsck variants are sketched right after this list):

hadoop fsck /

2. Are all your DNs up and running?
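Not an exact recipe, but something like this should list only the affected files, and then show where HDFS thinks the blocks of a given file are (the second path is just an example):

hadoop fsck / -list-corruptfileblocks
hadoop fsck /path/to/suspect/file -files -blocks -locations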

 

I know that you can get corrupt blocks if you are writing very small files (when Flume opens a file and then has to close it without any information being written).

GHERMAN Alina
Explorer
Posts: 7
Registered: 04-15-2015

Re: Missing blocks

Thanks for the reply, Alina!
Yeah, I can see which file it is. And, yes, all data nodes are up.

The situation you describe - writing very small files, then closing them - could be it. How can I confirm that's indeed the case? In that scenario, is it just an empty file that HDFS sees as corrupt?
Champion Alumni
Posts: 196
Registered: 11-18-2014

Re: Missing blocks

Hello,

 

Yes, as far as I managed to find out, it is just a file that you cannot open (because it's corrupt) and is empty (see the size of the file).
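For example, something like this should show the file's length and whether HDFS flags it as corrupt (the path is only a placeholder):

hdfs dfs -ls /path/to/the/file
hdfs fsck /path/to/the/file -files -blocks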

 

Alina

 

 

GHERMAN Alina