sameerkhanpqr · 02-24-2016

Hello everyone, I have some doubts about Hadoop checksum calculation.

In the O'Reilly book I see the line below:

"Datanodes are responsible for verifying the data they receive before storing the data and its checksum "

1. Does this mean that the checksum will be calculated before the data reaches the datanode for storage?

"A client writing data sends it to a pipeline of datanodes and the last datanode in the pipeline verifies the checksum"

2. Why should only the last node verify the checksum? Bit rot can happen on the earlier datanodes as well, yet only the last node has to verify it.

"When clients read data from datanodes, they verify checksums as well, comparing them with the ones stored at the datanodes."

3. Will the checksum of the data be stored at the datanode along with the data during the WRITE process?

"A separate checksum is created for every dfs.bytes-per-checksum bytes of data. The default is 512 bytes."

4. Suppose I have a file of size 10 MB. As per the above statement, 20 checksums will be created. If the block size is 1 MB then, as I understand it, the checksums have to be stored along with the block. So in this case, will each block store 2 checksums with it?

Each datanode keeps a persistent log of checksum verifications, so it knows the last time each of its blocks was verified.

5. May I know the path of this log file and what exactly it contains? I'm using the Cloudera VM.

When a client successfully verifies a block, it tells the datanode, which updates its log. Keeping statistics such as these is valuable in detecting bad disks.

6. For the above log file on the datanode, will writes happen only when the client sends a success message? What if the client observes failures in checksum verification?

cnauroth · 02-24-2016

Hello @sameer khan. Addressing the questions point-by-point:

1. Does this mean that the checksum will be calculated before the data reaches the datanode for storage?

Yes, an end-to-end checksum calculation is performed as part of the HDFS write pipeline while the block is being written to DataNodes.

2. Why should only the last node verify the checksum? Bit rot can happen on the earlier datanodes as well, yet only the last node has to verify it.

The intent of the checksum calculation in the write pipeline is to verify the data in transit over the network, not to check for bit rot on disk. Therefore, verification at the final node in the write pipeline is sufficient. Checking for bit rot in existing replicas on disk is performed separately at each DataNode by a background thread.
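To make the in-transit check concrete, here is a minimal sketch of the idea, not the actual HDFS code path. It assumes 512-byte chunks as in the quoted default and uses Python's `zlib.crc32` as a stand-in for the CRC variant HDFS really uses; the function names are invented for illustration.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # chunk size quoted in the question (assumed here)


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> list:
    """Return one CRC-32 per chunk_size-byte chunk (the last chunk may be shorter)."""
    return [zlib.crc32(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def first_bad_chunk(data: bytes, expected: list, chunk_size: int = BYTES_PER_CHECKSUM):
    """Recompute the checksums and return the index of the first mismatch, or None."""
    actual = chunk_checksums(data, chunk_size)
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    if len(actual) != len(expected):
        # Data was truncated or extended: the first unmatched chunk is the bad one.
        return min(len(actual), len(expected))
    return None


# Sender side: checksums are computed before the data leaves the client.
payload = bytes(range(256)) * 6           # 1536 bytes, i.e. 3 chunks
sent_checksums = chunk_checksums(payload)

# Simulate one flipped bit somewhere along the pipeline (byte 600 is in chunk 1).
received = bytearray(payload)
received[600] ^= 0x01

clean_result = first_bad_chunk(payload, sent_checksums)
corrupt_result = first_bad_chunk(bytes(received), sent_checksums)
```

Because the checksums travel with the data from the sender, a check at the end of the pipeline covers every network hop before it, which is the point made above.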

3. Will the checksum of the data be stored at the datanode along with the data during the WRITE process?

Yes, the checksum is persisted at the DataNode. For each block replica hosted by a DataNode, there is a corresponding metadata file that contains metadata about the replica, including its checksum information. The metadata file will have the same base name as the block file, and it will have an extension of ".meta".
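If you want to see the pairing on disk, a small script along these lines can match block files to their ".meta" companions. This is only a sketch: the helper name and the file names in the demo are made up, and it assumes that the meta file name may carry an extra suffix (such as a generation stamp) between the block name and ".meta".

```python
import tempfile
from pathlib import Path


def pair_blocks_with_meta(directory: Path) -> dict:
    """Map each block file name in `directory` to the names of its .meta files."""
    names = sorted(p.name for p in directory.iterdir() if p.is_file())
    blocks = [n for n in names if n.startswith("blk_") and not n.endswith(".meta")]
    metas = [n for n in names if n.startswith("blk_") and n.endswith(".meta")]
    return {
        # Accept "blk_X.meta" as well as "blk_X_<suffix>.meta" (assumed naming).
        block: [m for m in metas if m == block + ".meta" or m.startswith(block + "_")]
        for block in blocks
    }


# Demo with invented file names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for name in ("blk_1001", "blk_1001_7.meta", "blk_1002", "blk_1002_9.meta"):
        (demo_dir / name).touch()
    pairs = pair_blocks_with_meta(demo_dir)
```

On a real DataNode you would point it at a directory that actually holds block files, somewhere below one of the configured data directories.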

4. Suppose I have a file of size 10 MB. As per the above statement, 20 checksums will be created. If the block size is 1 MB then, as I understand it, the checksums have to be stored along with the block. So in this case, will each block store 2 checksums with it?

The DataNode stores a single ".meta" file corresponding to each block replica. Within that metadata file, there is an internal data format for storage of multiple checksums of different byte ranges within that block replica. Note that the checksum interval is 512 bytes, not 512 KB, so a 1 MB (1,048,576-byte) block has 2,048 checksums rather than 2. All checksums for all byte ranges must be valid in order for HDFS to consider the replica to be valid.
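The numbers for the example in the question can be worked out with a few lines of arithmetic. This sketch assumes 1 MB means 1,048,576 bytes, a 512-byte checksum interval, and a 4-byte checksum per interval (the width of a CRC-32 value); it counts only the checksum payload and ignores any header in the ".meta" file. The function name is invented for illustration.

```python
def checksum_counts(file_size: int, block_size: int,
                    bytes_per_checksum: int = 512, checksum_width: int = 4) -> dict:
    """Count blocks, checksums, and checksum bytes for a file (payload only)."""
    def ceil_div(a: int, b: int) -> int:
        return -(-a // b)

    blocks = ceil_div(file_size, block_size)
    per_full_block = ceil_div(block_size, bytes_per_checksum)

    # Sum per block so a short final block is counted correctly.
    total = 0
    remaining = file_size
    for _ in range(blocks):
        this_block = min(block_size, remaining)
        total += ceil_div(this_block, bytes_per_checksum)
        remaining -= this_block

    return {
        "blocks": blocks,
        "checksums_per_full_block": per_full_block,
        "total_checksums": total,
        "checksum_bytes_per_full_block": per_full_block * checksum_width,
    }


MB = 1024 * 1024
example = checksum_counts(file_size=10 * MB, block_size=1 * MB)
```

For the 10 MB file with 1 MB blocks this gives 10 blocks, 2,048 checksums per block, and 20,480 checksums in total, with 8,192 bytes of checksum data per block (under 1% overhead).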

5. May I know the path of this log file and what exactly it contains? I'm using the Cloudera VM.

The files are prefixed with "dncp_block_verification.log" and will be stored under one of the DataNode data directories as configured by dfs.datanode.data.dir in hdfs-site.xml. The content of these files is multiple lines, each reporting date, time and block ID for a replica that was verified.
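Since the exact subdirectory can differ between installations, one way to locate the files is to walk the data directories and match on the prefix. The following is a rough sketch with an invented helper name; the directory layout and the ".curr"/".prev" suffixes in the demo are only example values, not a statement about what your VM contains.

```python
import os
import tempfile

LOG_PREFIX = "dncp_block_verification.log"  # prefix given in the answer above


def find_verification_logs(data_dirs):
    """Walk each data directory and return sorted paths of files starting with LOG_PREFIX."""
    matches = []
    for data_dir in data_dirs:
        for root, _dirs, files in os.walk(data_dir):
            for name in files:
                if name.startswith(LOG_PREFIX):
                    matches.append(os.path.join(root, name))
    return sorted(matches)


# Demo against a throwaway tree; this layout is invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "current", "example-pool")
    os.makedirs(nested)
    for name in ("dncp_block_verification.log.curr",
                 "dncp_block_verification.log.prev",
                 "VERSION"):
        open(os.path.join(nested, name), "w").close()
    found = [os.path.basename(p) for p in find_verification_logs([tmp])]
```

On a real node, pass in the local paths from dfs.datanode.data.dir (for example as printed by `hdfs getconf -confKey dfs.datanode.data.dir`, split on commas), and run the script as a user that can read those directories.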

6. For the above log file on the datanode, will writes happen only when the client sends a success message? What if the client observes failures in checksum verification?

This only logs checksum verification failures that were detected in the background by the DataNode. If a client detects a checksum failure at read time, then the client reports the failure to the NameNode, which then recovers by invalidating the corrupt replica and scheduling re-replication from another known good replica. There would be some logging in the NameNode log related to this activity.
