I copied a large folder structure from HDFS to RedHat using copyToLocal. While it looked successful, I want to validate that it was copied correctly by comparing the size of the data in HDFS and on RedHat. I'm using "du", but my numbers are still off.
I run the following on RH: "du -s -b <PATH>"
I run the following on HDFS: "hadoop fs -du -s <PATH>"
I noticed that RH reports 101 bytes for empty folders while HDFS (CDH5.5.2) reports 0 bytes for empty folders. So my question is: how do I validate that the entire directory of data was fully transferred?
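One possible way to sidestep the empty-folder discrepancy is to total only regular files on the RH side, since the per-directory bytes that "du -b" counts have no HDFS equivalent. A minimal sketch, assuming GNU find (for -printf) and awk are available; the function names local_file_bytes and local_file_count are made up for illustration:

```shell
# Sum the sizes of regular files only, so bytes attributed to directory
# entries (counted by "du -b" locally, reported as 0 by HDFS) are excluded.
local_file_bytes() {
    find "$1" -type f -printf '%s\n' | awk '{ total += $1 } END { print total + 0 }'
}

# Count regular files by emitting one character per file, which stays
# correct even if a file name contains a newline.
local_file_count() {
    find "$1" -type f -printf '.' | wc -c
}
```

The byte total could then be compared with the first column of "hadoop fs -du -s <PATH>", and the file count with the FILE_COUNT column of "hadoop fs -count <PATH>". Matching sizes and counts suggest a complete copy but do not prove the contents are identical.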
Thanks,