Using distcp -update with object stores is unsupported, and the concept hasn't been explored yet.

Object stores have no real concept of modification time. They record only a creation time, which is the time observed at the far end. If you upload a file to a store in a remote timezone, you may get that store's time as your file's timestamp.

The underlying issue here is not a bug. It is by design: distcp -update relies on file checksums to compare HDFS files, and not all stores export their checksum through the Hadoop API (WASB does, s3a doesn't yet).
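The comparison described above can be sketched roughly as follows. This is a minimal illustration in Python, not distcp's actual code: `FileInfo`, `needs_copy`, and the algorithm labels are hypothetical names, and a store that does not expose a checksum is modelled as `checksum=None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    """Hypothetical stand-in for the metadata a filesystem client reports."""
    length: int
    checksum_algorithm: Optional[str] = None  # None: store exposes no checksum
    checksum: Optional[bytes] = None


def needs_copy(src: FileInfo, dst: FileInfo) -> Optional[bool]:
    """Return True or False when the answer is known, None when it is not."""
    if src.length != dst.length:
        return True  # different sizes: the content certainly differs
    if src.checksum is None or dst.checksum is None:
        return None  # one side exposes no checksum, so nothing to compare
    if src.checksum_algorithm != dst.checksum_algorithm:
        return None  # digests from different algorithms are not comparable
    return src.checksum != dst.checksum


hdfs_file = FileInfo(1024, "hdfs-style", b"\x01\x02")  # assumed values
store_file = FileInfo(1024)                            # no checksum exposed
print(needs_copy(hdfs_file, store_file))  # None: cannot decide
```

The `None` result is the case this article is about: with equal lengths and no comparable checksum, an update-style copy has no reliable signal that the file changed.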

In addition, because blobstores and HDFS produce different checksums for the same data, you can't use a checksum difference as a cue that a file has changed.
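A small Python sketch of why this is so. CRC32 and SHA-256 are used here only as stand-ins for "two stores that checksum differently"; the checksum formats actually used by HDFS or by any particular blobstore are not reproduced.

```python
import hashlib
import zlib

data = b"identical bytes on both sides"

# Stand-ins for two stores that checksum the same content differently.
crc_digest = zlib.crc32(data).to_bytes(4, "big")
sha_digest = hashlib.sha256(data).digest()

print(crc_digest.hex())
print(sha_digest.hex())
print(crc_digest == sha_digest)  # False, although the content is identical
```

The two digests never match, so a mismatch says nothing about whether the underlying bytes differ.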

Note that this also occurs when trying to copy between HDFS encryption zones, as the checksums of the encrypted files will differ.
