The Recipes framework, which supports HDFS and Hive mirroring, was added in the Apache Falcon 0.6.0 release as client-side logic. In the 0.10 release it was moved to the server side and renamed server-side extensions, as part of JIRA https://issues.apache.org/jira/browse/FALCON-1107.
How does a replication / DistCp job work? For example, does each mapper write to a temporary directory on the source NameNode and copy the data to the target only once it is done?
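To make the "write to a temporary location, then move into place" idea concrete, here is a minimal sketch of that pattern on a local filesystem. It is not DistCp or Falcon code and does not claim to show how either behaves; `staged_copy` and its `fail_after` parameter are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile


def staged_copy(src_dir, dst_dir, fail_after=None):
    """Copy the files in src_dir to dst_dir through a staging directory.

    fail_after is a hypothetical knob: it simulates a copier failure
    once that many files have been copied.
    """
    parent = os.path.dirname(os.path.abspath(dst_dir))
    # Stage next to the target so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        for i, name in enumerate(sorted(os.listdir(src_dir))):
            if fail_after is not None and i >= fail_after:
                raise RuntimeError("simulated copier failure")
            shutil.copy2(os.path.join(src_dir, name),
                         os.path.join(staging, name))
        # Commit step: replace the old target with the fully staged copy.
        # (Remove-then-rename is not strictly atomic; it only keeps the
        # window in which the target is missing very short.)
        if os.path.exists(dst_dir):
            shutil.rmtree(dst_dir)
        os.rename(staging, dst_dir)
    except Exception:
        # On any failure, discard the partial copy; the target is untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

In this sketch a failure part-way through leaves the target directory exactly as it was before the job started, which is the behaviour the questions here are asking about for a real replication job.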
What happens if a job replicating 100 files fails at the mapper stage?
If a copier fails for some subset of its files, what happens? Does the directory become inconsistent?
Is the atomic feature supported in HDP 2.5, and how is data inconsistency handled when a job fails?
E.g., suppose a source directory holding 200 GB of files has changed, and the replication job copying that data to the target fails after 100 GB has been written to the target directory. Will the target be rolled back to its previous state, or will only the 100 GB remain written at the target?
Assumptions: there are hundreds of files to transfer, each file is relatively large (130 GB), the block size is 124 MB, and overwrite = true.
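For scale, the figures in the assumptions above work out roughly as follows. This is plain arithmetic, assuming 1 GB = 1024 MB; the function name `blocks_per_file` is only illustrative.

```python
FILE_SIZE_GB = 130   # per-file size from the assumptions above
BLOCK_SIZE_MB = 124  # block size from the assumptions above


def blocks_per_file(file_size_gb, block_size_mb):
    """Return (total_blocks, size_of_last_block_mb) for one file.

    Assumes 1 GB = 1024 MB.
    """
    file_size_mb = file_size_gb * 1024
    full_blocks, remainder_mb = divmod(file_size_mb, block_size_mb)
    if remainder_mb == 0:
        return full_blocks, block_size_mb
    # A final, partially filled block holds the remainder.
    return full_blocks + 1, remainder_mb


total, last_mb = blocks_per_file(FILE_SIZE_GB, BLOCK_SIZE_MB)
print(total, last_mb)  # 1074 68
```

So each 130 GB file spans 1074 blocks, the last of which holds 68 MB, and a failure part-way through a single file can leave a large amount of data already written.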