
I tried copying data from a non-kerberized cluster to a kerberized cluster using a snapshot diff, but on HDP 2.5.3 it failed with the error shown below.

Currently, distcp supports only the hdfs:// RPC protocol for snapshot-diff-based copies. If you use webhdfs for either the source or the target, you will encounter this error.
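As a rough illustration of that restriction, the sketch below refuses to build a snapshot-diff distcp command unless both URIs use the hdfs:// scheme. The helper names `is_hdfs_uri` and `build_diff_distcp` are hypothetical, and port 8020 on the non-secure cluster is an assumption (the original command only shows that cluster's webhdfs port). The script prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: is_hdfs_uri and build_diff_distcp are hypothetical helpers,
# not part of Hadoop.

# Succeeds only when the URI uses the hdfs:// scheme.
is_hdfs_uri() {
  case "$1" in
    hdfs://*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Prints a snapshot-diff distcp command, or fails if either side is not hdfs://.
build_diff_distcp() {
  local from_snap="$1" to_snap="$2" src="$3" dst="$4"
  if ! is_hdfs_uri "$src" || ! is_hdfs_uri "$dst"; then
    echo "snapshot-diff distcp needs hdfs:// on both sides: $src -> $dst" >&2
    return 1
  fi
  # Print instead of execute so the command can be reviewed first.
  echo "hadoop distcp -diff $from_snap $to_snap -update $src $dst"
}

# Assumption: the non-secure NameNode also listens for RPC on port 8020.
build_diff_distcp s1 s2 hdfs://nonsecure_cluster:8020/source hdfs://secure_cluster:8020/target
```

Passing the webhdfs:// source from the failing command makes `build_diff_distcp` return a non-zero status before distcp is ever invoked.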


$ hadoop distcp -diff s1 s2 -update webhdfs://nonsecure_cluster:50070/source hdfs://secure_cluster:8020/target


    java.lang.IllegalArgumentException: The FileSystems needs to be DistributedFileSystem for using snapshot-diff-based distcp


    Were you able to update the target using the snapshot difference?

    hadoop distcp -diff s1 s2 -update hdfs://secure_cluster:8020/source hdfs://secure_cluster:8020/target

    Although it is syntactically correct, it does not work for me. I get the error below.

    17/04/18 14:29:53 ERROR tools.DistCp: Invalid arguments: java.lang.IllegalArgumentException: Diff is valid only with update and delete options

    Invalid arguments: Diff is valid only with update and delete options
    usage: distcp OPTIONS [source_path...] <target_path>
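The error text suggests that this release accepts -diff only when both -update and -delete are present, and the command above passes only -update. A minimal sketch of a pre-flight check along those lines follows. `has_flag` and `diff_args_ok` are hypothetical helper names, and other Hadoop releases may apply a different rule.

```shell
#!/usr/bin/env bash
# Sketch only: has_flag and diff_args_ok are hypothetical helpers, not part of Hadoop.

# Returns 0 if the flag ($1) appears among the remaining arguments.
has_flag() {
  local flag="$1"; shift
  local arg
  for arg in "$@"; do
    [ "$arg" = "$flag" ] && return 0
  done
  return 1
}

# When -diff is used, require both -update and -delete, as the error message demands.
diff_args_ok() {
  if has_flag -diff "$@"; then
    has_flag -update "$@" && has_flag -delete "$@"
  fi
}

# Same command as in the comment above, with -delete added.
set -- -diff s1 s2 -update -delete hdfs://secure_cluster:8020/source hdfs://secure_cluster:8020/target
if diff_args_ok "$@"; then
  # Print instead of execute so the command can be reviewed first.
  echo "hadoop distcp $*"
fi
```

Note that -delete removes files from the target that are missing from the source, so it is worth reviewing the printed command before running it.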


    Hello @Rajesh,

    Following a request from a customer here, there is a Hortonworks task to support it in HDP 3.0.

    If there is some way to backport it, I will try to update this thread.

    With kind regards


    Thanks for the update! I ran into this myself when designing a Python HDFS snapshot manager for two of our clusters.


    Has anyone found a solution to this issue?


    Last update: 02-22-2017 04:21 AM