Expert Contributor
Posts: 181
Registered: ‎01-25-2017

DistCp copy starts to fail after the upgrade to CDH 5.10

Hi,

 

I had a DistCp job that was working between the active farm and DR. Recently I upgraded the DR farm to CDH 5.10.0 while the active farm is still on 5.5.4. When I run:

 

hadoop distcp -Dmapreduce.job.name=reporting -update -p -m 80 -strategy dynamic -diff s0 s1 hdfs://${SRC_SITE}/liveperson/data/server_live-engage-mr /liveperson/data/remote/DC=${DC}/server_live-engage-mr/

 

I got this error:

 

17/03/26 11:03:31 ERROR tools.DistCp: Exception encountered
java.lang.Exception: DistCp sync failed, input options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=s0, toSnapshot=s1, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=80, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='dynamic', preserveStatus=[REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, CHECKSUMTYPE, TIMES], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://AlphaProd/liveperson/data/server_live-engage-mr/.snapshot/s1], targetPath=/liveperson/data/remote/DC=Alpha/server_live-engage-mr, targetPathExists=true, filtersFile='null'}
at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:84)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:179)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:141)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)

 

 

I changed hdfs:// to webhdfs://, since the two clusters now run different versions:

 

hadoop distcp -Dmapreduce.job.name=reporting -update -p -m 80 -strategy dynamic -diff s0 s1 webhdfs://${SRC_SITE}/liveperson/data/server_live-engage-mr /liveperson/data/remote/DC=${DC}/server_live-engage-mr/

 

and I'm getting the following error:

 

17/03/26 11:05:42 INFO client.RMProxy: Connecting to ResourceManager at aoor-mhc101.lpdomain.com/10.26.180.76:8032
17/03/26 11:05:42 ERROR tools.DistCp: Exception encountered
java.lang.IllegalArgumentException: The FileSystems needs to be DistributedFileSystem for using snapshot-diff-based distcp
at org.apache.hadoop.tools.DistCpSync.preSyncCheck(DistCpSync.java:98)
at org.apache.hadoop.tools.DistCpSync.sync(DistCpSync.java:147)
at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:81)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:179)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:141)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)



Re: Distcp copy start to fail after the upgrade to CDH5.10

Now I'm getting a different error:

 

17/03/29 11:40:08 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=s0, toSnapshot=s1, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=100, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='dynamic', preserveStatus=[REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, CHECKSUMTYPE, TIMES], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://AlphaProd/liveperson/data/server_live-engage-mr/output], targetPath=hdfs://AlphaDR/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output, targetPathExists=true, filtersFile='null'}
17/03/29 11:40:09 WARN tools.DistCp: The target has been modified since snapshot s0
17/03/29 11:40:09 ERROR tools.DistCp: Exception encountered
java.lang.Exception: DistCp sync failed, input options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=s0, toSnapshot=s1, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=100, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='dynamic', preserveStatus=[REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, CHECKSUMTYPE, TIMES], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://AlphaProd/liveperson/data/server_live-engage-mr/output/.snapshot/s1], targetPath=hdfs://AlphaDR/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output, targetPathExists=true, filtersFile='null'}
at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:84)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:179)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:141)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)

 

 


Re: Distcp copy start to fail after the upgrade to CDH5.10

Rolling back the CDH version fixes the issue, so I'm quite confused, especially since I'm unable to find any documentation of the latest DistCp improvements in CDH 5.10.


Re: Distcp copy start to fail after the upgrade to CDH5.10

With DEBUG-level logging:

 

17/03/30 08:30:19 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
17/03/30 08:30:19 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
17/03/30 08:30:19 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
17/03/30 08:30:19 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
17/03/30 08:30:19 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
17/03/30 08:30:19 DEBUG security.Groups: Creating new Groups object
17/03/30 08:30:19 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000; warningDeltaMs=5000
17/03/30 08:30:19 DEBUG security.UserGroupInformation: hadoop login
17/03/30 08:30:19 DEBUG security.UserGroupInformation: hadoop login commit
17/03/30 08:30:19 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: cloudera-scm
17/03/30 08:30:19 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: cloudera-scm" with name cloudera-scm
17/03/30 08:30:19 DEBUG security.UserGroupInformation: User entry: "cloudera-scm"
17/03/30 08:30:19 DEBUG security.UserGroupInformation: Assuming keytab is managed externally since logged in from subject.
17/03/30 08:30:19 DEBUG security.UserGroupInformation: UGI loginUser:cloudera-scm (auth:SIMPLE)
17/03/30 08:30:19 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
17/03/30 08:30:19 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
17/03/30 08:30:19 DEBUG azure.NativeAzureFileSystem: finalize() called.
17/03/30 08:30:19 DEBUG azure.NativeAzureFileSystem: finalize() called.
17/03/30 08:30:19 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:19 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:19 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:19 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:20 DEBUG hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://AlphaDR/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output
17/03/30 08:30:20 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:20 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:20 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:20 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:20 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
17/03/30 08:30:20 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@550049b6
17/03/30 08:30:20 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@44ff60de
17/03/30 08:30:20 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
17/03/30 08:30:20 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
17/03/30 08:30:20 DEBUG unix.DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@258cde9c: starting with interruptCheckPeriodMs = 60000
17/03/30 08:30:20 DEBUG util.PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
17/03/30 08:30:20 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
17/03/30 08:30:20 DEBUG ipc.Client: The ping interval is 60000 ms.
17/03/30 08:30:20 DEBUG ipc.Client: Connecting to aoor-mhc102.lpdomain.com/10.26.180.77:8020
17/03/30 08:30:20 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm: starting, having connections 1
17/03/30 08:30:20 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm sending #0
17/03/30 08:30:20 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm got value #0
17/03/30 08:30:20 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 55ms
17/03/30 08:30:20 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=s0, toSnapshot=s1, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=100, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='dynamic', preserveStatus=[REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, CHECKSUMTYPE, TIMES], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://AlphaProd/liveperson/data/server_live-engage-mr/output], targetPath=hdfs://AlphaDR/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output, targetPathExists=true, filtersFile='null'}
17/03/30 08:30:20 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.YarnClientProtocolProvider
17/03/30 08:30:20 DEBUG service.AbstractService: Service: org.apache.hadoop.mapred.ResourceMgrDelegate entered state INITED
17/03/30 08:30:20 DEBUG service.AbstractService: Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state INITED
17/03/30 08:30:20 DEBUG security.UserGroupInformation: PrivilegedAction as:cloudera-scm (auth:SIMPLE) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
17/03/30 08:30:20 DEBUG ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
17/03/30 08:30:20 DEBUG ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol
17/03/30 08:30:20 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@44ff60de
17/03/30 08:30:20 DEBUG service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
17/03/30 08:30:20 DEBUG service.AbstractService: Service org.apache.hadoop.mapred.ResourceMgrDelegate is started
17/03/30 08:30:21 DEBUG security.UserGroupInformation: PrivilegedAction as:cloudera-scm (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:335)
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:21 DEBUG hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://AlphaDR
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:21 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
17/03/30 08:30:21 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@44ff60de
17/03/30 08:30:21 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
17/03/30 08:30:21 DEBUG crypto.OpensslAesCtrCryptoCodec: Using org.apache.hadoop.crypto.random.OsSecureRandom as random number generator.
17/03/30 08:30:21 DEBUG util.PerformanceAdvisory: Using crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.
17/03/30 08:30:21 DEBUG mapreduce.Cluster: Picked org.apache.hadoop.mapred.YarnClientProtocolProvider as the ClientProtocolProvider
17/03/30 08:30:21 DEBUG mapred.ResourceMgrDelegate: getStagingAreaDir: dir=/user/cloudera-scm/.staging
17/03/30 08:30:21 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm sending #1
17/03/30 08:30:21 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm got value #1
17/03/30 08:30:21 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 2ms
17/03/30 08:30:21 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm sending #2
17/03/30 08:30:21 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm got value #2
17/03/30 08:30:21 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 1ms
17/03/30 08:30:21 DEBUG tools.DistCp: Meta folder location: /user/cloudera-scm/.staging/_distcp425151441
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:21 DEBUG hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://AlphaProd/liveperson/data/server_live-engage-mr/output
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/03/30 08:30:21 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /hadoop/sockets/hdfs-sockets/dn
17/03/30 08:30:21 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
17/03/30 08:30:21 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@44ff60de
17/03/30 08:30:21 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
17/03/30 08:30:21 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm sending #3
17/03/30 08:30:22 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm got value #3
17/03/30 08:30:22 DEBUG ipc.ProtobufRpcEngine: Call: getSnapshotDiffReport took 1222ms
17/03/30 08:30:22 WARN tools.DistCp: The target has been modified since snapshot s0
17/03/30 08:30:22 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm sending #4
17/03/30 08:30:22 DEBUG ipc.Client: IPC Client (1983511229) connection to aoor-mhc102.lpdomain.com/10.26.180.77:8020 from cloudera-scm got value #4
17/03/30 08:30:22 DEBUG ipc.ProtobufRpcEngine: Call: delete took 1ms
17/03/30 08:30:22 ERROR tools.DistCp: Exception encountered
java.lang.Exception: DistCp sync failed, input options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=s0, toSnapshot=s1, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=100, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='dynamic', preserveStatus=[REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, CHECKSUMTYPE, TIMES], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://AlphaProd/liveperson/data/server_live-engage-mr/output/.snapshot/s1], targetPath=hdfs://AlphaDR/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output, targetPathExists=true, filtersFile='null'}
at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:84)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:179)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:141)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
17/03/30 08:30:22 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@44ff60de
17/03/30 08:30:22 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@44ff60de
+ '[' 25 -eq 0 ']'


Re: Distcp copy start to fail after the upgrade to CDH5.10

Please help.

Posts: 1,491
Kudos: 246
Solutions: 226
Registered: ‎07-31-2013

Re: Distcp copy start to fail after the upgrade to CDH5.10

With the same major version, you can continue to use hdfs://. The DistCp matrix has a supported protocol entry about this: https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_admin_distcp_data_cluster_migrat...

The use of the 'diff' option with webhdfs:// is not supported. It only works with hdfs://, which would explain your error: https://github.com/cloudera/hadoop-common/blob/cdh5.5.4-release/hadoop-tools/hadoop-distcp/src/main/... or https://github.com/cloudera/hadoop-common/blob/cdh5.10.0-release/hadoop-tools/hadoop-distcp/src/main...
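The snapshot-diff workflow this answer refers to can be sketched as below, with hdfs:// on both endpoints as required. All paths, nameservice names, and the `run` wrapper are placeholders for illustration, not taken from this thread; the wrapper prints each cluster command instead of executing it.

```shell
# Dry-run wrapper: prints each cluster command instead of executing it.
# Remove the "run" prefix (or redefine run() to exec "$@") on real clusters.
run() { echo "+ $*"; }

SRC=hdfs://activeNN/data/src   # placeholder source path
DST=hdfs://drNN/data/dst       # placeholder DR target path

# One-time setup: enable snapshots and take matching s0 baselines.
run hdfs dfsadmin -allowSnapshot "$SRC"
run hdfs dfsadmin -allowSnapshot "$DST"
run hdfs dfs -createSnapshot "$SRC" s0
run hdfs dfs -createSnapshot "$DST" s0

# Each sync cycle: snapshot the source, then copy only the s0->s1 delta.
# -diff requires hdfs:// (DistributedFileSystem) on BOTH endpoints.
run hdfs dfs -createSnapshot "$SRC" s1
run hadoop distcp -update -diff s0 s1 "$SRC" "$DST"

# After a clean sync, snapshot the target so the next cycle can
# diff from s1 instead of s0.
run hdfs dfs -createSnapshot "$DST" s1
```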

Does this help?
Backline Customer Operations Engineer

Re: Distcp copy start to fail after the upgrade to CDH5.10


Hi Harsh,

 

But as you can see from my last comments, I am using only hdfs:// and still get the same error, and when I roll CDH back to 5.9.0 it starts working again.

 

The error occurs when "The target has been modified since snapshot s0" is reported, which means that once a DistCp snapshot run fails partway, subsequent runs stop working.
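One way to confirm this state is to ask the target cluster what changed since its s0 snapshot; a non-empty report means the snapshot-diff copy will be rejected. A minimal sketch — the path is taken from the log above, and the `run` wrapper is a placeholder that prints the command rather than executing it:

```shell
# Dry-run wrapper; remove the "run" prefix on a real cluster.
run() { echo "+ $*"; }

# "." denotes the directory's current state, so this lists every change
# made on the target since its s0 snapshot was taken.
DST=/liveperson/data/remote/DC=Alpha/server_live-engage-mr/output
run hdfs snapshotDiff "$DST" s0 .
```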

 

Can you also help me find documentation on how to use the snapshot restore? My main reason for upgrading was to use this utility in my DistCp.


Re: Distcp copy start to fail after the upgrade to CDH5.10

@Harsh J hope you can help me with this


Re: Distcp copy start to fail after the upgrade to CDH5.10

@Harsh J I just need a little help with the command, as it is not mentioned in the documentation or in the command-line help.

 

Suppose I want to back up source_folder on the active farm to destination_folder on the disaster-recovery farm, and I want to run -rdiff at destination_folder.

source_folder has snapshots s0 and s1, and the destination has snapshot s0, which has already been modified by the partially completed DistCp run that failed.

 

So in the current state, destination_folder has some files that are not in the destination's snapshot s0, and I want to revert it to s0, so I created s1 at the destination. What should the revert command look like:

 

hadoop distcp -rdiff s1 s0 destination_folder source_folder, or hadoop distcp -rdiff s1 s0 source_folder destination_folder?

 

I assume that s1 and s0 refer to the snapshots of the destination folder, is that right?

 


Re: Distcp copy start to fail after the upgrade to CDH5.10

I got it.

 

The right order is hadoop distcp -rdiff s1 s0 source destination
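The full recovery sequence that follows from this can be sketched as below. Paths are placeholders, the `run` wrapper only prints the commands, and the snapshots named in -rdiff are the target's, listed newer-first:

```shell
run() { echo "+ $*"; }   # dry-run wrapper; remove the prefix for real use

SRC=hdfs://activeNN/data/source_folder    # placeholder
DST=hdfs://drNN/data/destination_folder   # placeholder

# 1. Snapshot the modified target so -rdiff can compare it against s0.
run hdfs dfs -createSnapshot "$DST" s1
# 2. Revert the target from its s1 state back to s0. Both snapshot
#    names refer to the TARGET, newer snapshot first.
run hadoop distcp -update -rdiff s1 s0 "$SRC" "$DST"
# 3. With the target back at s0, the normal forward -diff sync can resume.
run hadoop distcp -update -diff s0 s1 "$SRC" "$DST"
```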
