Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Super Collaborator

@Yongjun Zhang Could you please reply to my last comment about using -diff only, with a full listing as the fallback in case of failure?

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee
Hi Fawze,

You can either choose not to use -diff at all, or run with -diff and, if it
fails, fall back to another distcp command without -diff that does a regular
distcp (with -update -delete).

Would you please also answer my questions in my last comment?

Thanks.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee

Thanks for the clarification, and sorry I missed this reply earlier. Good to know that it is not caused by distcp. So there is no snapshot operation failure message even when the operation fails?

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Super Collaborator

@Yongjun Zhang Your last suggestion, to issue a regular distcp after the diff failure, makes our life more complex, since I would need to delete 4 snapshots, create a new s0 snapshot, issue distcp, and then create s0 at the destination.

I am still wondering why the full listing in case of failure was disabled in the new version.
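
The reset steps listed in this post can be sketched as an ordered command plan. This is only an illustration under assumptions: the function name `rebaseline_commands`, the example paths, and the list of stale snapshots are hypothetical, and copying from the `.snapshot/s0` path (rather than from the live directory) is an assumption about how the full copy would be issued, not something stated in the thread.

```python
from typing import List, Sequence, Tuple


def rebaseline_commands(
    src: str,
    dst: str,
    stale_snapshots: Sequence[Tuple[str, str]],
    base: str = "s0",
) -> List[List[str]]:
    """Return, in order, the commands that re-establish a common baseline.

    stale_snapshots is a list of (snapshottable_path, snapshot_name) pairs
    left over from the failed cycle. Nothing is executed here; the caller
    decides how to run the commands and how to handle failures.
    """
    commands: List[List[str]] = []

    # 1. Delete the snapshots left over from the failed cycle.
    for path, name in stale_snapshots:
        commands.append(["hdfs", "dfs", "-deleteSnapshot", path, name])

    # 2. Create a fresh baseline snapshot on the source.
    commands.append(["hdfs", "dfs", "-createSnapshot", src, base])

    # 3. Full copy of the baseline. Assumption: read from the snapshot
    #    path so the copied state matches the snapshot exactly.
    snapshot_path = f"{src.rstrip('/')}/.snapshot/{base}"
    commands.append(["hadoop", "distcp", "-update", "-delete", snapshot_path, dst])

    # 4. Create the matching baseline snapshot on the destination.
    commands.append(["hdfs", "dfs", "-createSnapshot", dst, base])
    return commands


if __name__ == "__main__":
    # Hypothetical paths and snapshot names, for illustration only.
    plan = rebaseline_commands(
        "/data/src",
        "/data/dst",
        [("/data/src", "s0"), ("/data/src", "s1")],
    )
    for cmd in plan:
        print(" ".join(cmd))
```

Returning the plan as a list keeps the ordering visible and lets a cron wrapper stop at the first command that fails.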

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee
Because -diff requires that -delete not be passed, whereas the user may or
may not want -delete when running without -diff. If you run with -diff, even
if we could fall back to a regular distcp, the software doesn't know whether
the user wants -delete and cannot make that decision for the user. One
possibility is to add a new switch to enable that.

BTW, do you see an error message when snapshot operations fail?

Thanks.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Super Collaborator

In both cases, whether or not the user passes -delete in the regular distcp after the fallback, the -diff in the next run will correct the situation.

Yes, in case of a snapshot error we get a network error message, such as a connection timeout between node xxx and namenodexx:8020. Handling different errors for each snapshot in one cron job adds more complexity to the snapshot cycle management.

More importantly, changes like this that are not backward compatible should be communicated or mentioned in the release notes or in the rdiff documentation. Imagine that I want to upgrade my cluster: after the upgrade, either I roll back because of -rdiff, or I need to find a solution and implement it in time.

I think there should be another switch in the code that gives the user more options.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee
Hi Fawze,

"In both cases, whether or not the user passes -delete in the regular distcp
after the fallback, the -diff in the next run will correct the situation."
I am not sure I follow the above statement.

The fallback feature was disabled in HDFS-10313. I created the following
JIRA to re-enable it:

https://issues.apache.org/jira/browse/HDFS-11706

Until that JIRA is implemented, if you can make your script detect the
failure, you can have it re-issue a regular distcp command as a manual
fallback.

Thanks.

--Yongjun
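
A minimal sketch of the manual fallback suggested in this reply, assuming the script treats a non-zero exit code from the -diff run as the failure signal. The function names, the `delete_on_fallback` parameter, and the injectable `run` callable are all hypothetical; only the `hadoop distcp` options themselves (-update, -diff, -delete) come from the thread. The choice of -delete is made explicit by the caller because, as noted above, the tool cannot make that decision on the user's behalf.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(list(argv)).returncode


def diff_distcp_cmd(src: str, dst: str, from_snap: str, to_snap: str) -> List[str]:
    # Snapshot-diff copy: -diff is combined with -update and never with -delete.
    return ["hadoop", "distcp", "-update", "-diff", from_snap, to_snap, src, dst]


def regular_distcp_cmd(src: str, dst: str, delete: bool) -> List[str]:
    # Regular copy with a full listing; -delete is the caller's decision.
    cmd = ["hadoop", "distcp", "-update"]
    if delete:
        cmd.append("-delete")
    return cmd + [src, dst]


def sync_with_fallback(
    src: str,
    dst: str,
    from_snap: str,
    to_snap: str,
    delete_on_fallback: bool,
    run: Runner = run_command,
) -> str:
    """Try the -diff copy first; on failure, re-issue a regular distcp.

    Returns "diff" or "fallback" depending on which command succeeded,
    and raises RuntimeError if both fail.
    """
    if run(diff_distcp_cmd(src, dst, from_snap, to_snap)) == 0:
        return "diff"
    if run(regular_distcp_cmd(src, dst, delete_on_fallback)) == 0:
        return "fallback"
    raise RuntimeError("both the -diff run and the regular distcp failed")
```

A cron wrapper could call `sync_with_fallback("/data/src", "/data/dst", "s1", "s2", delete_on_fallback=True)` (paths and snapshot names are placeholders) and branch on the return value. After a "fallback" result, the snapshots on both sides would still need to be re-established before the next -diff run, as discussed earlier in the thread.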

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Super Collaborator

@Yongjun Zhang Hi Yongjun, do you think this task can be done in one of the upcoming CDH releases?

https://issues.apache.org/jira/browse/HDFS-11706
