Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee

I don't think 5.7 has that.

Interrupting a running distcp should be rare. Even when it happens, you can still use a regular distcp to copy the data safely; it may just take much longer to finish than the snapshot-based approach. If you can accept that, it is still a good option.
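
To make the two options concrete, here is a minimal Python sketch that builds both command lines: the snapshot-based incremental copy (`-update -diff`) and the plain `-update` fallback. The function name, paths, and snapshot names are hypothetical; please check the flags against the DistCp documentation for your CDH version.

```python
def build_distcp_update(src, dst, from_snapshot=None, to_snapshot=None):
    """Return the argv list for an incremental `hadoop distcp` run.

    With both snapshot names, the copy list comes from the snapshot diff
    (-diff is only valid together with -update). With neither, this is a
    plain -update run, which compares source and target itself and can
    take much longer on large trees.
    """
    if (from_snapshot is None) != (to_snapshot is None):
        raise ValueError("give both snapshot names or neither")
    cmd = ["hadoop", "distcp", "-update"]
    if from_snapshot is not None:
        cmd += ["-diff", from_snapshot, to_snapshot]
    return cmd + [src, dst]


if __name__ == "__main__":
    # Hypothetical paths and snapshot names, for illustration only.
    print(" ".join(build_distcp_update("/data/src", "/data/dst", "s1", "s2")))
    print(" ".join(build_distcp_update("/data/src", "/data/dst")))
```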

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Explorer

Hello Yufei,

Do we have an ETA for when snapshot rollback will be available? How would it solve our issue?

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee

No earlier than CDH-5.9, as far as I know. Once it is available, it can solve the issue.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Explorer

Hello,

It turns out that CDH 5.9 does not support distcp snapshot rollback.

Could you tell us which CDH version is going to support it?

Thank you

Vladi

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee

HDFS-9820 is in CDH-5.9.1 and later versions. 
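
For context, HDFS-9820 adds a `-rdiff` option to DistCp, which applies a snapshot diff in reverse so the target can be brought back to an earlier snapshot. The Python sketch below only assembles such a command line; the function name, paths, and snapshot names are hypothetical, and the exact preconditions (for example, which snapshots must already exist on the target) are an assumption to verify against the DistCp documentation for your release.

```python
def build_distcp_rollback(src, dst, newer_snapshot, older_snapshot):
    """Return the argv list for a reverse snapshot-diff DistCp run.

    Assumed argument order: -rdiff <newerSnapshot> <olderSnapshot>,
    used together with -update, followed by source and target paths.
    """
    if newer_snapshot == older_snapshot:
        raise ValueError("the two snapshot names must differ")
    return [
        "hadoop", "distcp", "-update",
        "-rdiff", newer_snapshot, older_snapshot,
        src, dst,
    ]


if __name__ == "__main__":
    # Hypothetical example: bring /data/dst back to the state of snapshot s1.
    print(" ".join(build_distcp_rollback("/data/src", "/data/dst", "s2", "s1")))
```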

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Explorer

Hi,

Do you think using -atomic and saving the temp files outside the snapshottable directory could help in this case?

I hope the snapshot rollback feature will be available in 5.9.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Explorer

Do you think using -atomic will solve the issue?

I'm still not sure why it lists all the source files. My guess is that the destination folder has changed because some mappers had already finished and committed, so using -atomic and keeping the temp folder outside the destination folder should solve the issue.
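
As a rough illustration of this idea, the Python sketch below builds a `-atomic` command line with the work directory passed via `-tmp`. To my understanding, DistCp refuses `-atomic` together with `-update` or `-overwrite`, and `-diff` needs `-update`, so the sketch treats atomic commit and snapshot-diff copies as mutually exclusive; that is an assumption to confirm on your version. The function name and paths are hypothetical.

```python
def build_distcp_atomic(src, dst, tmp_dir, update=False):
    """Return the argv list for an atomic-commit `hadoop distcp` run."""
    if update:
        # Assumed DistCp rule: atomic commit cannot be combined with -update,
        # which also rules out -diff because -diff requires -update.
        raise ValueError("-atomic cannot be combined with -update")
    dst_prefix = dst.rstrip("/") + "/"
    if tmp_dir.rstrip("/") + "/" == dst_prefix or tmp_dir.startswith(dst_prefix):
        raise ValueError("keep the temp folder outside the destination folder")
    return ["hadoop", "distcp", "-atomic", "-tmp", tmp_dir, src, dst]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_distcp_atomic("/data/src", "/data/dst", "/tmp/distcp-work")))
```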

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Explorer
Thank you! Great news.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Cloudera Employee

No. It will also be in CDH-5.5.6, CDH-5.7.6 and CDH-5.8.4.

Re: Killing the Distcp which running over snapshot listing all snapshottable path in the next run

Super Collaborator

Hi Yufei,

Is there any documentation on how to implement snapshot restore with distcp?