
backup data from a cluster


Contributor

Hi everybody

I'm currently working on a Cloudera test cluster. I want to reinstall the platform and I need to save the data that is currently on it.

My question is: how can I do this backup?

Thanks for your responses.

10 REPLIES

Re: backup data from a cluster

Champion

@securehadoop

 

There are multiple options; choose whichever suits you (sometimes a combination of more than one).

 

a. If you have two clusters (I'm not sure you do), move/copy the data between them, or

b. Just copy from your cluster to your local filesystem.

 

1. distcp: copies data between two clusters. (Suitable for option a; I have never tried it for option b.)

 

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_admin_distcp_data_cluster_migrate...

 

2. copyToLocal: suitable for option b.

 

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html

 

3. Export Hive table: suitable for option b. You can export your Hive table to an HDFS path and then apply copyToLocal. Note: I have only tried this with non-partitioned tables; for partitioned tables, please test it by exporting the data and importing the same data back.

 

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
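As a rough illustration of options 1 and 3 above, here is a minimal sketch. The function names, hostnames, port, table name, and paths are all placeholders I made up, not values from this thread. It also assumes the `hive` command-line client is available; on some clusters you would use beeline instead.

```shell
# Sketch only: option 1, copy one HDFS directory tree to a second cluster.
# Both arguments are full HDFS URIs (placeholders in the usage note below).
copy_between_clusters() {
  src_uri=$1   # e.g. hdfs://source-nn.example.com:8020/user/data (hypothetical)
  dst_uri=$2   # e.g. hdfs://backup-nn.example.com:8020/user/data (hypothetical)
  hadoop distcp "$src_uri" "$dst_uri"
}

# Sketch only: option 3, export a Hive table to an HDFS staging path,
# then copy that export to the local filesystem.
export_hive_table() {
  table=$1        # e.g. sales (hypothetical table name)
  hdfs_stage=$2   # e.g. /tmp/hive_export/sales (hypothetical HDFS path)
  local_dst=$3    # e.g. /backup/hive (hypothetical local path)
  hive -e "EXPORT TABLE ${table} TO '${hdfs_stage}';" &&
    hadoop fs -copyToLocal "$hdfs_stage" "$local_dst"
}
```

You would call these as, for example, `copy_between_clusters hdfs://source-nn.example.com:8020/user/data hdfs://backup-nn.example.com:8020/user/data` or `export_hive_table sales /tmp/hive_export/sales /backup/hive`. To test a restore, the exported directory can be put back into HDFS and loaded with Hive's `IMPORT TABLE ... FROM '<path>'` statement.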

 

 

 


Re: backup data from a cluster

Contributor

Thanks for your response.

You are right, I don't have a cluster for the backup; I just want to back up to local storage.


Re: backup data from a cluster

Contributor
 hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

This command can do the job, but I have a question: where must I execute it to be sure to copy all the data and not just part of it? On a datanode or on the namenode?


Re: backup data from a cluster

Champion

@securehadoop

 

I don't think there is any single command that copies all the cluster data to local. You can copy the parent directory that belongs to each service and zip it locally (if you have enough local space).

 

It does not have to be a datanode or the namenode: the data is copied to the local filesystem of whichever machine you run the command on.
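The "copy each parent directory and zip it locally" idea could be sketched roughly as below. This is only an illustration: the function name and the directory names in the usage note are my own placeholders, and it assumes the machine has a configured Hadoop client and enough free disk space.

```shell
# Sketch only: copy a list of HDFS directories to local disk and archive each one.
# First argument is the local target directory, the rest are HDFS directories.
backup_dirs_to_local() {
  local_root=$1; shift
  mkdir -p "$local_root" || return 1
  for hdfs_dir in "$@"; do
    name=$(basename "$hdfs_dir")
    # Pull the directory out of HDFS onto the local filesystem.
    hadoop fs -copyToLocal "$hdfs_dir" "$local_root/$name" || return 1
    # Compress it, and remove the uncompressed copy only if tar succeeded.
    tar -czf "$local_root/$name.tar.gz" -C "$local_root" "$name" &&
      rm -rf "$local_root/$name" || return 1
  done
}
```

For example, `backup_dirs_to_local /backup/hdfs /user /hbase` would leave `user.tar.gz` and `hbase.tar.gz` under `/backup/hdfs`. Note that each directory temporarily needs space for both the copy and its archive.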


Re: backup data from a cluster

Super Collaborator

@securehadoop

What is the size of your HDFS?
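If it helps, a quick way to answer that is to compare HDFS usage with the free space on the local target. A minimal sketch, where the function name and both paths are placeholders of mine:

```shell
# Sketch only: show how much data an HDFS path holds and how much
# free space the local backup directory has.
hdfs_vs_local_space() {
  hdfs_path=$1    # e.g. / for the whole filesystem
  local_dir=$2    # local directory that would receive the backup
  hadoop fs -du -s -h "$hdfs_path"
  df -h "$local_dir"
}
```

Called as `hdfs_vs_local_space / /backup`, it prints the HDFS summary first and the local disk usage second.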


Re: backup data from a cluster

Contributor

Hi, and thanks for your response.

I'm installing another cluster for the backup, but I have some issues.


Re: backup data from a cluster

Champion

Could you let us know what the issue is and give us some logs?


Re: backup data from a cluster

Contributor

Re: backup data from a cluster

Champion

I have replied to it.
