12-27-2017 08:04 AM
I am working on a backup task in DEV, backing up the DataNode data to local disk. I turned safemode OFF and started the backup. Partway through, I noticed 1 under-replicated block and 1 missing block, so I am planning to fix them. My question is: can we do the fix in parallel with the backup?
12-27-2017 09:16 AM
In my opinion these are two different tasks, and it is up to you whether to do them in parallel.
Also, copyToLocal should work even with an under-replicated block, since only one replica out of three (if you follow the default replication factor of 3) is copied to local.
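For the under-replicated and missing blocks mentioned above, here is a rough sketch of how one might inspect them and nudge replication back up. The path /data/dev and the target factor 3 are placeholders, not values from this thread, and the `run` helper is just a hypothetical wrapper that prints the commands unless DRY_RUN=0 is set. Keep in mind that a file whose block has no surviving replica cannot be read completely, so copyToLocal is likely to fail for that particular file until the block is recovered.

```shell
# Sketch only: inspect block health, then raise replication on one path.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# List files that currently have corrupt or missing blocks.
list_bad_blocks() {
  run hdfs fsck / -list-corruptfileblocks
}

# Show per-file block and replica locations for a path ($1 is a placeholder path).
show_block_report() {
  run hdfs fsck "$1" -files -blocks -locations
}

# Set the replication factor ($2) on a path ($1) and wait until it is reached.
fix_replication() {
  run hdfs dfs -setrep -w "$2" "$1"
}

list_bad_blocks
show_block_report /data/dev
fix_replication /data/dev 3
```

The `-w` flag waits for replication to finish, which can take a while on large directories, so it may be worth dropping it if the backup is running at the same time.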
Also, not sure if I am missing something here, but I am wondering why you are using copyToLocal for backup instead of the Cloudera Manager -> Backup (menu) -> Replication schedule -> Create schedule option (prerequisite: set up the peer). It may reduce your work.
12-27-2017 10:00 PM - edited 12-27-2017 11:16 PM
Currently we don't have Cloudera Manager, so is there any alternative method other than copyToLocal? Also, is it mandatory to turn safemode ON before doing the copyToLocal?
12-28-2017 08:08 AM
I don't think it is mandatory to enable safemode during copyToLocal. Maybe you can use safemode to make sure nobody is updating, deleting, or inserting data while the copy is running.
I know the difficulties of working without Cloudera Manager, Hortonworks, etc.
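If you do want to freeze writes during the copy, something along these lines might work. It is a minimal sketch: /data/dev and /backup/dev are made-up paths, `run` is a hypothetical dry-run wrapper, and entering safemode normally needs HDFS superuser rights. Reads such as copyToLocal still work while the NameNode is in safemode; only changes to the namespace are blocked.

```shell
# Sketch only: wrap a copyToLocal in safemode so nothing changes mid-copy.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = HDFS source path, $2 = local destination (both placeholders).
backup_with_safemode() {
  src="$1"
  dest="$2"
  run hdfs dfsadmin -safemode get      # report the current state
  run hdfs dfsadmin -safemode enter    # make HDFS read-only
  run hdfs dfs -copyToLocal "$src" "$dest"
  status=$?                            # remember whether the copy succeeded
  run hdfs dfsadmin -safemode leave    # always turn safemode back off
  return "$status"
}

backup_with_safemode /data/dev /backup/dev
```

The copy status is saved before leaving safemode so the cluster does not stay read-only if the copy fails.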
A long time ago, I used the below export/import method for Hive table backup. Again, this exports the data to HDFS and you still have to use copyToLocal. The advantage is that it also takes care of the metadata.
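As a rough illustration of that export/import flow, assuming it refers to Hive's EXPORT TABLE and IMPORT TABLE statements: the table name orders and the directories /tmp/hive_export/orders and /backup/orders are invented for the example, and `run` is a hypothetical dry-run wrapper.

```shell
# Sketch only: export a Hive table (data plus metadata) and copy it to local disk.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = table name, $2 = HDFS export directory, $3 = local backup directory.
export_hive_table() {
  run hive -e "EXPORT TABLE $1 TO '$2'"   # writes data and metadata to HDFS
  run hdfs dfs -copyToLocal "$2" "$3"     # then pull the export to local disk
}

# Reverse direction: push the local copy back to HDFS and import it.
import_hive_table() {
  run hdfs dfs -copyFromLocal "$3" "$2"
  run hive -e "IMPORT TABLE $1 FROM '$2'"
}

export_hive_table orders /tmp/hive_export/orders /backup/orders
```

To restore, `import_hive_table orders /tmp/hive_export/orders /backup/orders` would run the reverse steps, assuming the HDFS export directory does not already exist.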
You can use these options as a temporary solution, but once you start using Cloudera Manager or any other management tool, I would recommend using the backup option that I mentioned earlier.