
How to create & recover a snapshot


Hi guys,
I have been using an AWS machine for practice. My question is:
How do I create and recover a snapshot?
Please explain.
Thanks,
Mudassar Hussain

1 ACCEPTED SOLUTION

Master Mentor

@Mudassar Hussain

Steps 1, 2 and 3 are okay. You don't need to create directory2: when you enable a directory as snapshottable (in this case /user/mudassar/snapdemo), the snapshots are created under that directory in a hidden .snapshot subdirectory, which is why it doesn't show up when you run the hdfs dfs -ls command.

Let me demo this on HDP 2.6.

I will create your user on my local environment.

Create the user mudassar, as the root user:

# adduser mudassar 

Switch to hdfs, the HDFS superuser and owner:

# su - hdfs 

Create the snapshot demo directory. Notice the -p option, because the parent directory /user/mudassar doesn't exist yet. Note: hadoop dfs is deprecated, so use the hdfs dfs command!

$ hdfs dfs -mkdir -p /user/mudassar/snapdemo 

Validate directory

$ hdfs dfs -ls /user/mudassar 
Found 1 items 
drwxr-xr-x - hdfs hdfs 0 2018-03-15 09:39 /user/mudassar/snapdemo 

Change ownership to mudassar

$ hdfs dfs -chown mudassar /user/mudassar/snapdemo 

Validate change of ownership

$ hdfs dfs -ls /user/mudassar 
Found 1 items 
drwxr-xr-x - mudassar hdfs 0 2018-03-15 09:39 /user/mudassar/snapdemo 

Make the directory snapshottable

$ hdfs dfsadmin -allowSnapshot /user/mudassar/snapdemo 
Allowing snaphot on /user/mudassar/snapdemo succeeded

Show all the snapshottable directories in your cluster; lsSnapshottableDir is a subcommand of hdfs. In the output below, the trailing 0 and 65536 are the current snapshot count and the snapshot quota:

$ hdfs lsSnapshottableDir 
drwxr-xr-x 0 mudassar hdfs 0 2018-03-15 09:39 0 65536 /user/mudassar/snapdemo 

Create 2 sample files in /tmp

$ echo "Test one for snaphot No worries No worries I was worried you got stuck and didn't revert the HCC is full of solutions so" > /tmp/text1.txt 
$ echo "The default behavior is that only a superuser is allowed to access all the resources of the Kafka cluster, and no other user can access those resources" > /tmp/text2.txt 

Validate the files were created

$ cd /tmp 
$ ls -lrt 
-rw-r--r-- 1 hdfs hadoop 121 Mar 15 10:04 text1.txt 
-rw-r--r-- 1 hdfs hadoop 152 Mar 15 10:04 text2.txt 

Copy the first file, text1.txt, from local /tmp to HDFS:

$ hdfs dfs -put text1.txt /user/mudassar/snapdemo 

Create a snapshot of the directory, which at this point contains only text1.txt:

$ hdfs dfs -createSnapshot /user/mudassar/snapdemo 
Created snapshot /user/mudassar/snapdemo/.snapshot/s20180315-101148.262 

Note the .snapshot directory above, which is a hidden system directory.
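
As a side note, -createSnapshot also accepts an optional snapshot name, which is easier to remember than the generated timestamp. A minimal sketch, using a hypothetical name demo-snap:

$ hdfs dfs -createSnapshot /user/mudassar/snapdemo demo-snap 

That would create the snapshot under /user/mudassar/snapdemo/.snapshot/demo-snap instead of a timestamped name.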

Show the snapshot of text1.txt

$ hdfs dfs -ls /user/mudassar/snapdemo/.snapshot/s20180315-101148.262 
Found 1 items 
-rw-r--r-- 3 hdfs hdfs 121 2018-03-15 10:10 /user/mudassar/snapdemo/.snapshot/s20180315-101148.262/text1.txt 
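
If you don't remember the generated snapshot name, you can list the hidden .snapshot directory itself; every snapshot taken on the directory shows up there:

$ hdfs dfs -ls /user/mudassar/snapdemo/.snapshot 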

Copy the second file, text2.txt, from local /tmp to HDFS:

$ hdfs dfs -put text2.txt /user/mudassar/snapdemo 

Validate that the 2 files are now present:

$ hdfs dfs -ls /user/mudassar/snapdemo 
Found 2 items 
-rw-r--r-- 3 hdfs hdfs 121 2018-03-15 10:10 /user/mudassar/snapdemo/text1.txt 
-rw-r--r-- 3 hdfs hdfs 152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt 

Demo: simulate the loss of the file text1.txt:

$ hdfs dfs -rm /user/mudassar/snapdemo/text1.txt 
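
A hedged aside: on clusters where trash is enabled (fs.trash.interval > 0), a plain -rm only moves the file to the user's .Trash directory. To simulate a truly unrecoverable delete, you can bypass the trash:

$ hdfs dfs -rm -skipTrash /user/mudassar/snapdemo/text1.txt 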

Indeed, the file text1.txt was deleted; ONLY text2.txt remains:

$ hdfs dfs -ls /user/mudassar/snapdemo 
Found 1 items 
-rw-r--r-- 3 hdfs hdfs 152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt 

Restore text1.txt from the snapshot:

$ hdfs dfs -cp -ptopax /user/mudassar/snapdemo/.snapshot/s20180315-101148.262/text1.txt /user/mudassar/snapdemo 

The -ptopax option preserves the timestamps, ownership, permissions, ACLs and XAttrs of the restored file. For the timestamps to be restored, you need dfs.namenode.accesstime.precision at its default of 1 hour, which is 3600000 milliseconds.
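
To check what your cluster is actually using, you can query the key with hdfs getconf; on a default configuration this should print 3600000:

$ hdfs getconf -confKey dfs.namenode.accesstime.precision 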

Check that text1.txt came back with its original timestamp (compare with the listing above)!

$ hdfs dfs -ls /user/mudassar/snapdemo 
Found 2 items 
-rw-r--r-- 3 hdfs hdfs 121 2018-03-15 10:10 /user/mudassar/snapdemo/text1.txt 
-rw-r--r-- 3 hdfs hdfs 152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt 

In a nutshell, you don't need to create directory2: when you run the hdfs dfs -createSnapshot command, it automatically creates a directory under the original one named .snapshot, which saves you the extra step of creating a separate backup directory.
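
For completeness, a short sketch of the rest of the snapshot lifecycle, using the snapshot generated in this demo and a hypothetical new name before-cleanup. Snapshots can be renamed or deleted, and a directory can only be made un-snapshottable once all of its snapshots are deleted:

$ hdfs dfs -renameSnapshot /user/mudassar/snapdemo s20180315-101148.262 before-cleanup 
$ hdfs dfs -deleteSnapshot /user/mudassar/snapdemo before-cleanup 
$ hdfs dfsadmin -disallowSnapshot /user/mudassar/snapdemo 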


I hope that explains it clearly this time.


11 REPLIES

Master Mentor

@Mudassar Hussain

I am positive that command will work without fail if you have successfully created a snapshottable directory. lsSnapshottableDir is a subcommand of hdfs. Can you simply run hdfs as the hdfs user?

$ hdfs 
Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND 
where COMMAND is one of: 
dfs 			run a filesystem command on the file systems supported in Hadoop. 
classpath 		prints the classpath 
namenode -format 	format the DFS filesystem 
secondarynamenode 	run the DFS secondary namenode 
namenode 		run the DFS namenode 
journalnode 		run the DFS journalnode 
.......
........
snapshotDiff 		diff two snapshots of a directory or diff the current directory contents with a snapshot 
lsSnapshottableDir 	list all snapshottable dirs owned by the current user 
Use -help to see options 
....

Most commands print help when invoked w/o parameters.
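
Since the help output mentions snapshotDiff: it is handy for seeing what changed between a snapshot and the current state of a directory. A minimal sketch against the snapshot from the demo above, where "." stands for the current directory contents:

$ hdfs snapshotDiff /user/mudassar/snapdemo s20180315-101148.262 . 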

Now, once you have confirmed the above, run as below:

# su - hdfs
$ hdfs lsSnapshottableDir 
output ..........................
drwxr-xr-x 0 mudassar hdfs 0 2018-03-15 10:38 1 65536 /user/mudassar/snapdemo 

That's the directory I created to reproduce your issue on my cluster.


@Geoffrey Shelton Okot
I have successfully created and recovered a snapshot.
Please see the screen below:

[screenshot: snapshotdir.jpg]
And another question: for example, if I don't remember the snapshot name, which is something like "s20180316-053628.936", is there another way to recover the snapshot without this detail?
Thanks