Created 03-08-2018 06:11 AM
Hi guys,
I have been using an AWS machine for practice. My question is:
How do I create and recover a snapshot?
Please explain.
Thanks,
Mudassar Hussain
Created 03-15-2018 10:05 AM
Steps 1, 2 and 3 are okay. You don't need to create directory2: when you enable a directory as snapshottable (in this case /user/mudassar/snapdemo), the snapshots are created under that same directory in a hidden .snapshot subdirectory, which is why they don't show up when you run the hdfs dfs -ls command.
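To actually see them you have to list the .snapshot path explicitly. A quick sketch using the demo directory from this thread (the snapshot name will be whatever your own -createSnapshot run produced):
$ hdfs dfs -ls /user/mudassar/snapdemo              # the .snapshot entry does not appear here
$ hdfs dfs -ls /user/mudassar/snapdemo/.snapshot    # lists the snapshots themselves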
Let me demo on HDP 2.6
I will create your user on my local environment.
Add the user mudassar as the root user
# adduser mudassar
Switch to the hdfs user, the HDFS superuser and owner of HDFS
# su - hdfs
Create the snapshot demo directory. Notice the -p option, since the parent directory /user/mudassar doesn't exist yet. Note: hadoop dfs is deprecated, so use the hdfs dfs command!
$ hdfs dfs -mkdir -p /user/mudassar/snapdemo
Validate directory
$ hdfs dfs -ls /user/mudassar
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2018-03-15 09:39 /user/mudassar/snapdemo
Change ownership to mudassar
$ hdfs dfs -chown mudassar /user/mudassar/snapdemo
Validate change of ownership
$ hdfs dfs -ls /user/mudassar
Found 1 items
drwxr-xr-x   - mudassar hdfs          0 2018-03-15 09:39 /user/mudassar/snapdemo
Make the directory snapshottable
$ hdfs dfsadmin -allowSnapshot /user/mudassar/snapdemo
Allowing snaphot on /user/mudassar/snapdemo succeeded
Show all the snapshottable directories in your cluster (a subcommand under hdfs)
$ hdfs lsSnapshottableDir
drwxr-xr-x 0 mudassar hdfs 0 2018-03-15 09:39 0 65536 /user/mudassar/snapdemo
Create 2 sample files in /tmp
$ echo "Test one for snaphot No worries No worries I was worried you got stuck and didn't revert the HCC is full of solutions so" > /tmp/text1.txt
$ echo "The default behavior is that only a superuser is allowed to access all the resources of the Kafka cluster, and no other user can access those resources" > /tmp/text2.txt
Validate the files were created
$ cd /tmp
$ ls -lrt
-rw-r--r-- 1 hdfs hadoop 121 Mar 15 10:04 text1.txt
-rw-r--r-- 1 hdfs hadoop 152 Mar 15 10:04 text2.txt
Copy the first file from local /tmp to HDFS
$ hdfs dfs -put text1.txt /user/mudassar/snapdemo
Create a snapshot of the directory, which now contains text1.txt
$ hdfs dfs -createSnapshot /user/mudassar/snapdemo
Created snapshot /user/mudassar/snapdemo/.snapshot/s20180315-101148.262
Note the .snapshot directory above, which is a hidden system directory
Show the snapshot of text1.txt
$ hdfs dfs -ls /user/mudassar/snapdemo/.snapshot/s20180315-101148.262
Found 1 items
-rw-r--r--   3 hdfs hdfs        121 2018-03-15 10:10 /user/mudassar/snapdemo/.snapshot/s20180315-101148.262/text1.txt
Copy the second file text2.txt from local /tmp to HDFS
$ hdfs dfs -put text2.txt /user/mudassar/snapdemo
Validate that the 2 files are present
$ hdfs dfs -ls /user/mudassar/snapdemo
Found 2 items
-rw-r--r--   3 hdfs hdfs        121 2018-03-15 10:10 /user/mudassar/snapdemo/text1.txt
-rw-r--r--   3 hdfs hdfs        152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt
Demo: simulate loss of the file text1.txt
$ hdfs dfs -rm /user/mudassar/snapdemo/text1.txt
Indeed, file text1.txt was deleted; ONLY text2.txt remains
$ hdfs dfs -ls /user/mudassar/snapdemo
Found 1 items
-rw-r--r--   3 hdfs hdfs        152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt
Restore text1.txt from the snapshot
$ hdfs dfs -cp -ptopax /user/mudassar/snapdemo/.snapshot/s20180315-101148.262/text1.txt /user/mudassar/snapdemo
The -ptopax option ensures the timestamps, ownership, permissions, ACLs and XAttrs are restored. For the timestamp to come back you will need dfs.namenode.accesstime.precision at its default of 1 hour, which is 3600000 milliseconds.
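For reference, this property lives in hdfs-site.xml; a minimal sketch showing it at its default value (double-check the value against your own cluster config before relying on it):
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
  <!-- access time precision in milliseconds; 3600000 ms = 1 hour, the default -->
</property>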
Check the original timestamp for text1.txt above !!!
$ hdfs dfs -ls /user/mudassar/snapdemo
Found 2 items
-rw-r--r--   3 hdfs hdfs        121 2018-03-15 10:10 /user/mudassar/snapdemo/text1.txt
-rw-r--r--   3 hdfs hdfs        152 2018-03-15 10:19 /user/mudassar/snapdemo/text2.txt
In a nutshell, you don't need to create directory2 because when you run the hdfs dfs -createSnapshot command it automatically creates a directory under the original one, starting with .snapshot, which also saves you the extra step of creating a sort of backup directory yourself.
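If you'd rather not deal with the auto-generated sYYYYMMDD-HHMMSS.mmm names, -createSnapshot also accepts an optional snapshot name of your choosing (the name "mybackup" below is just an example):
$ hdfs dfs -createSnapshot /user/mudassar/snapdemo mybackup
The snapshot then appears as /user/mudassar/snapdemo/.snapshot/mybackup and you restore from it exactly the same way as above.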
I hope that explains it clearly this time
Created 03-15-2018 02:43 PM
I am positive that command should and will work without fail if you have successfully created a snapshottable directory. It's a subcommand of hdfs. Can you simply run hdfs as the hdfs user?
$ hdfs
Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
       where COMMAND is one of:
  dfs                  run a filesystem command on the file systems supported in Hadoop.
  classpath            prints the classpath
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  journalnode          run the DFS journalnode
  .......
  ........
  snapshotDiff         diff two snapshots of a directory or diff the current directory contents with a snapshot
  lsSnapshottableDir   list all snapshottable dirs owned by the current user
                       Use -help to see options
  ....
Most commands print help when invoked w/o parameters.
Now, once you have confirmed the above, run it as below
# su - hdfs
$ hdfs lsSnapshottableDir
Output:
..........................
drwxr-xr-x 0 mudassar hdfs 0 2018-03-15 10:38 1 65536 /user/mudassar/snapdemo
That's the directory I created to reproduce your issue on my cluster.
Created on 03-16-2018 06:09 AM - edited 08-18-2019 03:00 AM
@Geoffrey Shelton Okot
I have successfully created and recovered the snapshot.
Please see the screenshot below:
And another question:
For example, what if I don't remember the snapshot name, which looks something like "s20180316-053628.936"?
Is there another way to recover the snapshot without this detail?
Thanks