Is it possible to use trash in HDFS encryption zone?
- Labels: HDFS
Created on ‎01-24-2016 07:18 PM - edited ‎09-16-2022 02:59 AM
I am using -skipTrash to delete an HDFS file from an encryption zone. Is there any way to use the trash so that I can recover a file deleted from an encryption zone?
Created ‎01-25-2016 01:43 AM
-skipTrash permanently removes a file unless a snapshot references it.
If you want to use the trash, you need to run 'hadoop fs -rm' without
-skipTrash.
Encryption zones merely create the blocks with encrypted data and associate
keys with them. Other HDFS behaviour remains the same, except that you
cannot move a file from one EZ to another, or move it outside of its EZ.
Created ‎01-27-2016 08:27 AM
Hello naveen1,
For a file in an encryption zone, a LOGICAL workaround could be to create a .trash folder in the same zone. This folder can be used as the destination for redundant files before you remove them permanently.
If an entire encryption zone is deleted, the trash bin MAY be usable, provided the trash feature is enabled. (Please test before you implement. I haven't tested it yet.)
Hope that helps.
Created ‎01-28-2016 12:34 AM
The effective issue is this:
When you use 'fs -rm' with trash enabled, we move the file to the authenticated user's /user/{user.name}/.Trash sub-directory. For example, if the path being deleted is '/data/myapp/part-00000.gz' and you delete it as the user 'hive', then the trash feature moves it under the directory '/user/hive/.Trash/Current/'.
When encryption zones come into play, HDFS does not allow you to move a file from one encryption zone to another, or from inside an encryption zone to a location outside any zone. This is for security reasons, and ties into how HDFS manages encryption per directory (zone) rather than having each file hold all of the necessary info independently.
So if /data/ is an EZ, but /user/hive is not, or is a separate EZ, then the move to trash will fail, as expected.
But if / is the EZ, then the moves may work, since both paths come under it.
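To make the rule above concrete, here is a minimal sketch that predicts whether the move to trash would be accepted. It assumes you pass in the list of encryption zone roots yourself (an HDFS superuser can list them with 'hdfs crypto -listZones'), and it treats any mismatch between the zone of the source and the zone of the trash location as a failure. The function names are made up for illustration; none of this calls HDFS.

```shell
# Print the trash location for a path deleted by a given user.
trash_path() {   # usage: trash_path <user> <absolute-path>
  printf '/user/%s/.Trash/Current%s\n' "$1" "$2"
}

# Print the encryption zone root containing the path, or nothing if the
# path is outside every zone. Zone roots are supplied by the caller.
zone_of() {      # usage: zone_of <absolute-path> [zone-root...]
  local path="$1" best='' found=0 z
  shift
  for z in "$@"; do
    z=${z%/}                       # "/data/" -> "/data", "/" -> ""
    case "$path/" in
      "$z"/*)
        if [ "$found" -eq 0 ] || [ "${#z}" -gt "${#best}" ]; then
          best=$z
          found=1
        fi ;;
    esac
  done
  if [ "$found" -eq 1 ]; then
    printf '%s\n' "${best:-/}"     # an empty match means the root zone "/"
  fi
  return 0
}

# Succeed only if the source and its trash location share the same zone
# (or are both outside any zone).
trash_move_allowed() {   # usage: trash_move_allowed <user> <path> [zone-root...]
  local user="$1" path="$2"
  shift 2
  [ "$(zone_of "$path" "$@")" = "$(zone_of "$(trash_path "$user" "$path")" "$@")" ]
}
```

For example, 'trash_move_allowed hive /data/myapp/part-00000.gz /data' returns non-zero because only the source is inside a zone, while passing '/' as the single zone root returns zero.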
What Consult proposes is a manual step (i.e. use hadoop fs -mv instead of hadoop fs -rm): keep a manually created /data/.Trash directory to move the files into, and use scripts to clean it periodically (i.e. Bring-Your-Own-Trash). It's not a great solution, but it may work if you need some data retention.
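A rough sketch of what such a Bring-Your-Own-Trash helper could look like is below. The layout '<zone>/.Trash/<timestamp>/<original path>', the function names and the HADOOP override variable are all assumptions made for this example; only 'hadoop fs -mkdir -p', 'hadoop fs -mv' and 'hadoop fs -rm -r -skipTrash' are real commands. Please test it on scratch data first.

```shell
# Move a file into a trash directory kept inside the same encryption zone.
# Set HADOOP=echo to print the commands instead of running them.
ez_trash() {   # usage: ez_trash <zone-root> <path-inside-zone> [timestamp]
  local zone="${1%/}" path="$2"
  local stamp="${3:-$(date +%Y%m%d%H%M%S)}"
  case "$path" in
    "$zone"/?*) ;;                 # path is inside the zone, carry on
    *) echo "ez_trash: $path is not inside ${zone:-/}" >&2
       return 1 ;;
  esac
  # Keep the original path under the timestamped directory, similar to how
  # the regular trash keeps it under .Trash/Current.
  local dest="$zone/.Trash/$stamp$path"
  ${HADOOP:-hadoop} fs -mkdir -p "${dest%/*}" &&
    ${HADOOP:-hadoop} fs -mv "$path" "$dest"
}

# Permanently remove one timestamped batch, e.g. from a periodic cleanup job.
ez_trash_purge() {   # usage: ez_trash_purge <zone-root> <timestamp>
  ${HADOOP:-hadoop} fs -rm -r -skipTrash "${1%/}/.Trash/$2"
}
```

Since the destination stays inside the same zone, the rename is not rejected. Choosing which timestamped batches are old enough to purge is left to whatever scheduler runs the cleanup.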
Another option is to consider using limited and periodic snapshots (via BDR, etc.), which give you similar (but not exactly the same) data retention capabilities.
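For the snapshot route, a minimal sketch using the plain HDFS snapshot commands (rather than BDR) might look like this. The function names, the snapshot name and the HDFS override variable are placeholders for illustration. An administrator must first enable snapshots on the directory with 'hdfs dfsadmin -allowSnapshot <dir>'.

```shell
# Take a named snapshot of a snapshottable directory.
# Set HDFS=echo to print the commands instead of running them.
snapshot_create() {   # usage: snapshot_create <dir> <snapshot-name>
  ${HDFS:-hdfs} dfs -createSnapshot "$1" "$2"
}

# Copy one file back from a snapshot to its original location.
snapshot_restore() {  # usage: snapshot_restore <dir> <snapshot-name> <path-inside-dir>
  local dir="${1%/}" name="$2" path="$3" rel
  rel=${path#"$dir"/}              # path relative to the snapshotted directory
  ${HDFS:-hdfs} dfs -cp "$dir/.snapshot/$name/$rel" "$path"
}

# Drop a snapshot that is no longer needed so its blocks can be reclaimed.
snapshot_delete() {   # usage: snapshot_delete <dir> <snapshot-name>
  ${HDFS:-hdfs} dfs -deleteSnapshot "$1" "$2"
}
```

Unlike trash, this only protects files that existed when the snapshot was taken, which is why the retention is similar but not exactly the same.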
Created ‎01-25-2016 06:01 AM
I didn't mean to use -skipTrash. It was suggested to me because I couldn't delete a file from the encryption zone otherwise. If there is any way to use trash for an encryption zone, please let me know.
Created ‎01-25-2016 08:37 AM
I had the same question.
When we try to delete something from an encryption zone, it says the directory cannot be moved to /user/xyz/.Trash.
So we are forced to use the -skipTrash option.
Is there a way to delete a file from an encryption zone without using the -skipTrash option?
Created ‎01-27-2016 08:29 AM
Hello SiddeshSamarth,
Currently, there is no option to send encrypted files to the global trash bin, for obvious reasons.
Created ‎01-28-2016 12:11 AM
Hello SiddeshSamarth,
For a file in an encryption zone, a LOGICAL workaround could be to create a .trash folder in the same zone. This folder can be used as the destination for redundant files before you remove them permanently.
Created ‎01-28-2016 05:39 PM
Thank you all for your time. The logical workaround sounds good to me.
