Created 03-29-2017 08:04 AM
I started using the HDP 2.5 sandbox, and I'm currently puzzled by an issue which looks like this:
0. I'm using a freshly set up distribution, with everything at defaults, no properties changed, etc.
1. I put some large files into HDFS.
2. I remove them (without sending them to the trash).
3. I see that total directory sizes are back to what they were before step 1, but the available free space remains reduced.
4. Eventually this leads to HDFS becoming full, with no apparent way to clear it 😞
5. This is not reproducible on my standalone install of Hadoop/HDFS on Ubuntu, where the free space returns to normal after deletion.
On the command line, this sequence of steps looks like this:
# hdfs dfs -df                                        ... shows 7% usage
# hdfs dfs -put large-data-directory /large-data-directory
# hdfs dfs -df                                        ... shows 95% usage
# hdfs dfs -rm -r -skipTrash /large-data-directory
# hdfs dfs -du /user/root                             ... to make sure nothing sticks in /user/root/.Trash
# hdfs dfs -df                                        ... still shows 95% usage
So could anyone please enlighten me on how this can be fixed? I haven't found any properties yet that might cause such behavior...
UPD
> then after how much time did you run the "hdfs dfs -du /user/root" command? (immediately or a few seconds/minutes later)
about 3 minutes later (and the result doesn't change after about a day)
> Is it still consuming the same 95% usages (even after a long time?)
yes, even 24 hours don't change anything 😞
> fs.trash.interval
It was 360 during testing. Later I changed it to 0, but that does not seem to help.
> fs.trash.checkpoint.interval
This is not set in the configs and I did not add it, so I believe it should be at its default value?
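For completeness, the effective values that the client actually picks up can be checked from the command line (just a sanity check, assuming the shell session reads the same core-site.xml as the cluster):
# hdfs getconf -confKey fs.trash.interval              ... prints the effective deletion interval
# hdfs getconf -confKey fs.trash.checkpoint.interval   ... prints the effective checkpoint interval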
UPD2: The dfsadmin -report output can be seen in this GitHub gist: https://gist.github.com/anonymous/4e6c81c920700251aad1d33748afb29d
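(For anyone skimming the gist: dfsadmin -report prints the capacity summary per cluster and per DataNode, so something like the following pulls out just the relevant fields; the grep pattern here is only an illustration.)
# su - hdfs -c "hdfs dfsadmin -report" | grep -E "Configured Capacity|DFS Used|Non DFS Used|DFS Remaining"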
Created 03-29-2017 08:42 AM
- When you deleted the large data directory from HDFS, how long afterwards did you run the "hdfs dfs -du /user/root" command? (immediately or a few seconds/minutes later)
- Also what does the following command show?
# su - hdfs -c "hdfs dfsadmin -report"
- Is it still showing the same 95% usage (even after a long time)?
- Although you are using "skipTrash", by any chance have you altered either of the following parameter values?
---> Deletion interval specifies how long (in minutes) a trash checkpoint is kept before it expires and is deleted. It is the value of fs.trash.interval. The NameNode runs a thread to periodically remove expired checkpoints from the file system.
---> Emptier interval specifies how long (in minutes) the NameNode waits before running a thread to manage checkpoints. The NameNode deletes checkpoints that are older than fs.trash.interval and creates a new checkpoint from /user/${username}/.Trash/Current. This frequency is determined by the value of fs.trash.checkpoint.interval, and it must not be greater than the deletion interval. This ensures that in an emptier window, there are one or more checkpoints in the trash.
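In case anything were actually being held in trash, a checkpoint can also be forced by hand; a minimal sketch (run as the user owning the trash directory):
# hdfs dfs -ls /user/root/.Trash   ... list any existing trash checkpoints
# hdfs dfs -expunge                ... create a new checkpoint and delete expired ones
# hdfs dfs -df -h /                ... re-check the reported usage afterwards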
Created 03-29-2017 09:17 AM
Thanks for the response! I've updated my post with answers to your questions.