03-29-2017 09:17 AM
Thanks for the response! I've updated my post with answers to your questions.
03-29-2017 08:04 AM
I started using the HDP 2.5 sandbox, and I'm currently puzzled by behavior which looks like this:
0. I'm using a freshly set-up distribution, with everything at defaults and no properties changed.
1. I put some large files into HDFS.
2. I remove them (without sending them to trash).
3. Total directory sizes are back to what they were before step 1, but the available free space stays reduced.
4. Eventually this leads to HDFS becoming full, with no apparent way to clear it 😞
5. This does not reproduce on my standalone Hadoop/HDFS install on Ubuntu, where free space returns to normal after deletion.
This sequence of steps looks like this on the command line:
# hdfs dfs -df
... shows 7% usage
# hdfs dfs -put large-data-directory /large-data-directory
# hdfs dfs -df
... shows 95% usage
# hdfs dfs -rm -r -skipTrash /large-data-directory
# hdfs dfs -du /user/root
... to make sure nothing is stuck in /user/root/.Trash
# hdfs dfs -df
... still shows 95% usage
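In case it helps, the du/df discrepancy can be made explicit by comparing the two totals directly (just an extra diagnostic, the same commands as above with the summary flag):
# hdfs dfs -du -s /
... total space accounted to files in the namespace (small after the delete)
# hdfs dfs -df /
... filesystem-level capacity/used/available (still reports ~95% used)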
So could anyone please enlighten me on how this can be fixed? I haven't found any properties yet which might cause such behavior.

UPD (answers to the questions in the comments):
> then after how much time did you run the "hdfs dfs -du /user/root" command? (immediately or a few seconds/minutes later)
About 3 minutes later (and the result doesn't change even after about a day).
> Is it still consuming the same 95% usage (even after a long time)?
Yes, 24 hours don't change anything 😞
> fs.trash.interval
It was 360 while testing. Later I changed it to 0, but that does not seem to help.
> fs.trash.checkpoint.interval
This is not set in the configs and I did not add it, so I believe it should be at its default value.

UPD2: The output of dfsadmin -report can be seen in this GitHub gist: https://gist.github.com/anonymous/4e6c81c920700251aad1d33748afb29d
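For reference, this is how the values in UPD were read back and how the report in the gist was generated (a minimal sketch; hdfs getconf -confKey simply echoes the effective client-side value, so unset keys show their defaults):
# hdfs getconf -confKey fs.trash.interval
... prints the effective trash interval in minutes (0 means trash is disabled)
# hdfs getconf -confKey fs.trash.checkpoint.interval
... prints the checkpoint interval (0 means it falls back to fs.trash.interval)
# hdfs dfsadmin -report
... prints per-datanode capacity and usage (this is the output in the gist above)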
Labels:
Apache Hadoop