Member since 03-01-2017 · 5 Posts · 1 Kudos Received · 0 Solutions
04-12-2017 08:41 AM
Hi @Bala Vignesh N V I've finally solved the problem by using the blocksize parameter in the HTTP request. By setting the blocksize to a lower value, the system no longer overloads. I guess the issue was that the system created temporary 64 MB blocks containing only 5 MB of data each, so after a while the non-DFS space was exhausted and no more temporary blocks could be created. I hope that's clear enough.
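For reference, the request looks roughly like this; the host/port, file path, part name and the 8 MB value below are just placeholders for illustration, not my real ones:
# step 1: ask the namenode where to write, passing a smaller block size (8 MB = 8388608 bytes)
curl -i -X PUT "http://sandbox.hortonworks.com:50070/webhdfs/v1/user/root/myfile?op=CREATE&overwrite=true&blocksize=8388608"
# step 2: send the first part to the datanode URL returned in the Location header of step 1
curl -i -X PUT -T part0 "<Location header from step 1>"
The following parts then go through op=APPEND (a POST) with the same two-step redirect.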
04-07-2017 09:51 AM
@Bala Vignesh N V Thank you for your explanation. You seem to know HDFS pretty well, so I'll take this opportunity to ask you something else (but related). I'm trying to write files to HDFS part by part using the WebHDFS REST API. When I use a small part size (~5 MB), I can see the remaining disk space decrease in line with my upload. However, the non-DFS space is also consumed while uploading, and much faster; because of that, the non-DFS reaches 0% and the upload stops. After the upload, the non-DFS goes back up and reaches 18.7 GB again...
Here are some figures: file to upload: 2.2 GB / remaining: 9.9 GB / non-DFS used: 18.7 GB. Surprisingly, the non-DFS used reaches 0 GB while I upload a 2.2 GB file. It doesn't decrease as much when I use a larger part size (~50 MB). Is it a cache problem? I tried setting "buffersize" in my request (matching the part size), but it doesn't seem to change anything.
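In case it matters, each part is currently sent roughly like this (host/port, path and the 5 MB value are placeholders, not my exact request):
# append one ~5 MB part; buffersize is in bytes (5 MB = 5242880)
curl -i -X POST "http://sandbox.hortonworks.com:50070/webhdfs/v1/user/root/myfile?op=APPEND&buffersize=5242880"
# then re-send the same part to the datanode URL returned in the Location header
curl -i -X POST -T part1 "<Location header from the previous request>"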
04-07-2017 08:58 AM
Hi @Bala Vignesh N V That's strange: when I started my virtual machine today, the disk space had been reclaimed and I got 10 GB back.
I guess it reached the trash interval, which was set to 360 minutes. However, I thought emptying the trash didn't go through this configuration. When running your command I get:
[root@sandbox ~]# du -hsx * | sort -rh | head -10
368K blueprint.json
12K jce_policy-8.zip
8.0K install.log
4.0K sandbox.info
4.0K install.log.syslog
4.0K hdp
4.0K build.out
4.0K anaconda-ks.cfg
0 start_hbase.sh
0 start_ambari.sh
[root@sandbox ~]#
So I guess the non-DFS used is just reserved space.
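Coming back to the trash interval: for what it's worth, I believe it can be checked directly with the command below (assuming fs.trash.interval is the property behind it, which is my understanding):
hdfs getconf -confKey fs.trash.interval   # should print 360 here, since the interval is set to 360 minutes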
04-06-2017 02:27 PM
1 Kudo
Hi, I'm running the sandbox on a VirtualBox virtual machine; it is a single-node cluster with a replication factor of 1. After deleting files from the Hadoop file system and removing them from the trash, I don't get the disk space back, even after waiting for a while.
I tried to use:
[hdfs@sandbox ~]$ hadoop fs -expunge
[hdfs@sandbox ~]$
When I run hdfs dfsadmin -report, I get:
[hdfs@sandbox ~]$ hdfs dfsadmin -report
Configured Capacity: 45103345664 (42.01 GB)
Present Capacity: 25068261376 (23.35 GB)
DFS Remaining: 2002014208 (1.86 GB)
DFS Used: 23066247168 (21.48 GB)
DFS Used%: 92.01%
Under replicated blocks: 70
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 172.17.0.2:50010 (sandbox.hortonworks.com)
Hostname: sandbox.hortonworks.com
Decommission Status : Normal
Configured Capacity: 45103345664 (42.01 GB)
DFS Used: 23066247168 (21.48 GB)
Non DFS Used: 20035084288 (18.66 GB)
DFS Remaining: 2002014208 (1.86 GB)
DFS Used%: 51.14%
DFS Remaining%: 4.44%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Thu Apr 06 13:36:57 UTC 2017
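(If I read this report correctly, the numbers are at least consistent with each other: Configured Capacity 42.01 GB minus Non DFS Used 18.66 GB gives the Present Capacity of 23.35 GB, and Present Capacity minus DFS Used 21.48 GB leaves the 1.86 GB DFS Remaining. The 92.01% in the summary seems to be DFS Used over Present Capacity, while the 51.14% per datanode is DFS Used over Configured Capacity.)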
As you can see, it says I'm using 21.48 GB. However, when I execute this other command, I get a total of only ~11.4 GB:
[hdfs@sandbox ~]$ hdfs dfs -du -h /
0 /app-logs
181.2 M /apps
0 /ats
9.5 G /demo
869.1 M /hdp
0 /mapred
0 /mr-history
269.2 M /ranger
6.0 K /spark-history
24.9 K /spark2-history
8.2 K /tmp
656.4 M /user
[hdfs@sandbox ~]$
The disk usage is the same as before the deletion. I found a topic about the same issue; however, I don't have any snapshots:
[hdfs@sandbox ~]$ hdfs lsSnapshottableDir
[hdfs@sandbox ~]$
How can I reclaim this disk space?
Labels: Apache Hadoop