Member since: 06-02-2017
Posts: 27
Kudos Received: 1
Solutions: 0
07-19-2017
05:33 PM
I am trying to figure out the difference between the df -h output in Linux and the hdfs dfs -df -h / output. The dfs output shows 44 TB across the whole cluster, but the root partition does not hold that data; it always shows very low usage on each data node. hdfs@hdfs-xxxxx-xxxxxxx:~$ hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://hdfs-xxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020 44.8 T 20.6 T 6.8 T 46%
hdfs@hdfs-xxxxxx-xxxxxx:~$
whereas df -h shows:
hdfs@hdfs-XXXXX-XXXX:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 32G 12K 32G 1% /dev
tmpfs 6.3G 320K 6.3G 1% /run
/dev/sda1 2.0T 8.8G 1.9T 1% /
XXXXXXXXXXXXXXXXXXXXXXX 2.0T 1.8T 142G 93% /hadoop
hdfs@hdfs-XXXXX-XXXXX:~$
dfsadmin report
==================
Decommission Status : Normal
Configured Capacity: 3785823326208 (3.44 TB)
DFS Used: 1893513980859 (1.72 TB)
Non DFS Used: 1370250273861 (1.25 TB)
DFS Remaining: 302107447808 (281.36 GB)
DFS Used%: 50.02%
DFS Remaining%: 7.98%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
May I know where the utilization is going and why it is not showing on the / partition?
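For reference, this is a minimal sketch of how I am comparing the two views; the DataNode block directory path below is an assumption and should be checked against dfs.datanode.data.dir in hdfs-site.xml:

# How HDFS sees the cluster (per-DataNode configured capacity, DFS used,
# non-DFS used, remaining)
hdfs dfsadmin -report

# How the OS sees the same disk on one data node
# (data-dir path is an assumption, verify dfs.datanode.data.dir)
du -sh /hadoop/hdfs/data
df -h /hadoop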
Labels:
- Apache Hadoop
- Apache Spark
07-19-2017
07:12 AM
@dsun I already went through the above URL and learned that we have configured 270 GB as DFS reserved space, but non-DFS usage is still taking TBs on some of the servers. After analyzing the servers we realized that most of the space is going to MapReduce jobs. Is there a good tool/process to analyze this further?
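As a rough sketch of what I am checking for the MapReduce side: intermediate/shuffle data lives under the NodeManager local directories. The paths below are assumptions; the real ones come from yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs:

# Confirm where the NodeManager local dirs actually are
grep -A1 yarn.nodemanager.local-dirs /etc/hadoop/conf/yarn-site.xml

# Then measure how much space those directories hold
# (paths are assumptions based on common HDP layouts)
du -sh /hadoop/yarn/local /hadoop/yarn/log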
07-18-2017
12:53 PM
@dsun Thanks for your suggestions. I tried to find which directories are taking a large amount of space; the logs are not taking much space, yet Non DFS usage is still very high, in TBs. Thanks
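For completeness, this is the kind of per-node check I am running to find the biggest consumers (a sketch only; /hadoop is the mount shown in the df -h output above):

# List the largest directories up to two levels under the /hadoop mount
du -h --max-depth=2 /hadoop 2>/dev/null | sort -h | tail -20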
07-17-2017
10:50 AM
Hello, We have a 14-node HDP cluster. A few servers have 4 TB of disk space and a few have 2 TB. Across the 14 nodes we get 44.8 TB of disk space for HDFS, of which Disk Usage (Non DFS Used) is 12.1 TB / 44.8 TB (27.04%). This way we are losing a large amount of space without storing any data in it. I came to know that we can increase the DFS space by changing "Reserved space for HDFS" in the Ambari config. Right now the value is "270415951872 bytes"; what value should we set to reclaim a good amount of space? Is it necessary to keep 30% of the space as Non DFS? Thanks in advance.
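For reference, my understanding is that the Ambari "Reserved space for HDFS" field maps to the dfs.datanode.du.reserved property (bytes reserved per disk volume for non-DFS use). A quick sketch for confirming the value a node has actually picked up, assuming it is run on a cluster node with the client configuration in place:

# Print the reserved-space setting (bytes per volume) from the loaded config
hdfs getconf -confKey dfs.datanode.du.reserved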
Labels:
- Apache Ambari
- Apache Hadoop
06-26-2017
07:53 AM
@Kshitij Badani
Attached note.json here; I have removed some confidential information from it. We generally remove the Results section if we are unable to open the notebook. note.zip
06-22-2017
06:18 AM
Thanks @Dongjoon Hyun
06-21-2017
07:27 AM
@Kshitij Badani Data lost means: if I copy some lines from another notebook, paste them into a new notebook, and do not execute those paragraphs for some time, the copied lines are gone. If I execute those lines, they are saved. If a notebook has many errors, Zeppelin records each of them in the note.json file, right? When this file has too many entries, we are unable to open the notebook. To work around that, we removed the error entries from note.json and were then able to open it. We would like to know whether there is a permanent fix for this issue. May I know when the Zeppelin 0.7.2 version will be released?
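For reference, this is roughly how we strip the stored output from note.json before reopening the notebook (a sketch only; it assumes each paragraph keeps its output under a "results" key, which may differ between Zeppelin versions):

# Back up the note, then drop the stored output of every paragraph
cp note.json note.json.bak
jq 'del(.paragraphs[].results)' note.json.bak > note.json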
06-19-2017
02:27 PM
1 Kudo
Hello, may I know in which section I need to add this parameter: spark.sql.broadcastTimeout? Thanks
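For context, a minimal sketch of how the parameter would look if it goes into spark-defaults.conf (for example via "Custom spark-defaults" in Ambari); the 600-second value is only an illustration, not a recommendation:

# spark-defaults.conf: timeout in seconds for broadcast joins in Spark SQL
spark.sql.broadcastTimeout  600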
Labels:
- Apache Ambari
- Apache Spark
06-19-2017
09:46 AM
Hello, 1. I keep losing data when I do not execute the paragraphs; I would like to enable auto-save for every paragraph every 2-5 minutes (even if I do not execute the paragraph). 2. We are unable to open a notebook when it has too many errors in note.json; as a temporary fix I am editing the note.json file and removing those errors. Is there a way to fix this permanently? Thanks
Labels:
- Apache Zeppelin