Member since: 07-30-2020 · Posts: 156 · Kudos Received: 25 · Solutions: 41

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 199 | 01-11-2023 07:59 AM |
 | 498 | 12-29-2022 12:04 AM |
 | 180 | 12-28-2022 06:17 AM |
 | 305 | 12-07-2022 10:31 PM |
 | 456 | 12-06-2022 12:49 AM |
07-25-2022
04:49 AM
Hi Hanni, Are you able to view the content from the command line? When accessing files from the Web UI, the default user (as per hadoop.http.staticuser.user) is dr.who, so make sure this user has the necessary permissions for the file. Since the file has read permission for others, do check the Namenode logs for any warnings. If the above doesn't help, also try a different browser.
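For example, a quick permission check from the command line (the path below is a placeholder for your actual file):
# hdfs dfs -ls /path/to/file
# hdfs dfs -cat /path/to/file | head
If you need the Web UI to browse as a more privileged user, hadoop.http.staticuser.user can be overridden in core-site.xml; a minimal sketch, with hdfs as an illustrative value:
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>hdfs</value>
</property>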
07-24-2022
12:49 AM
1 Kudo
Hi @Meepoljd As per the description of the issue, I believe you are hitting ZOOKEEPER-1622. Cloudera has already released TSB-2022-577 for this. As mentioned in the TSB, session IDs should be unique, but due to this bug two clients might use the same session ID, causing the unexpected session termination.
07-22-2022
02:43 AM
1 Kudo
Hi, A hole in the region chain most probably indicates there are regions that are not yet online, which creates the hole.
# cat hbck.report | grep "not deployed on any region server"
If you see regions in the above command output, you will need to assign them using the hbase shell, as sketched below.
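A sketch of the assignment (the encoded region name below is hypothetical; use the names from your hbck report):
# hbase shell
> assign 'a1b2c3d4e5f60718293a4b5c6d7e8f90'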
07-22-2022
01:44 AM
1 Kudo
Hello @loridigia You can try to assign the region from the hbase shell.
> assign '28dd7c81713c9347e8dfe4e6993b1ec7'
If you can attach the below command output (with a valid ticket), we can check which regions are offline or in transition.
# hbase hbck -details
07-21-2022
07:37 AM
As per the stack trace, it seems to be an issue with the JDK. Do check HIVE-21237 and HIVE-21584.
07-21-2022
07:24 AM
1 Kudo
Hello @loridigia , It seems that due to the outage, multiple ServerCrashProcedures were created for the RegionServers. The dead RegionServers with the same names are different instances of the RegionServers with different epoch timestamps. As the HBase Master was also down, it might not have been able to process the expiration of the RegionServers. You might see some crash procedures waiting to finish under the "Procedures & Locks" section of the Active HBase Master Web UI. Since you have already solved this issue in the past involving ZooKeeper, you can try this:
1. Stop HBase.
2. Log in to ZooKeeper using # hbase zkcli (with a valid hbase ticket).
3. Delete the /hbase-secure znode: rmr /hbase-secure
4. Sideline the entries under the HDFS dir: # hdfs dfs -mv /hbase/MasterProcWALs/* /tmp (not sure if this was done earlier)
5. Start HBase.
07-18-2022
02:00 AM
Hello @KPG1 , The time taken to mark a Datanode as stale is given by dfs.namenode.stale.datanode.interval, with a default of 30 seconds. If this is happening with a specific Datanode, check whether there are any network issues between the Datanode and the Namenode, or whether the Datanode has any JVM pauses reported in the Datanode logs. As a stopgap, you can bump up the above parameter until the underlying problem is solved, as sketched below.
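A sketch of the bump in hdfs-site.xml; note the property value is in milliseconds (the 60000 below is an illustrative 60 seconds):
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>60000</value>
</property>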
07-13-2022
10:55 AM
Hello, Based on the above test, I guess you are hitting HBASE-21852, which is still unresolved upstream.
07-13-2022
07:35 AM
Hello, The encoded value for \x would be %5Cx, so try using that in the URL. Is it connecting via Knox? Also, do upload the curl command output.
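A sketch of the encoding, assuming for illustration an HBase-REST-style URL (host, port, table, and row key are all placeholders):
# instead of http://resthost:20550/t1/row\x00key use:
# curl -ik "http://resthost:20550/t1/row%5Cx00key"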
07-04-2022
06:21 AM
2 Kudos
@stale , It looks like a mismatch between the encryption types in your krb5.conf and the AD is causing this. Do check the below two Cloudera articles to see if they help resolve this issue. https://my.cloudera.com/knowledge/ERRORquotCaused-by-GSSException-Failure-unspecified-at-GSS-API?id=272836 https://my.cloudera.com/knowledge/ErrorquotCaused-by-Failure-unspecified-at-GSS-API-level?id=273436
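A minimal krb5.conf sketch, assuming your AD issues AES tickets (the exact list must match what the AD actually supports):
[libdefaults]
  default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
  default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
  permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96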
06-30-2022
03:41 AM
Hello @Grumash , I believe user=cdp_svc_fc_03 is the Spark user, which no longer exists. So when you try to move the file into the trash folder in the home dir, it fails to create the home dir. You need to create the home dir as the superuser (hdfs), then chown it to cdp_svc_fc_03; then it should work, as sketched below.
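A sketch of those steps, run with HDFS superuser credentials (the group name is assumed to match the user):
# sudo -u hdfs hdfs dfs -mkdir -p /user/cdp_svc_fc_03
# sudo -u hdfs hdfs dfs -chown cdp_svc_fc_03:cdp_svc_fc_03 /user/cdp_svc_fc_03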
06-30-2022
03:30 AM
1 Kudo
Hello, There is no one-click solution in CDP to disable Kerberos. As you pointed out, it is not recommended to disable Kerberos once it's configured. Without Kerberos, Ranger and other services might not work properly, as Kerberos is at the core of security in CDP. You can follow the below community post to check if that helps. https://community.cloudera.com/t5/Support-Questions/Disabling-Kerberos/m-p/19934#M38077
06-09-2022
10:33 AM
Hi @KiranMagdum , The last failure noticed is on April 28th. So if you are not seeing any permission/access issue for the disk /disk3/dfs/dn in the Datanode log of 10.204.8.11, can you try to restart this Datanode role and check whether the Namenode UI still reports the volume as failed?
06-09-2022
12:58 AM
Hi @Jessica_cisco, The hbase:meta (system) table is not online, and thus the HBase Master has not come out of its initialisation phase. We will need to assign the region for the hbase:meta table, as this table contains the mapping of which region is hosted on which Region server. The workflow is failing because this hbase:meta table is not assigned/online. We need to use the hbck2 jar (we get this via a support ticket) to assign this region, for which the best way is to open a Cloudera Support ticket; a sketch of the hbck2 command is below. https://docs.cloudera.com/runtime/7.2.10/troubleshooting-hbase/topics/hbase_fix_issues_hbck.html Otherwise (this doesn't work every time), you can try to restart the Region server data-02.novalocal followed by a restart of the HBase Master to see if the Master is able to assign the meta table. Regards, Robin
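For reference, once you have the HBCK2 jar from support, the assignment looks roughly like this (the jar path is a placeholder; 1588230740 is the well-known encoded region name of hbase:meta):
# hbase hbck -j /path/to/hbase-hbck2.jar assigns 1588230740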
06-08-2022
03:16 AM
Hi @enirys , You will need to add the host entries to the DNS records if FreeIPA is used to manage DNS. You can compare against the host entries of the other, working Datanode in FreeIPA. Every node in a Data Lake, Data Hub, and a CDP data service should be configured to look up the FreeIPA DNS service for name resolution within the cluster. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/linux_domain_identity_authentication_and_policy_guide/adding-host-entry
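A sketch of adding the records in FreeIPA (zone, host name, and IP below are placeholders):
# ipa dnsrecord-add example.internal datanode-03 --a-rec=10.0.0.13
# ipa dnsrecord-add 0.0.10.in-addr.arpa 13 --ptr-rec=datanode-03.example.internal.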
06-07-2022
11:25 AM
Hi @enirys , It looks like a DNS resolution issue. Could you check whether this gets resolved by following this article: https://my.cloudera.com/knowledge/ERROR-quot-is-not-authorized-for-protocol-interface?id=304462
06-07-2022
11:11 AM
1 Kudo
Hi @Jessica_cisco , As per the screenshot, the Region server on this host has failed to start. You can log in to that host and confirm whether there is a Region server process running:
# ps -ef | grep regionserver
If you don't see any process, try to restart this Region server from CM; if it still fails, please check the stderr and the role log of this Region server for more clues.
06-07-2022
01:30 AM
Hello @Jessica_cisco , You can check whether cloudera-manager.repo is present on this host under /etc/yum.repos.d/. If not, copy this repo file from a working node. If you run the below command on this host, it should show you the repo from which it will download the agent package.
# yum whatprovides cloudera-manager-agent
cloudera-manager-agent-7.6.1-24046616.el7.x86_64 : The Cloudera Manager Agent
Repo : @cloudera-manager
Once the above is confirmed, you can use the below doc for instructions. https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/installation/topics/cdpdc-manually-install-cm-agent-packages.html
06-07-2022
12:51 AM
1 Kudo
Hello, Try to reinstall the agent package on this host and let me know if that solves the issue (a command sketch follows the list):
1. Make a copy of your /etc/cloudera-scm-agent/config.ini.
2. Uninstall the cloudera-manager-agent package.
3. Install the cloudera-manager-agent package.
4. Copy the /etc/cloudera-scm-agent/config.ini back.
5. Start the cloudera-scm-agent service.
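A sketch of those steps on a yum-based host (the package manager and backup path are assumptions; adjust for your OS):
# cp /etc/cloudera-scm-agent/config.ini /root/config.ini.bak
# yum remove -y cloudera-manager-agent
# yum install -y cloudera-manager-agent
# cp /root/config.ini.bak /etc/cloudera-scm-agent/config.ini
# systemctl start cloudera-scm-agent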
06-06-2022
06:51 AM
1 Kudo
Hi, Can you check whether the agent package is indeed present on this host? You can compare it with a working host.
# rpm -qa | grep cloudera
06-06-2022
03:57 AM
Hi, Do you see any errors in the cloudera-scm-agent logs? If the agent loses its connection to the CM Server, it reports the host as unhealthy. Have you tried restarting the cloudera-scm-agent on this host to check if that helps?
06-03-2022
02:53 AM
It could be that the Datanode is still in the decommissioning state. Can you stop that Datanode from the UI first, then try to exit Maintenance mode and check?
06-01-2022
04:45 AM
Cell TTL is something that is defined at insertion time. So for already existing data, you can perform a put of the same cell, which will then write the exact cell value to a new HFile with the new TTL for the cell, as sketched below.
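A sketch in the hbase shell (table, row, column, and value are placeholders; the TTL attribute is in milliseconds, here one day):
# hbase shell
> put 't1', 'row1', 'cf:q1', 'value1', {TTL => 86400000}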
05-31-2022
10:27 AM
As HDFS-3743 remains unresolved, I don't see any feature in CM to copy the edits from a live Journal node to a newly added Journal node. Deploying the role using CM will add the role, but the Active Namenode will report that it is "out of sync" with this new Journal node. So before starting the new Journal node, copy the edits directory from one of the existing JournalNodes to the new one (a command sketch follows the list):
1 - Stop the new JN.
2 - Copy all the contents under /current of the Journalnode folder from a working Journalnode to the new Journal node.
3 - Set the right permissions on the newly copied folders with chown and chmod.
4 - Delete the edits_inprogress file (on the new Journal node) so that the new Journalnode will start synchronising this file with the older Journal nodes.
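A rough command sketch for steps 2-4, assuming the JournalNode edits dir is /dfs/jn and the nameservice is nameservice1 (both placeholders; check dfs.journalnode.edits.dir for the real path):
# rsync -a working-jn-host:/dfs/jn/nameservice1/current/ /dfs/jn/nameservice1/current/
# chown -R hdfs:hdfs /dfs/jn/nameservice1
# rm -f /dfs/jn/nameservice1/current/edits_inprogress_*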
05-31-2022
08:18 AM
I assume you are seeing this error in the UI. Could you check the HBase Master log to see why it's down?
05-11-2022
11:40 AM
Make sure the truststore.jks file is present under /opt/jks/truststore.jks and has the correct permissions. You can compare it with any of the working Datanodes.
05-04-2022
12:26 AM
Ensure that, at the OS level, the Namenode disk is mounted with read/write options.
05-04-2022
12:12 AM
1. Check whether you are able to kinit with the HDFS keytab on the Datanode host (a sketch is below).
2. Check whether the permissions are correct for the keytab and the process dir.
3. On CDH, try a hard restart of the cloudera-scm-agent. (This requires all processes on the host managed by CM to be stopped.)
# service cloudera-scm-agent hard_restart_confirmed
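A sketch for step 1 (the numbered process directory varies per host; the path shown is the usual CM agent pattern):
# kinit -kt /var/run/cloudera-scm-agent/process/<NNN>-hdfs-DATANODE/hdfs.keytab hdfs/$(hostname -f)
# klist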
04-28-2022
10:04 AM
There is no direct method or property to disable the tombstone marker; major compacting the MOB CF is the only way to go. There is a MOB cleaner running on the Master that removes unreferenced MOB files. The period of the cleaner chore can be configured by setting hbase.master.mob.cleaner.period to a positive integer number of seconds; it defaults to running daily. You should not need to tune it unless you have a very aggressive TTL or a very high rate of MOB updates with a correspondingly high rate of non-MOB compactions. So you can check and set the above parameter value, and you can manually trigger MOB compaction by specifying the CF that includes MOB data in the hbase shell, as sketched below. NOTE: MOBs are mostly for writing large cells that will rarely be updated or deleted but could have a TTL configured for the cell or CF.
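A sketch of both knobs (table and CF names are placeholders; 86400 seconds matches the daily default):
# hbase shell
> major_compact 't1', 'cf1', 'MOB'
And in hbase-site.xml:
<property>
  <name>hbase.master.mob.cleaner.period</name>
  <value>86400</value>
</property>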
04-28-2022
04:00 AM
Can you check whether you are able to scan the hbase:namespace and hbase:acl tables from the hbase shell?
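For example:
# hbase shell
> scan 'hbase:namespace'
> scan 'hbase:acl'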