About Shelton

Masood · ‎11-25-2020

As @ni4ni mentions, there is no standby/secondary NN.

kvinod · ‎11-20-2020

Hello Team, can anyone please help me with your comments? Thanks, Vinod

PabitraDas · ‎11-11-2020

Hello @Amn_468,

Since you reported the DN pause time, I referred to the DN heap only. The block count on most of the DNs seems to be >6 million, so I would suggest increasing the DN heap to 8GB (from the current value of 6GB) and performing a rolling restart to bring the new heap size into effect.

There is no straightforward way to say you have hit the small file problem, but if your average block size is a few MB or less than 1 MB, it is an indication that you are storing/accumulating small files in HDFS. The simplest way to detect small files in the cluster is to run fsck, which shows the average block size. If the value is too low (e.g. ~1 MB), you might be hitting the small files problem, which would be worth looking at; otherwise, there is no need to review the number of blocks.

[..]
$ hdfs fsck /
..
...
Total blocks (validated): 2899 (avg. block size 11475601 B) <<<<<
[..]

You may refer to the links below for help on dealing with small files.
- https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/
- https://community.cloudera.com/t5/Community-Articles/Identify-where-most-of-the-small-file-are-located-in-a-large/ta-p/247253
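The fsck check described above could be scripted roughly as follows. This is only a sketch: the function names and the 1 MiB default threshold are assumptions chosen for illustration, and it relies on the summary line having the "avg. block size N B" shape shown in the sample output.

```shell
# Extract the average block size (in bytes) from `hdfs fsck` summary text on stdin.
# Assumes the summary contains "... (avg. block size <N> B)" as in the sample above.
avg_block_size_bytes() {
  sed -n 's/.*avg\. block size \([0-9][0-9]*\) B.*/\1/p' | head -n 1
}

# Compare the average block size against a threshold in bytes.
# The 1048576 (1 MiB) default is an illustrative assumption, not an official limit.
check_small_files() {
  local threshold avg
  threshold="${1:-1048576}"
  avg="$(avg_block_size_bytes)"
  if [ -z "$avg" ]; then
    echo "unknown"
    return 2
  fi
  if [ "$avg" -lt "$threshold" ]; then
    echo "small-files-suspected avg=${avg}"
  else
    echo "ok avg=${avg}"
  fi
}

# Usage on a cluster node (a full scan of / can take a while on large clusters):
#   hdfs fsck / | check_small_files
```

With the sample summary line above, the average block size is roughly 11 MB, so the check would not flag a small files problem.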

PabitraDas · ‎11-09-2020

Hello @Masood,

I believe you are asking for the commands to determine the active NN other than through the CM UI (CM > HDFS > Instance > NameNode). From the CLI you have to run a couple of commands to determine the active/standby NN.

List the namenode hostnames
# hdfs getconf -namenodes
c2301-node2.coelab.cloudera.com c2301-node3.coelab.cloudera.com

Get nameservice name
# hdfs getconf -confKey dfs.nameservices
nameservice1

Get active and standby namenodes
# hdfs getconf -confKey dfs.ha.namenodes.nameservice1
namenode11,namenode20
# su - hdfs
$ hdfs haadmin -getServiceState namenode11
active
$ hdfs haadmin -getServiceState namenode20
standby

Get active and standby namenode hostnames
$ hdfs getconf -confKey dfs.namenode.rpc-address.nameservice1.namenode11
c2301-node2.coelab.cloudera.com:8020
$ hdfs getconf -confKey dfs.namenode.rpc-address.nameservice1.namenode20
c2301-node3.coelab.cloudera.com:8020

If you want to get the active namenode hostname from the hdfs-site.xml file, you can go through the following Python script on GitHub: https://github.com/grakala/getActiveNN.

Thank you
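The manual steps above could be combined into one helper along these lines. It is a sketch that uses only the hdfs subcommands already shown; the function name find_active_nn is hypothetical, and it assumes the calling user is allowed to run haadmin (for example the hdfs user).

```shell
# Print "<namenode-id> <rpc-address>" for the active NameNode of one nameservice.
# Returns non-zero if no NameNode reports "active".
find_active_nn() {
  local ns ids id state
  ns="$1"
  # Comma-separated NameNode IDs, e.g. "namenode11,namenode20"
  ids="$(hdfs getconf -confKey "dfs.ha.namenodes.${ns}")" || return 1
  for id in $(echo "$ids" | tr ',' ' '); do
    state="$(hdfs haadmin -getServiceState "$id" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$id $(hdfs getconf -confKey "dfs.namenode.rpc-address.${ns}.${id}")"
      return 0
    fi
  done
  return 1
}

# Usage (single nameservice, as in the example cluster above):
#   find_active_nn "$(hdfs getconf -confKey dfs.nameservices)"
```

On a federated cluster dfs.nameservices may list several names separated by commas, in which case the helper would need to be called once per nameservice.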

PabitraDas · ‎11-09-2020

Hello @AlexP,

Ref: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep

Referring to the HDFS documentation, answers to your questions are inline.

[Q1.] How to estimate how much time would this command take for a single directory (without -w)?
[A1.] It depends on the number of files in the directory. If you run setrep against a path that is a directory, the command recursively changes the replication factor of all files under the directory tree rooted at that path. The time varies depending on the file count under the path/directory.

[Q2.] Will it trigger a replication job even if I don't use the '-w' flag?
[A2.] Yes, replication will be triggered without the -w flag. However, it is good practice to use -w to ensure all files have the required replication factor before the command exits. Please note, the -w flag requests that the command wait for the replication to complete. Using -w can make the command take a long time to complete, but it guarantees that the replication factor has changed to the specified value.

[Q3.] If yes, does it mean that the NameNode will actually start deleting 'over-replicated' blocks of all existing files under a particular directory?
[A3.] Yes, your understanding is correct. The additional replica will cause the block to be marked as over-replicated, and it will be deleted from the cluster. This is done for every file under the directory path, keeping only 2 replicas of each file's blocks.

Hope this helps.
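As a rough illustration of the answers above, the file count can be checked before changing the replication factor. This sketch uses `hdfs dfs -count`, which is not mentioned in the thread but prints the directory count, file count, content size and path; the function names and the /data/example path are hypothetical.

```shell
# Print the number of files under a path (second column of `hdfs dfs -count`).
file_count() {
  hdfs dfs -count "$1" | awk '{print $2}'
}

# Show how many files will be touched, then change the replication factor
# and wait (-w) until the new factor is in effect for all of them.
set_replication() {
  local rep path
  rep="$1"
  path="$2"
  echo "files under ${path}: $(file_count "$path")"
  hdfs dfs -setrep -w "$rep" "$path"
}

# Usage (the path is a placeholder; 2 is the target factor discussed above):
#   set_replication 2 /data/example
```

The file count only hints at how long the command may run; the actual duration also depends on cluster load and block sizes.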

jlguti · ‎11-08-2020

@Shelton

# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
#
<master-ip> master <hostname>-00
<slave-ip> slave01 <hostname>-01

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

For the yarn-site.xml file in the hadoop folder, this is one of the configs:

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value><master-ip></value>
</property>

jlguti · ‎11-07-2020

Hello @Shelton, I have a new problem and was wondering if you could help me out.

https://community.cloudera.com/t5/Support-Questions/Process-Stuck-in-Hadoop-Cluster/td-p/305553

I'm trying to run a process and the yarn.nodemanager log gets stuck at the following lines:

2020-11-07 04:19:34,342 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app node started at 8042
2020-11-07 04:19:34,347 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /138.68.238.32:8031
2020-11-07 04:19:34,368 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2020-11-07 04:19:34,373 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2020-11-07 04:19:34,520 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Rolling master-key for container-tokens, got key with id 1152592273
2020-11-07 04:19:34,523 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMTokenSecretManagerInNM: Rolling master-key for container-tokens, got key with id -1064351767
2020-11-07 04:19:34,524 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as slave01:44367 with total resource of <memory:28672, vCores:6>
2020-11-07 04:19:34,524 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Notifying ContainerManager to unblock new container-requests

banshidhar_saho · ‎11-05-2020

Hi Team, Thank you very much for promptly responding and providing a useful solution/direction. While testing, the app team reported that child directories and files have r-x (for directories) and r-- (for files) for Others. It looks like the ACLs are not being inherited by child directories and files. Can you confirm whether ACLs should be inherited by child directories and files? Is there any other configuration that I need to check and modify? Regards, Banshi.

Hadoop_Admin · ‎11-04-2020

Check the usage of the cluster when you submit the Job in YARN. Share the log if you can

Shelton · ‎11-03-2020

@NCBank Can you start a new thread and tag me? Please include your logs or error message. The thread you are updating is old.

Member Since	‎01-19-2017 04:35 AM
Last Visited	‎12-11-2025 11:50 PM
Posts	3,679
Kudos received	627

Cloudera Community

Re: Apache nifi memory consumption in kubernetes

Re: Nifi toolkit command for GitLabFlowRegistry

Re: Not able to delete the NiFi existing flow usin...

Re: Securing Nifi with SSL and using OIDC provider...

Re: External zookeeper and nifi cluster connection...

Re: How does HDFS checkpointing work in a HA clust...

Re: NODEMANAGERs are going to unknown state

Re: Data Node Pause Duration

Re: How to See which NameNode is Active?

Re: Changing HDFS replication factor on existing f...

Re: Process Stuck in Hadoop Cluster

Re: Hadoop not running tasks

Re: HDFS file/dir ownership to be <user>:<correspo...

Re: YARN job is working slow, what could be the re...

Re: Kerberos KDC no working