Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 922 | 06-04-2025 11:36 PM |
| | 1522 | 03-23-2025 05:23 AM |
| | 751 | 03-17-2025 10:18 AM |
| | 2698 | 03-05-2025 01:34 PM |
| | 1800 | 03-03-2025 01:09 PM |
06-22-2018
06:27 PM
@Dassi Jean Fongang A DataNode is a slave component of the NameNode and shouldn't sit on the same host as the standby NameNode, which is a master process. With over 24 nodes you can afford at least 3 master nodes (Active & Standby NameNode, Active & Standby ResourceManager, and 3 ZooKeepers).

Can you ensure the client components aren't running?

Get DataNode info:

$ hdfs dfsadmin -getDatanodeInfo datanodex:50020

Trigger a block report for the given DataNode:

$ hdfs dfsadmin -triggerBlockReport datanodex:50020

Can you update the exclude file on both the ResourceManager and NameNode (master) hosts? If it's not there, create an exclude file on both machines:

$ vi excludes

Add the address of the DataNode/slave node to decommission, e.g. 10.125.10.107, then run the following command on the ResourceManager:

$ yarn rmadmin -refreshNodes
This command checks yarn-site.xml, processes that property, and decommissions the mentioned node from YARN; the ResourceManager will no longer assign any job to this node. The tricky part is that even if none of the DataNode components are running, deleting the DataNode would impact your standby NameNode. I recommend you move the standby NameNode to another host that should ONLY run a DataNode and NodeManager, or at most client software like ZooKeeper, Kerberos, etc. HTH
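The exclude-and-refresh sequence above can be sketched as a small script. This is a minimal illustration: the `EXCLUDES` temp file stands in for the real exclude file your cluster configures (e.g. via `dfs.hosts.exclude` and `yarn.resourcemanager.nodes.exclude-path`), and the refresh commands are left as comments because they require a live cluster.

```shell
# Illustrative stand-ins; on a real master node EXCLUDES would be the file
# referenced by dfs.hosts.exclude / yarn.resourcemanager.nodes.exclude-path.
EXCLUDES="$(mktemp)"
NODE="10.125.10.107"

# Append the node only if it is not already listed (idempotent).
grep -qxF "$NODE" "$EXCLUDES" || echo "$NODE" >> "$EXCLUDES"
grep -qxF "$NODE" "$EXCLUDES" || echo "$NODE" >> "$EXCLUDES"   # second run is a no-op

# On the real cluster you would then re-read the exclude files:
#   hdfs dfsadmin -refreshNodes    # NameNode picks up dfs.hosts.exclude
#   yarn rmadmin -refreshNodes     # ResourceManager picks up its exclude file
```

Running the append twice shows why the `grep -qxF` guard matters: the node address ends up in the file exactly once, so re-running the script during a long decommission is safe.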
06-22-2018
01:16 PM
@tbroas I have raised a ticket (issue) but there has been no response: request #14748 for a printable certificate and request #9859 "HDPCA". To date I have had no response. Can you help? Thanks a million
06-22-2018
01:01 PM
@JAy PaTel Can you share your Sqoop code? Please remember to obfuscate sensitive info.
06-21-2018
02:21 AM
@Chiranjeevi Nimmala During the re-install, did you drop and recreate a fresh Ranger database? If you created a new DB, stop Ranger, then navigate to Ambari UI ---> Ranger ---> Configs ---> Ranger Admin and ensure the following match your new DB/host:

Ranger DB name
Ranger DB username
Ranger DB host
Ranger DB password
Driver class name for a JDBC Ranger database
JDBC connect string for a Ranger database

Ranger maintains policy-related data in the x_policy* tables. To see which policies are enabled, run the SQL below as the Ranger DB owner if your backend DB is MySQL (else choose the appropriate syntax):

mysql> select name,is_enabled from x_policy;

Here is a link to manage the policies using REST APIs for Service Definition, Service and Policy Management. HTH
06-19-2018
10:23 PM
@Vishal Gupta I don't think Cloudera Manager installs the KDC and the client automatically. Whether you are using Cloudera or Hortonworks, you will first need a working KDC server (krb5-server), Kerberos clients (krb5-workstation, krb5-libs), and a realm set up; in both cases you then use the CM or Ambari Kerberos wizard. Having said that, I provided a walkthrough to help you set up Kerberos on HDP; subsequent questions should be opened as new threads, since this revived question dates from August 2017! 🙂 It would be great if you could mark this HCC thread as answered by clicking on the "Accept" button. That way other HCC users can quickly find the solution when they encounter the same issue. HTH
06-19-2018
09:47 AM
@Ilia K The default scheduler in HDP is the Capacity Scheduler. You should note the differences between the 3 settings:

Capacity Scheduler: designed to run Hadoop applications as a shared, multi-tenant cluster in an operator-friendly manner while maximizing the throughput and utilization of the cluster.

FIFO (First In, First Out): the simplest scheduling algorithm; it simply queues applications in the order they arrive in the ready queue.

Fair Scheduler: a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time.

Having said that, in your example above PROD has 70%, so despite "Each queue is used by multiple users and resources should be shared equally", the PROD queue jobs will have priority over the DEV queue, which I think is the desired config.

Can you share the values below?

yarn.resourcemanager.scheduler.class
Capacity Scheduler values

Please revert.
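To find the `yarn.resourcemanager.scheduler.class` value asked for above, you can grep it straight out of yarn-site.xml. A quick sketch, using an illustrative XML fragment in place of the real file (on a cluster node the file is typically under /etc/hadoop/conf, or visible in Ambari under YARN configs):

```shell
# Illustrative yarn-site.xml fragment; on a cluster, read the real file instead.
cat > yarn-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
EOF

# Find the property, grab the <value> element on the next line, strip the tags.
grep -A1 'yarn.resourcemanager.scheduler.class' yarn-site-sample.xml \
  | grep -o '<value>[^<]*</value>' \
  | sed -e 's/<value>//' -e 's/<\/value>//'
# -> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
```

If the printed class ends in `CapacityScheduler`, the HDP default is in effect; `FairScheduler` or `FifoScheduler` in that position would indicate the other two policies.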
06-13-2018
04:46 PM
@olivier brobecker Have you tried using the snippet below? Look through the output for missing or corrupt blocks (ignore under-replicated blocks for now):

$ hdfs fsck / | egrep -v '^\.+$' | grep -v replica

Once you find a file that is corrupt:

$ hdfs fsck /path/to/corrupt/file -locations -blocks -files
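To pull just the affected file paths out of the fsck output, a short pipeline helps. This is a sketch run against illustrative sample output (the real input would be `hdfs fsck /` piped in, and the exact message wording can vary between Hadoop versions):

```shell
# Illustrative fsck-style output; on a cluster, pipe `hdfs fsck /` in instead.
cat > fsck-sample.txt <<'EOF'
/user/app/data/part-00000:  Under replicated BP-1:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
/user/app/logs/events.log: CORRUPT blockpool BP-1 block blk_1073741826
/user/app/logs/events.log: MISSING 1 blocks of total size 134217728 B
Status: CORRUPT
EOF

# Keep lines flagging corrupt/missing blocks, drop the summary lines,
# then print each affected file path once.
grep -E 'CORRUPT|MISSING' fsck-sample.txt \
  | grep '^/' \
  | cut -d: -f1 \
  | sort -u
# -> /user/app/logs/events.log
```

Note the under-replicated file is filtered out, matching the advice above to ignore under-replication for now; each surviving path can then be fed to `hdfs fsck <path> -locations -blocks -files`.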
06-13-2018
09:23 AM
@salma zegdene I see your problem was resolved. If so, can you mark this HCC thread as answered by clicking on the "Accept" button? That way other HCC users can quickly find the solution when they encounter the same issue.
06-12-2018
05:45 PM
@Michael Bronson I think this should be linked to the NameNode issues. Speaking of which, I have failed to reproduce the scenario but am still investigating.
06-12-2018
08:18 AM
@Michael Bronson The document I referenced should give you the steps to follow, like analyzing the offending application etc. If it's the same cluster, then it could be the NameNode:

netstat -nape | awk '{if ($5 == "IP_of_master01:2181") print $4, $9}'

Do the same for master03, then maybe use a bash script to kill the dead processes.
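The awk filter above picks out local sockets connected to the ZooKeeper port. A sketch of what it does, run on illustrative `netstat -nape`-style lines (addresses and PIDs are made up; in field terms, $4 is the local address, $5 the foreign address, and $9 the PID/program name):

```shell
# Illustrative netstat -nape output: Proto Recv-Q Send-Q Local Foreign State User Inode PID/Program
cat > netstat-sample.txt <<'EOF'
tcp 0 0 10.0.0.5:45678 10.0.0.21:2181 ESTABLISHED 1000 123456 4321/java
tcp 0 0 10.0.0.5:45680 10.0.0.22:3306 ESTABLISHED 1000 123457 5432/mysqld
EOF

# Same filter as above, with straight quotes: keep only connections whose
# foreign address is the master's ZooKeeper endpoint, print local addr + PID.
awk '{if ($5 == "10.0.0.21:2181") print $4, $9}' netstat-sample.txt
# -> 10.0.0.5:45678 4321/java
```

From there, the PID in the second printed column is what a cleanup script would feed to `kill` after confirming the process is actually dead weight.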