Member since: 04-03-2019
Posts: 962
Kudos Received: 1743
Solutions: 146
My Accepted Solutions

Views | Posted
---|---
11530 | 03-08-2019 06:33 PM
4912 | 02-15-2019 08:47 PM
4171 | 09-26-2018 06:02 PM
10598 | 09-07-2018 10:33 PM
5659 | 04-25-2018 01:55 AM
04-13-2016
01:59 PM
2 Kudos
@Adi Jabkowsky The balancer works by maintaining a threshold percentage (10% by default), so it doesn't matter if the new nodes have a higher capacity than the older ones.

1. Doesn't this division of data miss the parallel computing advantage? (most of the data is centralized in a few DNs) - If your cluster is unbalanced then yes, it will affect your MapReduce job performance, because jobs get scheduled on the DataNodes holding the larger datasets.

2. In case a new DN is down for any reason, the recovery process (fixing under-replicated blocks) would take longer, no? - Yes, that's correct.

3. Wouldn't it be smart to stop the balancer when data is spread evenly in size and not in percentage? - You can stop the balancer at any time; it's safe to stop it by pressing Ctrl+C.
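For reference, the threshold can be set explicitly when running the balancer; the value is the maximum allowed deviation (in percent) of each DataNode's utilization from the cluster average, and 10 is the default:

hdfs balancer -threshold 10

You can check per-DataNode utilization before and after a balancer run with hdfs dfsadmin -report.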
04-13-2016
06:54 AM
@nbalaji-elangovan Looks like the Ansible playbook is failing because it could not find "/Users/nbalaji-elangovan/Downloads/incubator-metron-Metron_0.1BETA_rc7/metron-streaming/Metron-Pcap_Service/target/Metron-Pcap_Service-0.1BETA-jar-with-dependencies.jar". Can you please double-check that this jar file exists with the required permissions?
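For example, you can verify the jar's presence and permissions with (path taken from the error above):

ls -l /Users/nbalaji-elangovan/Downloads/incubator-metron-Metron_0.1BETA_rc7/metron-streaming/Metron-Pcap_Service/target/Metron-Pcap_Service-0.1BETA-jar-with-dependencies.jar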
04-12-2016
09:06 PM
18 Kudos
What is the role of Journal nodes in NameNode HA? Many of us are aware that the role of the journal nodes is to keep both NameNodes in sync and to avoid an HDFS split-brain scenario by allowing only the Active NN to write to the journals. Have you ever wondered how this works? Here you go!

JournalNodes form a distributed system for storing edits. The Active NameNode, acting as a client, writes edits to the journal nodes and commits them only once they are replicated to a majority (a quorum) of the journal nodes. The Standby NN needs to read the edits to stay in sync with the Active one, and it can read from any of the replicas stored on the journal nodes. ZKFC makes sure that only one NameNode is active at a time. However, when a failover occurs, it is still possible that the previous Active NameNode keeps serving (possibly stale) read requests to clients until it shuts down when it tries to write to the JournalNodes. For this reason, we should configure fencing methods even when using the Quorum Journal Manager.
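As an illustration, a typical fencing configuration in hdfs-site.xml looks like the below: sshfence logs into the old Active NameNode's host and kills the process, and shell(/bin/true) is a common fallback so that failover can still proceed when ssh fails. The key path is an example only; adjust it to your environment.

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- example path, adjust to your environment -->
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>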
How does the Quorum Journal Manager work with fencing? To implement fencing, the journal manager uses epoch numbers. Epoch numbers are integers that always increase and are unique once assigned. A NameNode generates its epoch number using a simple algorithm and includes it in the RPC requests it sends to the JournalNodes. When you configure NameNode HA, the first Active NameNode gets epoch value 1. On each failover or restart, the epoch number is increased, and the NameNode with the higher epoch number is considered newer than any NameNode with an earlier epoch number.
Now suppose both NameNodes think they are active and send write requests, each with its own epoch number; how is this situation handled? Each JournalNode stores an epoch number locally, called the promised epoch. Whenever a JournalNode receives an RPC request carrying an epoch number from a NameNode, it compares that epoch number with its promised epoch. If the request comes from a newer NameNode, i.e. its epoch number is greater than the promised epoch, the JournalNode records the new epoch number as its promised epoch. If the request comes from a NameNode with an older epoch number, the JournalNode simply rejects the request.
When the JournalNodes reject requests from a NameNode with an older epoch value, you will see lines like the below in the NameNode logs:

WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal <journal-node-hostname>:<port> failed to write txns 2397121201-2397121201. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 112 is less than the last promised epoch 113
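To make the promised-epoch rule concrete, here is a toy shell sketch of the check a JournalNode performs. This is a simplification for illustration only, not the actual Hadoop code; the function name and starting value are made up.

#!/bin/bash
# Toy illustration of the promised-epoch check (not actual Hadoop code).
promised_epoch=112   # the epoch this JournalNode has promised to honor

handle_rpc() {
  local request_epoch=$1
  if [ "$request_epoch" -ge "$promised_epoch" ]; then
    # request from a newer NameNode: record its epoch as the new promised epoch
    promised_epoch=$request_epoch
    echo "accepted request with epoch $request_epoch"
  else
    echo "rejected: IPC's epoch $request_epoch is less than the last promised epoch $promised_epoch"
  fi
}

handle_rpc 113   # newer (failed-over) NameNode: accepted, promised epoch becomes 113
handle_rpc 112   # stale NameNode still trying to write: rejected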
04-12-2016
11:10 AM
3 Kudos
@Daniel Perry - The easiest solution I can think of: if you can save the output of the hive action to a file, then you can pass that file to the shell script as an argument. Write the logic in your shell script to take $1 as the input file and do whatever you want with it. Does this make sense?
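A minimal sketch of such a script, assuming the hive output was saved to a file; the script name and the echo loop are made-up placeholders for your own logic:

#!/bin/bash
# process_hive_output.sh - hypothetical example; takes the hive output file as $1
INPUT_FILE="$1"

if [ ! -f "$INPUT_FILE" ]; then
  echo "input file not found: $INPUT_FILE" >&2
  exit 1
fi

# replace this loop with whatever processing you need
while IFS= read -r line; do
  echo "processing: $line"
done < "$INPUT_FILE"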
04-12-2016
05:45 AM
2 Kudos
@Roberto Sancho - Looks like there is some problem with the hdp-select installation. Could you please try the following?

1. Check whether hdp-select is installed properly (on the machine where you are trying to install HBase):

rpm -qa | grep hdp-select

2. If hdp-select is already installed, then please make sure that running it reports the correct version:

hdp-select

If the above command gives the wrong version, then there is a problem with your upgrade.
04-12-2016
05:36 AM
2 Kudos
@banuradha ganapathiappan - Download the Virtual Disk Development Kit (VDDK) from here and run the below command to repair the corrupt vmdk image:

vmware-vdiskmanager -R <path to vmdk file>
04-12-2016
05:31 AM
@Tushar Bodhale - I think order really matters here; you need to put the above line exactly below your script location. Please refer to my working workflow.xml file, and see the sketch below. Do let me know if you have any further questions.
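For context, assuming this concerns an Oozie shell action: the workflow schema enforces a fixed element order (exec, then argument, then env-var, then file), so a misplaced element fails schema validation. A minimal sketch, with made-up names and paths:

<action name="shell-node">
  <shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>myscript.sh</exec>
    <argument>input.txt</argument>
    <!-- <file> must come after <exec> and <argument> -->
    <file>/user/oozie/scripts/myscript.sh#myscript.sh</file>
    <capture-output/>
  </shell>
  <ok to="end"/>
  <error to="fail"/>
</action>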
04-09-2016
02:45 PM
@Inam Ur Rehman - You won't be able to run the latest version of the sandbox with 4 GB of RAM. You can try an older version, though.