08-27-2018 08:30 PM
The article doesn't indicate this, so for reference: the listed HDFS settings do not exist by default. They need to go into hdfs-site.xml, which is done in Ambari by adding fields under "Custom hdfs-site" (an XML sketch of the resulting entries follows the step list below):

dfs.namenode.rpc-bind-host=0.0.0.0
dfs.namenode.servicerpc-bind-host=0.0.0.0
dfs.namenode.http-bind-host=0.0.0.0
dfs.namenode.https-bind-host=0.0.0.0

Additionally, I found that after making this change, both NameNodes under HA came up as standby; the article at https://community.hortonworks.com/articles/2307/adding-a-service-rpc-port-to-an-existing-ha-cluste.html supplied the missing step of running a ZooKeeper format. I have not tested the steps below against a Production cluster, and if you foolishly choose to follow these steps, you do so at a very large degree of risk (you could lose all of the data in your cluster). That said, this worked for me in a non-Prod environment:

1. Note the Active NameNode.
2. In Ambari, stop ALL services except for ZooKeeper.
3. In Ambari, make the indicated changes to HDFS.
4. Get to the command line on the Active NameNode (see Step 1 above).
5. At the command line you opened in Step 4, run: `sudo -u hdfs hdfs zkfc -formatZK`
6. Start the JournalNodes.
7. Start the ZKFCs.
8. Start the NameNodes, which should come up as Active and Standby. If they don't, you're on your own (see the "high risk" caveat above).
9. Start the DataNodes.
10. Restart/refresh any remaining HDFS components that have stale configs.
11. Start the remaining cluster services.

It would be great if HWX could vet my procedure and update the article accordingly (hint, hint).
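For reference, here is a sketch of how those four properties render into hdfs-site.xml once Ambari applies the "Custom hdfs-site" fields (Ambari manages the file itself; this is just what the generated entries look like):

```xml
<!-- Rendered by Ambari from "Custom hdfs-site"; do not hand-edit on Ambari-managed hosts -->
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.servicerpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.http-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.https-bind-host</name>
  <value>0.0.0.0</value>
</property>
```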
02-13-2016 10:16 AM
1 Kudo
We know Hadoop is used in a clustered environment: each cluster has multiple racks, and each rack has multiple DataNodes. To make HDFS fault tolerant in your cluster, you need to consider the following failures:

- DataNode failure
- Rack failure

The chance of a whole-cluster failure is fairly low, so let's not think about it. In the above cases you need to make sure that:

- If one DataNode fails, you can get the same data from another DataNode.
- If the entire rack fails, you can get the same data from another rack.

That's why I think the default replication factor is set to 3: no two replicas go to the same DataNode, and at least one replica goes to a different rack, fulfilling the fault-tolerance criteria above (see the command sketch below). Hope this will help.
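To make that concrete, here is a sketch of how you can inspect replica placement and change a file's replication factor from the command line (the path is hypothetical):

```sh
# Show block locations and rack placement for a file
hdfs fsck /user/alice/data.csv -files -blocks -racks

# Set the replication factor for that file and wait until it is met
hdfs dfs -setrep -w 3 /user/alice/data.csv
```

The cluster-wide default comes from the dfs.replication property in hdfs-site.xml; the commands above only override it per path.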
09-05-2018 04:23 AM
I have a 4-node cluster and this did not work for me. Same error: "/bucket_00003 could only be written to 0 of the 1 minReplication nodes. There are 4 datanode(s) running and no node(s) are excluded in this operation."
11-20-2015 06:44 AM
1 Kudo
Note: Credit for the key piece of information to solve this problem goes to Phil D'Amore.

A customer had a problem writing a Java application that used the Hive client libraries to talk to two secure Hadoop clusters that resided in different Kerberos realms. The same problem could be encountered by a client connecting to a single secure Hadoop cluster that happened not to be in the Kerberos "default_realm" as specified in the client host's krb5.conf file. The same problem could occur for any Hadoop ecosystem client, not just Hive clients.

In order to communicate with two different secure Hadoop clusters, in different Kerberos realms, the client application did the following things correctly:

- It harvested the needed configuration files (in this case, core-site.xml, hdfs-site.xml, and hive-site.xml) from each target cluster, and used the appropriate configuration when communicating with each respective cluster.
- Its application user ID had two Kerberos principals, one registered and authenticated with each of the two KDCs, and used the appropriate principal when authenticating to each respective cluster.
- On the client host, it had a krb5.conf file that correctly specified Kerberos kdc and admin_server values for each of the two target realms in the [realms] section, and set one of the realms as the "default_realm" in the [libdefaults] section. (It could also have set a third realm as the default_realm; that would just mean both target clusters were in non-default realms, which is also fine.)

However, when they ran the application, they had a puzzling problem: they were able to authenticate to the target cluster in the default realm, but failed with the target cluster in the non-default realm. Indeed, after the failure they found logs in the default_realm KDC that showed an incorrect attempt to authenticate to the wrong KDC.

They knew they had not made a coding error, because changing the default_realm to the other target cluster caused the situation to reverse. Depending on the setting of default_realm in the krb5.conf file, they could talk to either cluster, but not both at once.

The problem was fixed by adding a [domain_realm] section to the krb5.conf file. It turns out that the Thrift libraries underlying the client have APIs that do not communicate the target "realm", but only the target server. The Kerberos libraries are responsible for translating from the target server's domain to the target realm. If the domain and the realm have identical string values (except for upper/lower case), which is common but not required, the Kerberos library will use that. Failing that, it will use the default realm. It will not infer the realm from the domain of the KDC servers. In this case the domain and realm were different, so the authentication request for the non-default realm was being sent to the default realm's KDC. Adding a [domain_realm] section to the krb5.conf file allows arbitrary mappings from target domains to target realms, so Kerberos was finally able to translate from the desired target domain to the correct target realm (a sketch of such a krb5.conf follows below).

See http://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html#domain-realm for details of the krb5.conf file sections and contents.
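For illustration, here is a minimal krb5.conf sketch with two realms and a [domain_realm] mapping; all realm, domain, and host names here are hypothetical:

```ini
[libdefaults]
    default_realm = PROD.EXAMPLE.COM

[realms]
    PROD.EXAMPLE.COM = {
        kdc = kdc.prod-hadoop.example.com
        admin_server = kdc.prod-hadoop.example.com
    }
    ANALYTICS.EXAMPLE.ORG = {
        kdc = kdc.analytics-hadoop.example.org
        admin_server = kdc.analytics-hadoop.example.org
    }

[domain_realm]
    ; Map server domains to realms explicitly, because the domain names
    ; do not match the realm names (even ignoring case).
    .prod-hadoop.example.com = PROD.EXAMPLE.COM
    .analytics-hadoop.example.org = ANALYTICS.EXAMPLE.ORG
```

With a mapping like this in place, a connection to any host under either domain resolves to the correct realm, regardless of which realm is set as the default.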