Member since: 07-30-2019
Posts: 111
Kudos Received: 186
Solutions: 35
My Accepted Solutions
Views | Posted
---|---
3448 | 02-07-2018 07:12 PM
2662 | 10-27-2017 06:16 PM
2895 | 10-13-2017 10:30 PM
5244 | 10-12-2017 10:09 PM
1357 | 06-29-2017 10:19 PM
08-17-2016
06:11 PM
Hi @William Bolton, are these applications accessing HDFS directly? What is the mode of access, e.g. the WebHDFS REST API, the Java APIs, or something else?
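For context, here is roughly what those two access modes look like (hostnames and ports below are examples, assuming Hadoop 2.x defaults):

```
# WebHDFS REST API: list a directory over HTTP (50070 is the default
# NameNode HTTP port in Hadoop 2.x; adjust for your cluster).
curl -i "http://nn-host.example.com:50070/webhdfs/v1/user/test?op=LISTSTATUS"

# HDFS client / Java API path: goes over the NameNode RPC port (8020 default).
hdfs dfs -ls hdfs://nn-host.example.com:8020/user/test
```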
08-17-2016
06:01 PM
Hi @jovan karamacoski, are you able to share what your overall goal is? The NameNode detects DataNode failures in roughly 10 minutes and queues re-replication work. Disk failures can take longer to detect, and we are planning improvements in this area soon. The re-replication logic is complex; if you think your changes will be broadly useful, please consider filing an issue in the Apache HDFS Jira and submitting the changes as a patch. Best, Arpit.
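For reference, the ~10 minute figure comes from the standard dead-node detection formula; a quick way to check the effective values on your cluster:

```
# A DataNode is marked dead after roughly:
#   2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
# With the defaults (300000 ms and 3 s): 2*300s + 10*3s = 630s, about 10.5 min.
hdfs getconf -confKey dfs.namenode.heartbeat.recheck-interval
hdfs getconf -confKey dfs.heartbeat.interval
```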
08-16-2016
11:58 PM
3 Kudos
Commenting to clarify that some of the advice above, while not wrong, can be dangerous. Starting with HDP 2.2, the DataNode is stricter about where it expects block files to be, so I do not recommend manually moving block files or folders around on DataNodes unless you really know what you are doing. @jovan karamacoski, to answer your original question: the NameNode drives re-replication (specifically the BlockManager class within the NameNode). The ReplicationMonitor thread wakes up periodically and computes re-replication work for DataNodes. The re-replication logic has multiple triggers, such as block reports, heartbeat timeouts, and decommissioning.
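If you want to watch the re-replication work the NameNode has queued, the standard tools report it directly:

```
# fsck summarizes replication health, including under-replicated
# and mis-replicated block counts.
hdfs fsck / | grep -i 'replicated'

# dfsadmin -report shows the cluster-wide "Under replicated blocks" total.
hdfs dfsadmin -report
```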
08-08-2016
07:43 PM
Thanks for the heads up @Kuldeep Kulkarni. It could be a couple of things: either the ephemeral port was in use by another process that has since exited, or a process is still using the port but running under different user credentials. @vijay kadel, were you running the ps/netstat/lsof commands as the root user?
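To rule out the credentials case, re-run the checks as root, since non-root users generally cannot see other users' processes behind a socket (port 1019 below is just an example):

```
# Run as root; show which process owns the port, if any.
netstat -anp | grep ':1019'
lsof -i :1019
```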
08-04-2016
09:53 PM
1 Kudo
The question is unclear to me, but I recommend reading the following three blog posts carefully; they go into great detail about balancer basics, configuration, and best practices:
https://community.hortonworks.com/articles/43615/hdfs-balancer-1-100x-performance-improvement.html
https://community.hortonworks.com/articles/43849/hdfs-balancer-2-configurations-cli-options.html
https://community.hortonworks.com/articles/44148/hdfs-balancer-3-cluster-balancing-algorithm.html
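For a concrete starting point, a typical balancer invocation looks like this (the threshold shown is the default; tune it per the articles above):

```
# Balance until every DataNode is within 10% of the average utilization.
hdfs balancer -threshold 10

# Optionally cap per-DataNode balancing bandwidth (bytes/sec; example value).
hdfs dfsadmin -setBalancerBandwidth 10485760
```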
08-04-2016
09:47 PM
2 Kudos
Hi @ripunjay godhani, we no longer recommend setting up NameNode HA with NFS. Instead, please use the Quorum Journal Manager (QJM) setup. The Apache HA with QJM documentation is a good starting point: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html In this setup, NameNode image files are stored on two nodes (the active and standby NameNodes). The latest edit logs live on the active NameNode and on at least two JournalNodes (usually all three, unless one JournalNode has extended downtime). The NameNodes can optionally be configured to also write their edit logs to separate NFS shares, but it is not necessary. You don't need RAID 10; HDFS HA with QJM provides good durability and availability on commodity hardware.
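To illustrate, with QJM the shared edits directory points at the JournalNode quorum rather than an NFS path (hostnames below are hypothetical):

```
hdfs getconf -confKey dfs.namenode.shared.edits.dir
# Example output (8485 is the default JournalNode port):
#   qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster
```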
08-04-2016
09:33 PM
1 Kudo
@sgowda, thanks for confirming you just want to mount the volumes at a new location. If you are only remounting, your existing HDFS metadata and data files will still be present, just under new Linux paths, so decommissioning is not necessary. You only need to update the NameNode and DataNode configuration settings (such as dfs.namenode.name.dir and dfs.datanode.data.dir) to point to the new locations; see the hdfs-default.xml reference for the full list of settings, though not all may apply to you. Do not reformat the NameNode, or you will lose all your data. The simplest approach is:
1. Take a full cluster downtime and bring down all HDFS services.
2. Remount the volumes at the new location on all affected nodes.
3. Update the NameNode and DataNode configurations via Ambari to point to the new storage roots.
4. Restart services.
A command-level sketch of these steps is below. If you are not familiar with these settings, I recommend learning more about HDFS first, since it's easy to lose data through administrative mistakes.
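A minimal sketch of those steps, with illustrative device and mount paths (yours will differ; use Ambari for the service stop/start and config edits):

```
# 1. With all HDFS services stopped, remount on each affected node.
umount /hadoop/hdfs/data              # old mount point (example)
mount /dev/sdb1 /data01/hdfs/data     # new mount point (example)

# 2. In Ambari, point the settings at the new storage roots, e.g.:
#      dfs.datanode.data.dir = /data01/hdfs/data
#      dfs.namenode.name.dir = /data01/hdfs/namenode
# 3. Restart HDFS, then verify all volumes and capacity came back.
hdfs dfsadmin -report
```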
08-03-2016
11:13 PM
Are you just mounting volumes at a new location?
08-01-2016
11:33 PM
3 Kudos
Hi @Facundo Bianco, you are using a privileged port number (1004) for data transfer, so you cannot enable SASL. Please check your hdfs-site.xml to ensure SASL is not enabled via dfs.data.transfer.protection. The Secure DataNode section of the Apache HDFS documentation describes this: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Secure_DataNode Since you are using HDP with Ambari, I recommend using the Ambari Kerberos Wizard, especially if you are setting up Kerberos for the first time; at the very least it will give you a working reference configuration. The Ambari Kerberos Wizard is documented here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/_running_the_kerberos_wizard.html
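A quick way to confirm the setting (the property name is the standard one from hdfs-site.xml):

```
# Prints authentication, integrity, or privacy if SASL protection is set;
# reports the key as missing if unset, which is what you want with port 1004.
hdfs getconf -confKey dfs.data.transfer.protection
```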
08-01-2016
08:16 PM
4 Kudos
If you have set up automatic failover with ZooKeeper Failover Controllers, the ZKFC processes will automatically transition the standby NameNode to active if the current active is unresponsive. The decision about which NameNode becomes active is made by the ZKFC instances (coordinating via ZooKeeper); Ambari does not decide which NameNode should be active. If you wish to perform a manual failover, you can use the hdfs haadmin command as @Sagar Shimpi suggested. Both alternatives are described in the HDP documentation:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/ha-nn-deploy-nn-cluster.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/nn-ha-auto-failover.html
If you want to better understand the internals of automatic NameNode failover (recommended if you administer a Hadoop cluster with HA), read the Apache docs, specifically the section on Automatic Failover: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
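For the manual path, the relevant commands look like this (nn1/nn2 are the logical NameNode IDs from dfs.ha.namenodes.<nameservice>; yours may differ):

```
# Check which NameNode is currently active/standby.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Graceful manual failover from nn1 to nn2.
hdfs haadmin -failover nn1 nn2
```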