Member since: 12-09-2015
Posts: 6
Kudos Received: 13
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 244 | 12-20-2016 06:36 PM
 | 765 | 09-13-2016 04:03 AM
 | 607 | 09-08-2016 04:47 AM
12-20-2016
06:36 PM
3 Kudos
Hi, we have an official document for deploying HDP on VMs: https://hortonworks.com/wp-content/uploads/2014/02/1514.Deploying-Hortonworks-Data-Platform-VMware-vSphere-0402161.pdf. It has a reference link for the HVE (NodeGroup) feature, which includes the details you may want to know.
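For reference, enabling the NodeGroup-aware topology that HVE introduces mostly comes down to a few topology settings; a minimal sketch (property names as used in the HVE work, please verify against the guide for your HDP version):

```xml
<!-- core-site.xml: use the NodeGroup-aware network topology (sketch) -->
<property>
  <name>net.topology.impl</name>
  <value>org.apache.hadoop.net.NetworkTopologyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.nodegroup.aware</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: block placement policy that understands node groups -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup</value>
</property>
```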
09-19-2016
01:08 AM
1 Kudo
As @Rajkumar Singh mentioned above, the application state (life-cycle related events) gets persisted by the RM via the RMStateStore. For the ApplicationMaster (AM), there is no central place for the AM to persist its own internal state, but the AM is free to pick a place to store temporary results/progress, so an AM failure/restart doesn't have to throw away all the progress the previous AM attempt made. Take MapReduce for example: when configured properly, the MR AM will, after a restart, read the finished map/reduce tasks from the job history files on HDFS, so finished map/reduce tasks won't be re-executed after the AM restart. Other applications can implement similar behavior if they want to persist something.
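As an illustration of both pieces (RM state store plus MR AM recovery), here is a minimal sketch of the relevant properties, assuming a ZooKeeper-backed RM state store; values are illustrative and defaults vary by version:

```xml
<!-- yarn-site.xml: let the RM persist application state in a state store -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<!-- mapred-site.xml: allow a restarted MR AM to recover finished tasks from job history files -->
<property>
  <name>yarn.app.mapreduce.am.job.recovery.enable</name>
  <value>true</value>
</property>
```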
09-19-2016
12:56 AM
Hi Arun, you can configure RM HA so that the second RM (called the standby RM) automatically takes over when the first RM (the active RM) goes down due to hardware/software issues. For how to deploy it, please refer to: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/ha-rm-deploy-ra-cluster.html . For the MapReduce JHS (Job HistoryServer), all the write paths are in the application's AM (which writes to HDFS directly) and the JHS only acts as a reader, so any time you find it down, simply bringing it back up should be fine. The ATS (Application Timeline Server) is a bit more complicated: v1 is based on LevelDB (a local key-value store) and v1.5 builds on it (adding an HDFS cache layer), so both still have a SPOF (Single Point of Failure) problem. ATS v2 is still in development; it has a distributed write path and uses HBase as the backend store, which gets rid of the SPOF issue entirely. I would suggest keeping ATS v1.5 for now and upgrading to v2 in the future when it is ready.
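For orientation, RM HA boils down to a handful of yarn-site.xml properties; a minimal sketch with hypothetical hostnames (rm1.example.com, rm2.example.com and the ZooKeeper quorum are placeholders), see the linked doc for the full list:

```xml
<!-- yarn-site.xml: enable RM HA with two RMs coordinated through ZooKeeper -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>rm1.example.com</value> <!-- hypothetical host -->
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>rm2.example.com</value> <!-- hypothetical host -->
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value> <!-- hypothetical ZK quorum -->
</property>
```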
09-13-2016
04:03 AM
2 Kudos
Hi Arun, the FairScheduler is not the HDP-recommended/supported resource scheduler, so we don't have documentation covering it. Please refer to the Apache docs: https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/FairScheduler.html. I would like to add some background here: the preemption feature was first added to the CapacityScheduler, where it is quite mature and production ready. For the FairScheduler, I am not exactly sure of its status (alpha or GA), but I notice several fixes are still in progress in the community: https://issues.apache.org/jira/browse/YARN-4752 Is there any special reason to use the FairScheduler? If not, you can also try preemption with the CapacityScheduler. Here is the doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_yarn_resource_mgt/content/preemption.html
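If you go the CapacityScheduler route, preemption is enabled through the scheduler monitor; a minimal sketch (tuning values are illustrative, see the linked HDP doc for their meanings):

```xml
<!-- yarn-site.xml: turn on the preemption monitor for the CapacityScheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>

<!-- optional tuning (illustrative values; defaults differ by version) -->
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval</name>
  <value>3000</value>
</property>
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round</name>
  <value>0.1</value>
</property>
```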
09-08-2016
04:47 AM
7 Kudos
This configuration has been around since MR v1. It serves as an upper limit on the number of DN locations per job split, intended to protect the JobTracker from being overloaded by jobs with huge numbers of split locations. For YARN in Hadoop 2, this concern is lessened because we have a per-job AM instead of the JT. However, it can still impact the RM, since the RM may see heavy requests from an AM that tries to obtain many localities for its splits. When the limit is hit, the location list is truncated to the given limit, sacrificing a bit of data locality but removing the risk of the RM becoming a bottleneck. Depending on your job's priority (I believe it is a per-job configuration now), you can leave it at the default (for lower or normal priority jobs) or increase it to a larger number. Increasing the value beyond the number of DNs has the same effect as setting it to the number of DNs.
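For reference, in Hadoop 2 this is the per-job property mapreduce.job.max.split.locations; a minimal sketch (the value 50 is purely illustrative and the default is version-dependent; it can also be passed per job with -D):

```xml
<!-- mapred-site.xml (or per job via -Dmapreduce.job.max.split.locations=...) -->
<property>
  <name>mapreduce.job.max.split.locations</name>
  <value>50</value> <!-- illustrative; cap on recorded split locations -->
</property>
```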
12-09-2015
06:02 PM
The ShuffleHandler is part of the NM auxiliary services, so it uses the NM daemon's memory rather than a container's. Increasing the container heap size therefore may not help in your case. You should increase the NM daemon's heap size instead, as suggested by @Neeraj Sabharwal. @Hajime
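As a sketch of where the NM daemon heap is set when not managed through Ambari, it is the NodeManager heap size in yarn-env.sh; the value below is purely illustrative:

```sh
# yarn-env.sh: NodeManager daemon heap in MB (illustrative value, size for your workload)
export YARN_NODEMANAGER_HEAPSIZE=2048
```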