Member since: 11-29-2016
Posts: 17
Kudos Received: 2
Solutions: 2
My Accepted Solutions
Views | Posted |
---|---|
1076 | 12-13-2016 09:36 PM |
612 | 11-30-2016 03:37 PM |
07-27-2017
12:04 PM
You are correct in determining that your compute is constrained and HDFS is not. Before scaling out, you can try the following:

- Optimize your jobs/queries. If you are running Hive queries, there is probably large potential to optimize them. Tez configurations may need tuning as well (see links below).
- Reconfigure YARN queues to prioritize user jobs over other jobs (e.g. batch ETL) by allowing user queues to preempt the other queues.

If these do not prevent YARN memory saturation (first bullet) or speed up user jobs/queries (second bullet), then you will need to scale out by adding more data nodes.

You should also be doing capacity planning. If you project that your cluster usage will increase steadily (more jobs, more concurrent users), then optimizing as above is likely only buying you some time before the increased usage brings you back to the same state. Note also that if you project an increase in data stored on the cluster, your HDFS utilization will climb steadily from the current 60%. It is good practice not to let it exceed 80%, since disk space is also needed for writing intermediate results during jobs. If you are on bare metal, you will need some lead time to procure and rack-and-stack your data nodes, so plan to scale out well before HDFS capacity hits 80% or cluster usage increases significantly.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.5/bk_hive-performance-tuning/bk_hive-performance-tuning.pdf
https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_yarn-resource-management/bk_yarn-resource-management.pdf
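
To make the capacity-planning point concrete, here is a minimal sketch of the arithmetic, assuming HDFS utilization grows linearly. The function names, the growth rate, and the procurement lead time are hypothetical illustrations; only the 60% current utilization and the 80% ceiling come from the post above.

```python
def days_until_threshold(current_pct, daily_growth_pct, threshold_pct=80.0):
    """Days until utilization reaches the threshold, assuming linear growth."""
    if current_pct >= threshold_pct:
        return 0.0  # already at or past the ceiling
    if daily_growth_pct <= 0:
        return float("inf")  # no growth: the threshold is never reached
    return (threshold_pct - current_pct) / daily_growth_pct


def latest_order_day(current_pct, daily_growth_pct, lead_time_days,
                     threshold_pct=80.0):
    """Latest day (counted from today) to start procurement so that new
    nodes are racked before utilization hits the threshold."""
    remaining = days_until_threshold(current_pct, daily_growth_pct,
                                     threshold_pct)
    return max(0.0, remaining - lead_time_days)


if __name__ == "__main__":
    # 60% is the utilization mentioned in the post; the growth rate and
    # lead time below are made-up numbers for illustration only.
    print(days_until_threshold(60.0, 0.25))
    print(latest_order_day(60.0, 0.25, lead_time_days=30))
```

Real growth is rarely linear, so treat the result as a rough planning bound and re-run it as usage data accumulates.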
01-25-2017
06:55 PM
@Dezka Dex Thanks for the update. I think that is an OS restriction; for background, see https://www.staldal.nu/tech/2007/10/31/why-can-only-root-listen-to-ports-below-1024/
04-12-2018
08:08 AM
@Rakesh Kumar The thread you are referring to was closed, and I doubt that members attend to old threads. I advise you to open a new thread and, if possible, attach the logs, since errors can differ from case to case. Please do that!
05-04-2017
01:34 PM
I had this issue with 2.6 too. Thanks! So that other people can find it, this was my error: Execution of '/usr/bin/zypper --quiet install --auto-agree-with-licenses --no-confirm hadoop_2_5_0_2_3-yarn' returned 104. No provider of 'hadoop_2_5_0_2_3-yarn' found.
11-30-2016
03:37 PM
I figured this out. It is because of my network layout. The HDP machines all bind to and listen on the 172 addresses. When Ambari tries to connect to the name node on port 50070, for instance, it can't, because the name node is listening on 50070 on the 172 network, which Ambari isn't on. I resolved the issue by setting up multi-homing: https://community.hortonworks.com/articles/24277/parameters-for-multi-homing.html
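
A quick way to confirm this kind of reachability problem from the Ambari host is a plain TCP connect test. The sketch below is only an illustration: the helper name `can_connect` and the hostname are hypothetical, and port 50070 is the name node port mentioned above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # "namenode.example.internal" is a placeholder; use your name node's
    # address as seen from the Ambari host.
    print(can_connect("namenode.example.internal", 50070))
```

If this returns False from the Ambari host but True from a machine on the 172 network, the service is most likely bound only to the interface Ambari cannot reach.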