Will adding nodes improve performance?
- Labels: Apache Hive, Apache Spark
Created on 02-21-2017 01:18 PM - edited 09-16-2022 04:07 AM
Hi.
Would adding 3 nodes to my 3-node cluster increase performance by at least 2x? Or are there other parameters to consider in order to reach 2x?
Thanks
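As a rough way to reason about the 2x question, here is a minimal sketch based on Amdahl's law. It assumes a job splits into a part that scales with node count and a part that does not (planning, a single reducer, skewed keys, metastore calls). The function name `amdahl_speedup` and the sample fractions are illustrative assumptions, not measurements from any cluster.

```python
def amdahl_speedup(parallel_fraction: float, scale: float) -> float:
    """Estimated speedup when only the parallel part of a job runs `scale` times faster."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if scale <= 0:
        raise ValueError("scale must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / scale)


if __name__ == "__main__":
    # Going from 3 to 6 nodes doubles the capacity available to the parallel part.
    scale = 6 / 3
    for fraction in (1.0, 0.9, 0.5):  # hypothetical parallel fractions
        print(f"parallel fraction {fraction:.1f}: {amdahl_speedup(fraction, scale):.2f}x")
```

Under this model, doubling the nodes gives 2x only if the whole job parallelizes perfectly; any serial portion pulls the result below 2x, before other factors such as data locality or scheduler settings are considered.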
Created on 10-16-2017 07:01 AM - edited 10-16-2017 07:04 AM
We seem to have exactly the same problem. We added 20 nodes to our existing 60-node cluster, bringing it to 80 nodes. The new nodes have the same configuration and capacity as the old ones. We run heavy, concurrent jobs (Hive queries) that can easily saturate the servers at 100%, which confirms that the cluster is not under-utilized. We rebalanced the data and verified that it is evenly distributed across the data nodes. We don't see any improvement at all after the upgrade; the job timings are the same as before.
Do we need to update statistics, the metastore, or any other configuration for the new nodes to take effect in terms of performance? Any insights on this would be much appreciated.
Created 10-19-2017 07:15 PM
There are a couple of places that need tuning at the query level:
1. Table statistics are a must for good performance.
2. When joining two tables, make sure the smaller table comes first and the larger table comes last.
3. You can also use hints to improve query performance.
4. The Hive table's file format is a big factor.
5. Choose carefully when to use partitioning versus bucketing.
6. Allocate enough memory to HiveServer2 and the metastore.
7. Heap size.
8. A load balancer on the host.
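To make the first three points concrete, here is a small sketch that builds the corresponding HiveQL statements as strings. The helper names and the table names `dim_store` and `fact_sales` are hypothetical; `ANALYZE TABLE ... COMPUTE STATISTICS` and the `MAPJOIN` hint are standard HiveQL, but whether a hint is honored depends on the Hive version and settings such as `hive.ignore.mapjoin.hint`, so treat this as an illustration rather than a recommended configuration.

```python
from typing import List


def analyze_statements(table: str, with_columns: bool = True) -> List[str]:
    """Build HiveQL statements that gather table (and optionally column) statistics."""
    statements = [f"ANALYZE TABLE {table} COMPUTE STATISTICS"]
    if with_columns:
        statements.append(f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS")
    return statements


def mapjoin_query(small_table: str, large_table: str, key: str) -> str:
    """Build a join with the small table first, the large table last, and a MAPJOIN hint."""
    return (
        f"SELECT /*+ MAPJOIN(s) */ l.* "
        f"FROM {small_table} s JOIN {large_table} l ON s.{key} = l.{key}"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    for statement in analyze_statements("fact_sales"):
        print(statement + ";")
    print(mapjoin_query("dim_store", "fact_sales", "store_id") + ";")
```

The generated statements can be run through any Hive client, such as Beeline. Statistics generally need to be refreshed after large data loads for the optimizer to benefit from them.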
