Created on 02-21-2017 01:18 PM - edited 09-16-2022 04:07 AM
Hi.
Would adding 3 nodes to my 3-node cluster obviously increase performance by at least 2x? Or are there more parameters to consider to reach 2x?
Thanks
Created on 10-16-2017 07:01 AM - edited 10-16-2017 07:04 AM
We seem to have the exact same problem. We added 20 nodes to our existing cluster of 60 nodes, which makes it 80 nodes. The new nodes have the same configuration and capacity as the old ones. We do have heavy, concurrent jobs (Hive queries) that can easily load the servers to 100%, which confirms that the cluster is not under-utilized. We rebalanced the data and verified that it is evenly distributed across the data nodes. We don't see any improvement at all after the upgrade; the job timings are the same as before the upgrade.
Do we need to update stats, the metastore, or any other configuration for the new nodes to take effect in terms of performance? Any insight on this is much appreciated.
Created 10-19-2017 07:15 PM
There are a couple of places that need tuning at the query level:
1. Table statistics are a must for good performance.
2. When joining two tables, make sure the smaller table comes first and the larger table comes last.
3. You can also use hints to improve query performance.
4. The Hive table's file format is a big factor.
5. Choose carefully when to use partitioning vs. bucketing.
6. Allocate enough memory to HiveServer2 and the metastore.
7. Heap size.
8. A load balancer on the host.
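On the original question of whether adding nodes scales performance linearly, one rough way to reason about it is Amdahl's law: only the part of a job that actually runs in parallel gets faster with more nodes. The sketch below is a back-of-the-envelope estimate, not a measurement of anyone's cluster. The function names and the `parallel_fraction` values are assumptions chosen for illustration; the real fraction depends on the queries, data layout, and the tuning points listed above.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Upper bound on speedup over a single node (Amdahl's law).

    parallel_fraction is the share of the job that can be spread
    across nodes (an assumed value, not something measured here).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def relative_gain(parallel_fraction: float, old_nodes: int, new_nodes: int) -> float:
    """Estimated speedup from growing a cluster from old_nodes to new_nodes."""
    return (amdahl_speedup(parallel_fraction, new_nodes)
            / amdahl_speedup(parallel_fraction, old_nodes))


if __name__ == "__main__":
    # Hypothetical parallel fractions, for illustration only.
    for p in (0.5, 0.9, 0.99, 1.0):
        print(f"p={p}: 3->6 nodes gives {relative_gain(p, 3, 6):.2f}x, "
              f"60->80 nodes gives {relative_gain(p, 60, 80):.2f}x")
```

Under this model, going from 3 to 6 nodes gives exactly 2x only if the whole job parallelizes; with an assumed parallel fraction of 0.9 it gives 1.6x. Going from 60 to 80 nodes with the same assumed fraction gives only about 1.03x, which would be hard to notice in job timings.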