Member since: 03-14-2022
Posts: 2
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1747 | 08-22-2022 02:35 PM |
08-22-2022 02:40 PM
Can you provide an example with a single field and a reproducible error case?
08-22-2022 02:35 PM
Hi, a few points on your question:

- The order of the parameters does not matter.
- If you do not persist the settings in the configuration, you have to apply them at the start of each session.
- These parameters are not the holy grail.
- Vectorized execution can lead to errors and wrong results under specific circumstances. Use it only if it is required and known to work with the UDFs you use.
- Using CBO and fetching stats can improve performance. Under the wrong circumstances, however, it can lead to a long stats-gathering period at the end of a query, which may cost more than the performance gain.
- Auto convert join should only be used if you know the sizes of the tables. Setting this property to true triggers a map join only if one table fits in memory; otherwise the setting has no effect and you will see no improvement in runtime.

I can really recommend this article by a fellow Clouderan: https://community.cloudera.com/t5/Community-Articles/Hive-on-Tez-Performance-Tuning-Determining-Reducer-Counts/ta-p/245680

If you have concrete questions about optimizing a specific query, do not hesitate to ask.
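For illustration, here is a rough sketch of what applying such settings at the start of a session could look like. The property names are my assumption about which parameters were meant, since the question is not quoted here, and the values are placeholders rather than recommendations. Check them against your Hive version and your own table sizes.

```sql
-- Session-level settings: these last only for the current session unless
-- they are persisted in the configuration. The order of the statements
-- does not matter.

-- Vectorized execution: enable only if it is known to work with your UDFs.
SET hive.vectorized.execution.enabled=true;

-- Cost-based optimizer and use of stored statistics.
SET hive.cbo.enable=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;

-- Map join conversion: only helps if the small table fits in memory.
SET hive.auto.convert.join=true;
-- Size threshold in bytes for the small side (placeholder value, an assumption).
SET hive.auto.convert.join.noconditionaltask.size=10000000;
```

Running `SET hive.auto.convert.join;` without a value prints the current setting, which is a quick way to check what is actually active in your session.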