Created on 02-04-2016 04:48 AM - edited 08-17-2019 01:24 PM
Apache Tez has become a very important framework and API for batch and interactive processing over terabytes and petabytes of data. It supports many engines within HDP, such as Pig, Hive, Java, Cascading, and others, with performance advantages at scale over MapReduce, and even over Spark at certain volumes of data.
This article outlines the best practices for configuring and tuning Tez, and explains why you would set certain values in certain properties to get performance at scale, with step-by-step instructions.
Step 0 - If you are a Hortonworks Support Subscription customer, start using the SmartSense tool. Hortonworks SmartSense is a cluster diagnostic and recommendation tool that is critical for efficient support case resolution, pre-emptive issue detection, and performance tuning. Your recommended Tez configurations would be provided to you as a customer. This is the value Hortonworks brings.
Step 1 - Determine your YARN NodeManager resource memory (yarn.nodemanager.resource.memory-mb) and your YARN minimum container size (yarn.scheduler.minimum-allocation-mb). Your yarn.scheduler.maximum-allocation-mb is the same as yarn.nodemanager.resource.memory-mb.
yarn.nodemanager.resource.memory-mb is the total amount of RAM on each node of the cluster that YARN can allocate to containers. yarn.scheduler.minimum-allocation-mb is the minimum memory YARN allocates to a container, and it is chosen based on the number of containers you want to run per node. yarn.scheduler.minimum-allocation-mb will be a very important setting for our Tez Application Master and container sizes.
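The relationship between these two YARN settings can be checked with simple arithmetic. The sketch below is only an illustration: the function name and the sample values (24576 MB and 3072 MB) are hypothetical and are not output from yarn-utils.py. It shows the upper bound on how many minimum-size containers fit on one NodeManager.

```python
def containers_per_node(nodemanager_memory_mb: int, min_allocation_mb: int) -> int:
    """Upper bound on minimum-size containers that one NodeManager can host.

    nodemanager_memory_mb -> yarn.nodemanager.resource.memory-mb
    min_allocation_mb     -> yarn.scheduler.minimum-allocation-mb
    """
    if min_allocation_mb <= 0:
        raise ValueError("minimum allocation must be positive")
    if min_allocation_mb > nodemanager_memory_mb:
        raise ValueError("minimum allocation cannot exceed NodeManager memory")
    return nodemanager_memory_mb // min_allocation_mb


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    print(containers_per_node(24576, 3072))  # prints 8
```

Larger Tez containers (see Step 2) reduce this count proportionally, which is why the minimum allocation is worth settling first.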
So how do we determine this with just the number of cores, disks, and RAM on each node? The Hortonworks easy button approach. Follow the instructions at this link, Determine HDP Memory Config.
For example, if you are on HDInsight running a D12 node with 8 CPUs and 28 GB of memory, with no HBase, you run:
python yarn-utils.py -c 8 -m 28 -d 2 -k False
Your output would look like this.
In Ambari, configure the appropriate settings for YARN and MapReduce. In a cluster that is not managed by Ambari, manually add the first three settings to yarn-site.xml and the rest to mapred-site.xml on all nodes.
Step 2 - Determine your Tez Application Master and Container Size, that is tez.am.resource.memory.mb and hive.tez.container.size.
Set tez.am.resource.memory.mb to be the same as yarn.scheduler.minimum-allocation-mb, the YARN minimum container size.
Set hive.tez.container.size to be the same as, or a small multiple (1 or 2 times) of, the YARN minimum container size yarn.scheduler.minimum-allocation-mb, but NEVER more than yarn.scheduler.maximum-allocation-mb. You want to leave headroom for multiple containers to be spun up.
A general guideline: don't exceed the memory available per processor, since you want one processor per container. For example, with 256 GB and 16 cores, you don't want your container to be bigger than 16 GB.
Set container reuse to true: tez.am.container.reuse.enabled (the default is true).
Prewarm containers when HiveServer2 starts, under Hive Configurations in Ambari.
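The sizing rules above can be sketched as a small helper. This is a minimal illustration under stated assumptions, not an official tool: the function name, its parameters, and the sample YARN values (4096 MB minimum, 204800 MB maximum) are hypothetical. Only the 256 GB / 16 core figures come from the example in this article.

```python
def size_tez_containers(yarn_min_mb: int, yarn_max_mb: int,
                        node_ram_mb: int, node_cores: int,
                        multiple: int = 1) -> dict:
    """Apply the Step 2 rules of thumb.

    - the Tez AM gets the YARN minimum container size
    - the Hive-on-Tez container is 1x or 2x the YARN minimum,
      never more than the YARN maximum,
      and never more than the memory available per core
    """
    if multiple not in (1, 2):
        raise ValueError("use a small multiple: 1 or 2")
    per_core_mb = node_ram_mb // node_cores
    container_mb = min(yarn_min_mb * multiple, yarn_max_mb, per_core_mb)
    return {
        "tez.am.resource.memory.mb": yarn_min_mb,
        "hive.tez.container.size": container_mb,
    }


if __name__ == "__main__":
    # 256 GB and 16 cores, as in the text; the YARN min/max are assumed values.
    settings = size_tez_containers(4096, 204800, 256 * 1024, 16, multiple=2)
    for name, value in settings.items():
        print(f"{name}={value}")
```

With these inputs the container size comes out at 8192 MB, comfortably below the 16 GB per-core ceiling.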
Step 3 - Application Master and Container Java heap sizes (tez.am.launch.cmd-opts and hive.tez.java.opts, respectively)
By default these are BOTH 80% of the container sizes, tez.am.resource.memory.mb and hive.tez.container.size, respectively.
NOTE: tez.am.launch.cmd-opts is automatically set, so no need to change this.
In HDP 2.3 and above, there is also no need to set hive.tez.java.opts, as it can be set automatically. This is controlled by a new property, tez.container.max.java.heap.fraction, which defaults to 0.8 in tez-site.xml. This property is not exposed by default in Ambari. If you wish, you can add it to the Custom tez-site.xml.
As you can see from Ambari, in Hive -> Advanced configurations, there are no manual memory configurations set for hive.tez.java.opts.
If you wish to make the heap 75% of the container, set the Tez Container Java Heap Fraction to 0.75.
If you wish to set this manually, you can add, for example, -Xmx7500m -Xms7500m to hive.tez.java.opts, as long as it is a fraction of hive.tez.container.size.
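If you do set the heap manually, the option string can be derived from the container size. The sketch below is illustrative only: the function name is hypothetical, and the 10000 MB container is an assumed value chosen so that a 0.75 fraction yields the 7500 MB heap used in the example above.

```python
def tez_java_opts(container_mb: int, heap_fraction: float = 0.8) -> str:
    """Build a JVM heap option string sized as a fraction of the container.

    container_mb  -> hive.tez.container.size
    heap_fraction -> same idea as tez.container.max.java.heap.fraction
    """
    if not 0 < heap_fraction < 1:
        raise ValueError("heap fraction must be between 0 and 1")
    heap_mb = int(container_mb * heap_fraction)
    # No space between the flag and its value: "-Xms 7500m" is not valid.
    return f"-Xmx{heap_mb}m -Xms{heap_mb}m"


if __name__ == "__main__":
    # Assumed container size of 10000 MB with a 75% heap.
    print(tez_java_opts(10000, 0.75))  # prints -Xmx7500m -Xms7500m
```

The remaining fraction of the container is left for non-heap JVM memory, which is why the heap should stay below the full container size.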
Set tez.runtime.io.sort.mb to 40% of hive.tez.container.size. You should rarely need to set more than 2 GB.
By default, hive.auto.convert.join.noconditionaltask = true.
Set hive.auto.convert.join.noconditionaltask.size to 1/3 of hive.tez.container.size. Note that this property is specified in bytes, while hive.tez.container.size is in MB.
Set tez.runtime.unordered.output.buffer.size-mb to 10% of hive.tez.container.size.
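These three ratios can be computed together from hive.tez.container.size. The following is a rough sketch, assuming a hypothetical helper name and a 4096 MB container as sample input. It treats the 2 GB guidance for the sort buffer as a hard cap, which is an assumption rather than a rule, and converts the map-join threshold to bytes because hive.auto.convert.join.noconditionaltask.size is expressed in bytes.

```python
def tez_memory_settings(container_mb: int, sort_cap_mb: int = 2048) -> dict:
    """Derive Tez/Hive memory settings from hive.tez.container.size (in MB)."""
    if container_mb <= 0:
        raise ValueError("container size must be positive")
    # 40% of the container, capped (assumption: treat 2 GB as a ceiling).
    sort_mb = min(container_mb * 40 // 100, sort_cap_mb)
    # 1/3 of the container, converted from MB to bytes.
    join_bytes = container_mb * 1024 * 1024 // 3
    # 10% of the container.
    unordered_mb = container_mb * 10 // 100
    return {
        "tez.runtime.io.sort.mb": sort_mb,
        "hive.auto.convert.join.noconditionaltask.size": join_bytes,
        "tez.runtime.unordered.output.buffer.size-mb": unordered_mb,
    }


if __name__ == "__main__":
    # Assumed container size of 4096 MB.
    for name, value in tez_memory_settings(4096).items():
        print(f"{name}={value}")
```

For a 4096 MB container this gives a 1638 MB sort buffer, a map-join threshold of 1431655765 bytes, and a 409 MB unordered output buffer.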