Member since: 09-29-2015
286 Posts
601 Kudos Received
60 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 11459 | 03-21-2017 07:34 PM
 | 2884 | 11-16-2016 04:18 AM
 | 1608 | 10-18-2016 03:57 PM
 | 4266 | 09-12-2016 03:36 PM
 | 6214 | 08-25-2016 09:01 PM
04-22-2016
04:53 PM
8 Kudos
My question is: how can we specify partitioning by month on a date field?
In our case, I don't think partitioning on dates makes sense, as we would have approximately 100,000 partitions and related files for this one table alone. Also, is there a severe performance penalty if there is a high number of partitions?
Does Hortonworks have a manual that goes over Hive table design tradeoffs, implications and limitations?
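(For illustration only, not part of the original question: the usual way to get month-level partitions is to derive a month value from the date column and use that as the partition key, loaded via dynamic partitioning. All table and column names below are hypothetical.)
-- Hypothetical sketch: one partition per month instead of one per date.
CREATE TABLE events_by_month (
  event_id   BIGINT,
  event_date DATE,
  payload    STRING
)
PARTITIONED BY (event_month STRING)
STORED AS ORC;

set hive.exec.dynamic.partition.mode=nonstrict;

-- raw_events is a hypothetical source table holding the raw rows.
INSERT OVERWRITE TABLE events_by_month PARTITION (event_month)
SELECT event_id, event_date, payload,
       date_format(event_date, 'yyyy-MM') AS event_month
FROM raw_events;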
... View more
Labels:
- Apache Hive
04-06-2016
02:26 PM
10 Kudos
Don't use SSL or SSL=true with Kerberos; SSL and Kerberos are not compatible. Kerberos uses SASL instead. hive.server2.thrift.sasl.qop in hive-site.xml has to be set to one of the valid QOP values ('auth', 'auth-int' or 'auth-conf'). See https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Integrity/ConfidentialityProtection
Then use, for example:
jdbc:hive2://hostname/dbname;sasl.qop=auth-int|auth|auth-conf
jdbc:hive2://sandbox.hortonworks.com:10001/default;principal=hive/sandbox.hortonworks.com@HORTONWORKS.COM?transportMode=http;httpPath=cliservice;auth=kerberos;sasl.qop=auth-int
See https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_Security_Guide/content/ch_wire-connect.html
... View more
04-04-2016
06:01 PM
1 Kudo
Please take a look at these articles: https://community.hortonworks.com/content/kbentry/22419/hive-on-tez-performance-tuning-determining-reducer.html and https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
... View more
03-11-2016
12:43 AM
40 Kudos
How does Tez determine the number of reducers? How can I control this for performance? In this article, I will attempt to answer this while executing and tuning an actual query to illustrate the concepts. Then I will provide a summary with a full explanation. If you wish, you can skip ahead to the summary.
-------------
0. Prep Work and Checklist
We followed the Tez memory tuning steps as outlined in https://community.hortonworks.com/content/kbentry/14309/demystify-tez-tuning-step-by-step.html
We set up our environment, turning CBO and vectorization on:
set hive.support.sql11.reserved.keywords=false;
set hive.execution.engine=tez;
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled = true;
set hive.vectorized.execution.reduce.groupby.enabled = true;
set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=16;
We created ORC tables and did an Insert Overwrite into the table with partitions:
set hive.exec.dynamic.partition.mode=nonstrict;
# With many partition columns there is a danger of generating many broken files in ORC. To prevent that:
> set hive.optimize.sort.dynamic.partition=true;
# If Hive jobs previously ran much faster than in the currently released version, look into setting hive.optimize.sort.dynamic.partition=false.
> insert overwrite table benchmark_rawlogs_orc partition (p_silo,p_day,p_clienthash)
select * FROM <original table>;
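The DDL of the target table is not shown above; a minimal sketch of what such a partitioned ORC table could look like follows (the data columns are made up for illustration; only the table name and the partition columns p_silo, p_day and p_clienthash appear above):
-- Hypothetical column list; only the table name and partition columns come from this article.
CREATE TABLE benchmark_rawlogs_orc (
  log_id   BIGINT,
  log_ts   TIMESTAMP,
  log_line STRING
)
PARTITIONED BY (p_silo STRING, p_day STRING, p_clienthash STRING)
STORED AS ORC;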
We generated the statistics we needed for use in the query execution:
-- generate statistics for the ORC table
set hive.stats.autogather=true;
-- to generate statistics for the entire table and columns for all days (longer):
ANALYZE TABLE rawlogs.benchmark_rawlogs_orc partition (p_silo, p_day, p_clienthash) COMPUTE STATISTICS;
ANALYZE TABLE rawlogs.benchmark_rawlogs_orc partition (p_silo, p_day, p_clienthash) COMPUTE STATISTICS for columns;
--------------------------------
1. First Execution of Query
Here we can see 61 mappers were created, which is determined by the grouping of splits; if not grouped, it most likely corresponds to the number of files or split sizes in the ORC table. For a discussion on the number of mappers determined by Tez, see "How are Mappers Determined For a Query" and "How initial task parallelism works".
The mappers complete quickly but the execution is stuck at 89% for a long time. We observe that there are three vertices in this run: one mapper stage and two reducer stages. The first reducer stage ONLY has two reducers that have been running forever? hmmmm...
The query finally completed in 60 secs. What gives? Why only 2 reducers? Let's look at the explain plan.
-------------------------------------------------------
2. The LONGGGGGG Explain Plan
Let's look at the relevant portions of this explain plan. We see in red that in the reducer stage, 14.5 TB of data, across 13 million rows, are processed. This is a lot of data to funnel through just two reducers. The final output of the reducers is just 190944 bytes (in yellow), after the initial group bys of count, min and max. We need to increase the number of reducers.
-------------------------------------------
3. Set Tez Performance Tuning Parameters
When Tez executes a query, it initially determines the number of reducers it needs and automatically adjusts as needed based on the number of bytes processed.
- Manually set number of Reducers (not recommended)
To manually set the number of reducers we can use the parameter mapred.reduce.tasks. By default it is set to -1, which lets Tez automatically determine the number of reducers. You can manually set it to the number of reducer tasks (not recommended):
> set mapred.reduce.tasks = 38;
It is better to let Tez determine this and make the proper changes within its framework, instead of using the brute force method:
> set mapred.reduce.tasks = -1;
- How to Properly Set Number of Reducers
First we double-check that auto reducer parallelism is on. The parameter is hive.tez.auto.reducer.parallelism. See https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.tez.auto.reducer.parallelism
It is set to true.
# Turn on Tez' auto reducer parallelism feature. When enabled, Hive will still estimate data sizes and set parallelism estimates. Tez will sample source vertices' output sizes and adjust the estimates at runtime as necessary.
> set hive.tez.auto.reducer.parallelism;
> set hive.tez.auto.reducer.parallelism = true;
This is the first property that determines the initial number of reducers once Tez starts the query. Then there are two boundary parameters: hive.tez.min.partition.factor and hive.tez.max.partition.factor.
# When auto reducer parallelism is enabled, this factor will be used to put a lower limit on the number of reducers that Tez specifies.
> hive.tez.min.partition.factor=0.25;
# When auto reducer parallelism is enabled, this factor will be used to over-partition data in shuffle edges.
> hive.tez.max.partition.factor=2.0;
More on this parameter later.
The third property is hive.exec.reducers.max, which determines the maximum number of reducers. By default it is 1099. The final parameter that determines the initial number of reducers is hive.exec.reducers.bytes.per.reducer. By default hive.exec.reducers.bytes.per.reducer is set to 256MB, specifically 258998272 bytes.
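Before changing anything, you can echo the current session value of each of these (running set <property>; without a value prints the current setting, the same way hive.tez.auto.reducer.parallelism is checked above; the values you see depend on your environment):
> set hive.tez.auto.reducer.parallelism;
> set hive.tez.min.partition.factor;
> set hive.tez.max.partition.factor;
> set hive.exec.reducers.max;
> set hive.exec.reducers.bytes.per.reducer;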
The FORMULA
So to put it all together, Hive/Tez estimates the number of reducers using the following formula and then schedules the Tez DAG:
Max(1, Min(hive.exec.reducers.max [1099], ReducerStage estimate / hive.exec.reducers.bytes.per.reducer)) x hive.tez.max.partition.factor [2]
------------------
So in our example, since the RS output is 190944 bytes, the number of reducers will be:
> Max(1, Min(1099, 190944/258998272)) x 2
> Max(1, Min(1099, 0.00073)) x 2 = 1 x 2 = 2
Hence the 2 reducers we initially observe.
---------------------
4. Increasing Number of Reducers, the Proper Way
Let's set hive.exec.reducers.bytes.per.reducer to 10 MB, about 10432. The new reducer count is:
> Max(1, Min(1099, 190944/10432)) x 2
> Max(1, Min(1099, 18.3)) x 2 = 19 (rounded up) x 2 = 38
The query takes 32.69 seconds now, an improvement.
---------------------------------------------------
5. More Reducers Does Not Always Mean Better Performance
Let's set hive.exec.reducers.bytes.per.reducer to 15.5 MB, about 15872. The new reducer count is:
> Max(1, Min(1099, 190944/15360)) x 2
> Max(1, Min(1099, 12)) x 2 = 12 x 2 = 24
Performance is BETTER with 24 reducers than with 38 reducers.
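For reference, each of the two runs above boils down to changing this single session setting before re-running the query (the numeric values are the ones quoted above):
-- Section 4 run above (gives 38 reducers):
> set hive.exec.reducers.bytes.per.reducer=10432;
-- Section 5 run above (gives 24 reducers, which was faster here):
> set hive.exec.reducers.bytes.per.reducer=15872;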
----------------------------
7. Reducing Number of Reducer Stages
Since we have BOTH a group by and an order by in our query, looking at the explain plan, perhaps we can combine those into one reducer stage. The parameter for this is hive.optimize.reducededuplication.min.reducer, which by default is 4. Setting this to 1 and executing the query, performance is BETTER with ONE reducer stage, at 15.88 s.
NOTE: Because we also had a LIMIT 20 in the statement, this worked as well. When the LIMIT was removed, we had to resort to estimating the right number of reducers instead to get better performance.
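For reference, that change is the single session setting described above:
-- Allow the group-by and order-by stages to be combined into one reducer stage (default is 4):
> set hive.optimize.reducededuplication.min.reducer=1;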
------------------------------------------------
Summary
While we can manually set the number of reducers with mapred.reduce.tasks, this is NOT RECOMMENDED:
set mapred.reduce.tasks = 38;
Tez does not actually have a reducer count when a job starts; it always has a maximum reducer count, and that's the number you get to see in the initial execution, which is controlled by 4 parameters. The 4 parameters which control this in Hive are:
hive.tez.auto.reducer.parallelism=true;
hive.tez.min.partition.factor=0.25;
hive.tez.max.partition.factor=2.0;
hive.exec.reducers.bytes.per.reducer=1073741824; // 1gb
You can get a wider or narrower distribution by messing with those last 3 parameters (preferably only the min/max factors, which are merely guard rails to prevent bad guesses). Hive/Tez estimates the number of reducers using the following formula and then schedules the Tez DAG:
Max(1, Min(hive.exec.reducers.max [1099], ReducerStage estimate / hive.exec.reducers.bytes.per.reducer)) x hive.tez.max.partition.factor [2]
Then, as map tasks finish, it inspects the output size counters for tasks to estimate the final output size, and reduces that number to a lower number by combining adjacent reducers. The total # of mappers which have to finish before it starts to decide and run reducers in the next stage is determined by the following parameters:
tez.shuffle-vertex-manager.min-src-fraction=0.25;
tez.shuffle-vertex-manager.max-src-fraction=0.75;
This indicates that the decision will be made between 25% of mappers finishing and 75% of mappers finishing, provided there's at least 1Gb of data being output (i.e. if 25% of mappers don't send 1Gb of data, we will wait till at least 1Gb is sent out). Once a decision has been made, it cannot be changed, as some reducers will already be running and might lose state if we do that. You can get more and more accurate predictions by increasing the fractions.
------------------------------------
APPENDIX
Hive-2.0 (only) improvements
Now that we have a total # of reducers, you might not have the capacity to run all of them at the same time, so you need to pick a few to run first. The ideal situation would be to start the reducers which already have the most data to fetch first, so that they can start doing useful work, instead of starting reducer #0 first (like MRv2), which may have very little data pending.
tez.runtime.report.partition.stats=true;
tez.runtime.pipelined-shuffle.enabled=true;
The first flag there is pretty safe, but the second one is a bit more dangerous, as it allows the reducers to fetch off tasks which haven't even finished (i.e. mapper failures cause reducer failures, which is optimistically fast, but slower when there are failures; bad for consistent SLAs).
Finally, we have the sort buffers, which are usually tweaked and tuned to fit, but you can make them much faster by making those allocations lazy (i.e. allocating 1800mb contiguously on a 4Gb container will cause a 500-700ms GC pause, even if there are only 100 rows to be processed).
tez.runtime.pipelined.sorter.lazy-allocate.memory=true;
Reference:
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
http://hortonworks.com/blog/apache-tez-dynamic-graph-reconfiguration/
http://www.slideshare.net/t3rmin4t0r/hivetez-a-performance-deep-dive
http://www.slideshare.net/ye.mikez/hive-tuning (Mandatory)
See also:
http://www.slideshare.net/AltorosBY/altoros-practical-steps-to-improve-apache-hive-performance
http://www.slideshare.net/t3rmin4t0r/data-organization-hive-meetup
http://www.slideshare.net/InderajRajBains/using-apache-hive-with-high-performance
Special thanks also to Gopal for assisting me with understanding this.
... View more
Labels:
03-10-2016
09:00 PM
1 Kudo
@Gerd Koenig Can you check what the port is in /var/lib/ambari-server/resources/stacks/HDP/2.3/services/RANGER/configuration/ranger-admin-site.xml?
Was it changed by Ambari? Change it and see if it works. The alerts in the following file should pick it up:
vi /var/lib/ambari-server/resources/stacks/HDP/2.3/services/RANGER/configuration/alerts.json
... View more
03-10-2016
06:14 PM
16 Kudos
Microsoft Azure General Sizing Guidelines
- You need to size and price machines and storage separately.
- Use Linux VMs on Azure (not to be confused with the Ubuntu Beta offering on HDInsight).
- If performance is a must, especially with Kafka and Storm, use Premium Storage, not Standard. Make sure to request Premium Storage (see link below).
- Do not use A8 machines. Use either A10s or A11s. A8 is backed by InfiniBand, which is more expensive and unnecessary for Hadoop.
- Recommend the D series and the newer D_v2 series for solid state drives if needed. For Premium Storage use the DS_v2 series.
- It is recommended that Page Blob Storage is used for HBase as opposed to Block Storage. See link below.
- Both options will need attached Blob Storage. The 382 GB local disk that comes with the VM is just for temp storage. Blob Storage comes in 1023GB sizes, and each VM has a maximum amount of Blob Storage that can be attached, e.g. A10 VMs can have a maximum of 16 * 1TB storage. See the links below for more details.
- Use Availability Sets for master and worker nodes.
- Use one storage account for every node in the cluster in order to bypass IOPS limits for multiple VMs on the same storage account.
- You can also try Azure Data Lake Store (with adl://) in order to check the performance of the new Azure service.
- You also need to remember the maintenance windows of every Azure region according to your customers: some regions could be a good choice for new service availability (e.g. US East 2) but not from a maintenance point of view (especially for European customers).
---------------------------------------
Recommendation 1 - Best Compute Performance for Batch and Real Time Use Cases
For head/master nodes use:
Standard_D13_v2 (8 CPU, 56 GB) or Standard_D5_v2 (16 CPU, 56 GB) or Standard_D14_v2 (16 CPU, 112 GB)
For data nodes use:
Standard_D14_v2 (16 CPU, 112 GB) or Standard_DS14_v2 (16 CPU, 112 GB with Premium Storage) or Standard_DS15_v2 (20 CPU, 140 GB with Premium Storage)
If testing Kafka and Storm, use Standard_DS13_v2, Standard_DS14_v2 or Standard_DS15_v2 with Premium Storage, especially if performance is needed to meet SLAs.
Pros: CPU is 35% faster than the D series; local SSD disks; VMs are cheaper per hour than the A or D series.
Recommendation 2 - Good Compute Performance
Use Standard_D13 (8 CPU, 56 GB) or Standard_D14 (16 CPU, 112 GB) for head/master nodes and Standard_D14 (16 CPU, 112 GB) for data nodes.
If testing Kafka and Storm, use Standard_DS13 (8 CPU, 56 GB) or Standard_DS14 (16 CPU, 112 GB) with Premium Storage, especially if performance is needed to meet SLAs.
Pros: 60% faster than the A series; local SSD disks. However, why pick this if it is slightly more expensive per hour than the D_v2 series?
Recommendation 3 - Mostly for Batch Performance
Use A10 or A11 for head/master nodes and A11 for data nodes. Microsoft is pricing effectively so that you use the D_v2 series.
------------
Microsoft Links
Storage Pricing: https://azure.microsoft.com/en-us/pricing/details/storage/
Premium Storage: https://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage/
VM Pricing: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/#Linux
VM Size Specs: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/
Page vs. Blob Storage: https://hadoop.apache.org/docs/current/hadoop-azure/index.html#Page_Blob_Support_and_Configuration
Azure Data Lake: https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/
... View more
03-10-2016
05:34 PM
@Sean Roberts Good idea. Let me convert this into an article.
... View more
03-09-2016
03:45 PM
1 Kudo
CREATE EXTERNAL TABLE avro_hive_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'/user/avro/applog_avro'
TBLPROPERTIES
('avro.schema.url'='hdfs://localdomain/user/avro/schemas/activity.avsc');
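A quick way to sanity-check the table after creation (not part of the original answer) is to read a few rows; the AvroSerDe resolves the schema from the avro.schema.url above:
-- Verify the Avro-backed table reads the data under the LOCATION given above.
SELECT * FROM avro_hive_table LIMIT 5;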
... View more
03-09-2016
03:54 AM
2 Kudos
Thanks for directing me to the right path.
Here was the solution: I changed zookeeper_client_jaas.conf AND the zookeeper_client_jaas.conf.j2 template file, which had useKeyTab=false.
Change it to the following:
# vi /etc/zookeeper/conf/zookeeper_client_jaas.conf
# vi /etc/zookeeper/conf/zookeeper_client_jaas.conf.j2
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/etc/security/keytabs/zk.service.keytab"
principal="zookeeper/<FQDN>@REALM";
};
However, when you restart Ambari, it changes zookeeper_client_jaas.conf back to the problematic values, so you HAVE to change the /etc/zookeeper/conf/zookeeper_client_jaas.conf.j2 template file ALSO for Ambari to pick up the correct values.
... View more
03-09-2016
03:43 AM
2 Kudos
hive.server2.thrift.sasl.qop in hive-site.xml has to be set to one of the valid QOP values ('auth', 'auth-int' or 'auth-conf'). See https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Integrity/ConfidentialityProtection
Then use, for example:
jdbc:hive2://hostname/dbname;sasl.qop=auth-int|auth|auth-conf
jdbc:hive2://sandbox.hortonworks.com:10001/default;principal=hive/sandbox.hortonworks.com@HORTONWORKS.COM?transportMode=http;httpPath=cliservice;auth=kerberos;sasl.qop=auth-int
See https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_Security_Guide/content/ch_wire-connect.html
... View more