03-19-2018
02:09 PM
This article provides an overview of monitoring key Hive LLAP metrics: Hive LLAP configuration, YARN queue setup, YARN containers, LLAP cache hit ratio, executors, IO elevator metrics, and JVM heap and non-heap usage.

Execution Engine

LLAP is not an execution engine (like MapReduce or Tez). Overall execution is scheduled and monitored by an existing Hive execution engine (such as Tez), transparently over both LLAP nodes and regular containers. The level of LLAP support depends on each individual execution engine (starting with Tez). MapReduce support is not planned, but other engines may be added later. Other frameworks such as Pig and Spark can also choose to use LLAP daemons.

Enable LLAP and set the memory per daemon, the in-memory cache per daemon, the number of nodes running the Hive LLAP daemon (num_llap_nodes_for_llap_daemons), and the number of executors per LLAP daemon in Advanced hive-interactive-site.

Cache Basics

The daemon caches metadata for input files as well as the data itself. Metadata and index information can be cached even for data that is not currently cached. Metadata is stored in-process as Java objects; cached data is stored and kept off-heap. The eviction policy is tuned for analytical workloads with frequent (partial) table scans. Initially a simple policy such as LRFU is used; the policy is pluggable.

Caching granularity: column chunks are the unit of data in the cache. This strikes a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (vectorized row batch size, ORC stripe, etc.). A Bloom filter is automatically created to provide dynamic runtime filtering.

Resource Management

YARN remains responsible for the management and allocation of resources. The YARN container delegation model is used to allow the transfer of allocated resources to LLAP.
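Since the cache hit ratio is one of the key metrics mentioned above, here is a minimal sketch of how it could be derived from daemon counters. The metric names `CacheHitBytes` and `CacheRequestedBytes`, and reading them from the LLAP daemon's JMX endpoint, are assumptions for illustration, not verified field names.

```python
# Sketch: derive the LLAP cache hit ratio from daemon counters.
# ASSUMPTION: the metric names CacheHitBytes / CacheRequestedBytes are
# illustrative; check your LLAP daemon's JMX output for the real names.

def cache_hit_ratio(metrics: dict) -> float:
    """Hit ratio = bytes served from cache / total bytes requested."""
    requested = metrics.get("CacheRequestedBytes", 0)
    if requested == 0:
        return 0.0
    return metrics.get("CacheHitBytes", 0) / requested

# Example with made-up numbers: 6 GB of the 8 GB requested came from cache.
sample = {"CacheHitBytes": 6 * 2**30, "CacheRequestedBytes": 8 * 2**30}
print(f"cache hit ratio: {cache_hit_ratio(sample):.2%}")
```

A low ratio under a steady workload usually suggests the in-memory cache per daemon is too small for the working set.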
To avoid the limitations of JVM memory settings, cached data is kept off-heap, as are large buffers used for processing (e.g., group by, joins). This way the daemon itself can run with a small heap, and additional resources (i.e., CPU and memory) are assigned based on workload.

LLAP YARN Queue

It is important to understand how the different parameters in the YARN queue configuration affect LLAP performance:

yarn.scheduler.capacity.root.llap.capacity=60
yarn.scheduler.capacity.root.llap.maximum-capacity=60
yarn.scheduler.capacity.root.llap.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.llap.ordering-policy=fifo
yarn.scheduler.capacity.root.llap.priority=1
yarn.scheduler.capacity.root.llap.state=RUNNING
yarn.scheduler.capacity.root.llap.user-limit-factor=1

Resource Manager UI

Please refer to the original article for the different Grafana dashboards: http://www.kartikramalingam.com/hive-llap/
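The queue settings above can be sanity-checked programmatically. The checks below simply encode the example configuration from this article (capacity pinned to maximum-capacity, user-limit-factor of 1, FIFO ordering); treat them as a starting point rather than hard requirements for every cluster.

```python
# Sketch: sanity-check LLAP queue properties against the example
# configuration shown above. The rules encode that example, not
# universal requirements.

PREFIX = "yarn.scheduler.capacity.root.llap."

def check_llap_queue(props: dict) -> list:
    """Return warnings for settings that deviate from the example config."""
    warnings = []
    if props.get(PREFIX + "capacity") != props.get(PREFIX + "maximum-capacity"):
        warnings.append("capacity != maximum-capacity: queue size is elastic")
    if props.get(PREFIX + "user-limit-factor") != "1":
        warnings.append("user-limit-factor != 1")
    if props.get(PREFIX + "ordering-policy") != "fifo":
        warnings.append("ordering-policy is not fifo")
    if props.get(PREFIX + "state") != "RUNNING":
        warnings.append("queue is not RUNNING")
    return warnings

example = {
    PREFIX + "capacity": "60",
    PREFIX + "maximum-capacity": "60",
    PREFIX + "minimum-user-limit-percent": "100",
    PREFIX + "ordering-policy": "fifo",
    PREFIX + "priority": "1",
    PREFIX + "state": "RUNNING",
    PREFIX + "user-limit-factor": "1",
}
print(check_llap_queue(example))  # [] -> the example config passes
```

Pinning capacity to maximum-capacity keeps the LLAP queue a fixed size, which matches LLAP's model of long-running daemons that hold their resources.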
02-06-2019
02:24 PM
@Kartik Ramalingam Thank you for your wonderful and helpful post! The Ranger authorization part of the post is still incorrect, though: Ranger authorization is already enabled in the initial topology, so Step 7 should instead describe disabling it by changing the parameter value from XASecurePDPKnox to AclsAuthz. The example in Step 7 also needs to be corrected accordingly. Regards, Sakhuja
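For readers following along, the change described above would look roughly like the fragment below in the Knox topology's authorization provider; this is an illustrative sketch of the provider block, not the exact XML from the post being discussed.

```xml
<!-- Illustrative Knox topology fragment: switch the authorization
     provider from Ranger (XASecurePDPKnox) to the default ACLs
     provider (AclsAuthz). -->
<provider>
    <role>authorization</role>
    <name>AclsAuthz</name>
    <enabled>true</enabled>
</provider>
```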