Cloudera Community

knatarasan

Cloudera Employee

The following bullets list the vantage points from which applications on Hive should be analyzed for performance.

Application types

Ingestion intensive

Staging/Storage intensive

ETL intensive

Consumption intensive

Data Model used

Level of Normalization

Star schema

Table design

Storage format used

Compression

Use of collection data types (struct, array, map)

Partition

Whether the table may be over-partitioned

Whether dynamic partitioning is enabled

Bucket (review join conditions on the bucketed columns)
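The over-partitioning question above often comes down to simple arithmetic: if the average partition is far smaller than a sensible file size, the table probably has too many partitions. A minimal Python sketch, assuming the table size and partition count are already known (for example from table statistics). The function names and the 128 MB threshold are illustrative assumptions, not Hive defaults.

```python
def average_partition_mb(total_table_mb: float, partition_count: int) -> float:
    """Average amount of data per partition, in MB."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    return total_table_mb / partition_count


def looks_over_partitioned(
    total_table_mb: float,
    partition_count: int,
    min_partition_mb: float = 128.0,  # assumed threshold, tune per cluster
) -> bool:
    """Flag a table whose average partition is smaller than the threshold."""
    return average_partition_mb(total_table_mb, partition_count) < min_partition_mb


# A 50 GB table split into 10,000 partitions averages about 5 MB each.
print(looks_over_partitioned(50 * 1024, 10_000))  # True
# The same table with 100 partitions averages 512 MB each.
print(looks_over_partitioned(50 * 1024, 100))     # False
```

The check is deliberately coarse: skewed partitions can hide behind a healthy average, so it is a first filter rather than a verdict.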

Functions

Use of UDFs and UDAFs

Query pattern

Select with where (map only)

Group by (map, shuffle, reduce)

Order by (map, shuffle, single reduce)

Analytical functions

Sort by (map, shuffle, multiple reduces)

Join

Map join (mapper only, but needs more memory)

Sort merge join

Partition column usage (especially for huge transaction tables)

From source table: usage of multi-pass
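The difference between "Order by" (single reduce) and "Sort by" (multiple reduces) in the list above can be illustrated with a small simulation. This is a Python sketch of the idea, not Hive's implementation: rows are assigned to reducers with a modulo on the value, which is an assumption made only to keep the example deterministic.

```python
from typing import List


def order_by(rows: List[int]) -> List[int]:
    """Single reducer: every row goes to one place, so the result is totally ordered."""
    return sorted(rows)


def sort_by(rows: List[int], num_reducers: int) -> List[List[int]]:
    """Multiple reducers: each reducer sorts only the rows it received."""
    reducers: List[List[int]] = [[] for _ in range(num_reducers)]
    for row in rows:
        # Assumed distribution rule, for illustration only.
        reducers[row % num_reducers].append(row)
    return [sorted(part) for part in reducers]


rows = [5, 2, 9, 4, 7, 1]
print(order_by(rows))    # [1, 2, 4, 5, 7, 9]
print(sort_by(rows, 2))  # [[2, 4], [1, 5, 7, 9]]
```

Each reducer's output is sorted, but concatenating the outputs does not give a global order. That is why "Order by" funnels everything through one reducer and tends to be the slower pattern on large result sets.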

Table size: for huge tables, analyze everything from a scan perspective
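Looking at a huge table "from a scan perspective" can be made concrete by comparing how much data a query reads with and without a filter on the partition column. A minimal Python sketch, assuming per-partition sizes are available. The partition keys, sizes, and the function name are hypothetical values chosen for illustration.

```python
from typing import Callable, Dict


def scanned_mb(
    partition_sizes_mb: Dict[str, float],
    keep: Callable[[str], bool] = lambda _partition: True,
) -> float:
    """Total MB read when only partitions accepted by `keep` are scanned."""
    return sum(size for part, size in partition_sizes_mb.items() if keep(part))


# Hypothetical daily partitions of a transaction table.
sizes = {"2024-01-01": 900.0, "2024-01-02": 1100.0, "2024-01-03": 1000.0}

full_scan = scanned_mb(sizes)                                    # no partition filter
pruned_scan = scanned_mb(sizes, lambda dt: dt == "2024-01-02")  # filter on partition column

print(full_scan)    # 3000.0
print(pruned_scan)  # 1100.0
```

The same comparison scales to real tables: a query that omits the partition column from its filter pays for the full scan, regardless of how few rows it finally returns.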

Elements of Hive Application Tuning

Apache Hive