Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Hive Data Partitioning issues using date

Hive Data Partitioning issues using date

New Contributor

Hello Everyone,

We are actually creating a history table partition by "system_date".

For this, an external table <Src_Tbl> has been created and all the folders are added.

We have added 4 years of dates to this table (total ~1400 Partitions. Total size for all the partitions is 4 TB). We also ran ANALYZE table on all the partitions for this table.

But, even a simple extract query from this <src_tbl> is creating huge number of mappers and taking 4-5 hours of time.

Query used:

select * from <src_tbl> where system_date='20180101' limit 5;

Explain Plan:

hive> explain select * from <src_tbl> where system_date='20180201' limit 5;

OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: <src_tbl> Statistics: Num rows: 4611834732 Data size: 3707224854947 Basic stats: PARTIAL Column stats: PARTIAL Filter Operator predicate: (system_date = '20180201') (type: boolean) Statistics: Num rows: 4611834732 Data size: 3707224854947 Basic stats: COMPLETE Column stats: PARTIAL

Statistics: Num rows: 4611834732 Data size: 3707224854947 Basic stats: COMPLETE Column stats: PARTIAL Limit Number of rows: 5 Statistics: Num rows: 5 Data size: 460 Basic stats: COMPLETE Column stats: PARTIAL File Output Operator compressed: false Statistics: Num rows: 5 Data size: 460 Basic stats: COMPLETE Column stats: PARTIAL table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: 5 Processor Tree: ListSink

Can anyone please suggest if I am missing anything here.