Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

I am trying to split files to a size of 64 MB

New Member

set mapred.max.split.size=67108864;
set mapred.min.split.size=1024;
set hive.execution.engine=tez;
insert overwrite table bdd.signal_hte partition(cvdt36_year,cvdt36_mon,cvdt36_day) select * from cv.signal_hte where cvdt36_year= "2015" and cvdt36_mon =05;

1 ACCEPTED SOLUTION

@Akhil Reddy

For Tez, use the parameters below to set the minimum and maximum split sizes:

  1. set tez.grouping.min-size=16777216; -- 16 MB min split
  2. set tez.grouping.max-size=64000000; -- 64 MB max split

Increasing the min and max split sizes reduces the number of mappers.
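
As a minimal sketch, the settings above could be combined with the query from the question like this. It assumes the intended upper bound is exactly 64 MiB (67108864 bytes, the value used in the question) rather than 64000000; the table and column names are taken from the original post. These parameters control how input splits are grouped per mapper, so they influence the number of mappers rather than directly fixing the size of the output files.

```sql
-- Run on Tez so that the tez.grouping.* settings apply
set hive.execution.engine=tez;
-- Lower bound for a grouped split: 16 MiB = 16 * 1024 * 1024 bytes
set tez.grouping.min-size=16777216;
-- Upper bound for a grouped split (assumed): 64 MiB = 64 * 1024 * 1024 bytes
set tez.grouping.max-size=67108864;

-- Same insert as in the question, unchanged apart from line breaks
insert overwrite table bdd.signal_hte
partition (cvdt36_year, cvdt36_mon, cvdt36_day)
select *
from cv.signal_hte
where cvdt36_year = "2015"
  and cvdt36_mon = 05;
```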


5 REPLIES

New Member

1003608529

The size still remains the same.

New Member

If I run it for one day of records, it works. If I run it for an entire year, the size still remains the same.

New Member

insert overwrite table Mynewtable partition(cvdt36_year,cvdt36_mon,cvdt36_day) select * from MainTable where cvdt36_year= "2015" and cvdt36_mon =05 and cvdt36_day=16;

If I run this query, it works fine.

insert overwrite table Mynewtable partition(cvdt36_year,cvdt36_mon,cvdt36_day) select * from MainTable where cvdt36_year= 2015;

This one is not working. Can you suggest the correct query?

@Akhil Reddy

There is no syntax issue with the query. Could you please share the issue you are facing?