We need to partition our Hive table by date (day/month/year).
Is it better to use INT or STRING for the partition column types?
For example:
CREATE EXTERNAL TABLE partition (id STRING, event TIMESTAMP and so on)
PARTITIONED BY (year INT, month INT, day INT)
STORED AS PARQUET
vs
CREATE EXTERNAL TABLE partition (id STRING, event TIMESTAMP and so on)
PARTITIONED BY (year STRING, month STRING, day STRING)
STORED AS PARQUET
We noticed that with the STRING option we couldn't run queries like:
... WHERE day > 10
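To make the question concrete, here is a minimal sketch of the kind of filter we have in mind against the string-partitioned table. The table name events_str is a placeholder (hypothetical), and the explicit CAST is only our assumption about one way to force a numeric comparison. We have not verified whether partition pruning still applies with it, which is probably worth checking with EXPLAIN.

```sql
-- Sketch only: events_str is a hypothetical name standing in for the
-- string-partitioned table above.

-- Cast the STRING partition column so the comparison is numeric
-- rather than lexicographic.
SELECT id, event
FROM events_str
WHERE CAST(day AS INT) > 10;

-- Assumption: if day values are always zero-padded to two characters
-- ('01' .. '31'), a plain string comparison orders them the same way
-- as the numbers would.
SELECT id, event
FROM events_str
WHERE day > '10';
```

The second form only behaves like a numeric range when every value is zero-padded to the same width. An unpadded '9' would sort after '10' as a string.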