Get Last Insert in Impala partition

Hi,

I need to retrieve only the last entry in a given partition when that partition contains multiple entries.

Assume I create an external table partitioned by date:
create external table test_lb (field1 string, field2 string, field3 string)
  partitioned by (year string, month string , day string, host string)
  row format delimited fields terminated by ',';

Then I insert multiple records into the same partition, i.e. the same year, month, and day:
insert into test_lb partition (year="2013", month="07", day="28") values ("foo1", "FOO2", "FOO3");
insert into test_lb partition (year="2013", month="07", day="28") values ("foo4", "FOO5", "FOO6");

How do I retrieve just the most recent entry with a query? Is there a built-in way to get only the latest values?

1 ACCEPTED SOLUTION

This isn't possible unless you include a timestamp or sequence number in every record. There's no concept of an order of rows built into Hive or Impala.
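
As a rough sketch of the timestamp approach, the statements below use a hypothetical table test_lb_ts with an added inserted_at column. Both names are assumptions for illustration and are not part of the original table. The sketch also leaves out the host partition column so that the static partition clause names every partition column. now() records the time at which each insert statement runs.

```sql
-- Hypothetical variant of test_lb with an explicit ordering column (inserted_at).
create table test_lb_ts (field1 string, field2 string, field3 string, inserted_at timestamp)
  partitioned by (year string, month string, day string);

-- Each insert stamps its row with the current time.
insert into test_lb_ts partition (year="2013", month="07", day="28")
  values ("foo1", "FOO2", "FOO3", now());
insert into test_lb_ts partition (year="2013", month="07", day="28")
  values ("foo4", "FOO5", "FOO6", now());

-- Most recent row in one partition: sort by the timestamp and keep the first row.
select field1, field2, field3, inserted_at
from test_lb_ts
where year = "2013" and month = "07" and day = "28"
order by inserted_at desc
limit 1;

-- Most recent row in every partition, assuming an Impala version
-- that supports analytic functions.
select field1, field2, field3, year, month, day, inserted_at
from (
  select field1, field2, field3, year, month, day, inserted_at,
         row_number() over (partition by year, month, day
                            order by inserted_at desc) as rn
  from test_lb_ts
) ranked
where rn = 1;
```

If two rows can end up with the same timestamp, a sequence number supplied by the writer would give a stricter ordering than now().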


Thanks