Created 01-03-2017 08:56 PM
Created 01-04-2017 01:21 AM
@clukasik- Thanks for your reply.
Table is loaded 3 times today.I would like to know count for each partition. I tried to run a query "select count(*) from db.tablename where partition_dt="2017-01-03 10%3A20%3A13" , tried like keyword both did not give results.
Note: select count(*)f rom db.table where partition_dt ='2017-01-03' gives count for the day.
I Agree with your comment, but it is an existing application. I will make sure new design won't have too many partitions.
Created 01-03-2017 09:23 PM
To query across multiple partitions, you should not need to anything special, other than make sure your where clause is not forcing you into a specific folder. E.g. do not include partition_dt in your query.
One comment: Avoid too many partitions. Avoid partitioning that is too granular (unless you are pruning old data) as you will suffer performance problems. I recommend taking a look at this article for some best practices on Hive partitioning.
Created 01-04-2017 01:21 AM
@clukasik- Thanks for your reply.
Table is loaded 3 times today.I would like to know count for each partition. I tried to run a query "select count(*) from db.tablename where partition_dt="2017-01-03 10%3A20%3A13" , tried like keyword both did not give results.
Note: select count(*)f rom db.table where partition_dt ='2017-01-03' gives count for the day.
I Agree with your comment, but it is an existing application. I will make sure new design won't have too many partitions.
Created 01-04-2017 05:13 PM
Would this work?
select partition_dt, count(*) from db.tablename group by partition_dt
Created 01-04-2017 07:56 PM
@clukasik - Thanks for the reply. I tried your query and it worked , here is my observations.
show partitions tablename;
partition_dt=2017-01-03 10%3A20%3A13 partition_dt=2017-01-03 10%3A20%3A14 partition_dt=2017-01-03 10%3A20%3A15
select patition_dt, count(1) from tablename group by partition_dt;
2017-01-03 10:20:13.0 | 2191 2017-01-03 10:20:14.0 | 7888 2017-01-03 10:20:15.0 | 1647
Note: %3A should be replaced with : while quering.
select count(1) from tablename where partition_dt="2017-01-03 10:20:13"; +-------+--+ | _c0 | +-------+--+ | 2191 | +-------+--+ 1 row selected (3.643 seconds)
select count(1) from tablename where partition_dt="2017-01-03 10:20:14"; +-------+--+ | _c0 | +-------+--+ | 7888 | +-------+--+ 1 row selected (0.101 seconds)
select count(1) from tablename where partition_dt="2017-01-03 10:20:15"; +-------+--+ | _c0 | +-------+--+ | 1647 | +-------+--+ 1 row selected (0.101 seconds)
Sampling i used below query.
select * from tablename where partition_dt="2017-01-03 10:20:15" limit 5;