Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

slow query in hive

avatar
Master Collaborator

hi:

actually i have this query very slow, any suggestion to improve that?

select codinterno,sum(freq) as freq,codtxf 
from journey_oficina_v4 
where codnrbeenf = '3008' 
and (fechaoprcnf >= '2016-04-01 00:00:00.0' 
and fechaoprcnf < '2016-04-02 00:00:00.0' )
and codtxf = 'STR01OON' 
group by codinterno, codtxf 
order by freq desc

i just have a bucket for codnrbeenf and is not in the query so....

Thanks

1 ACCEPTED SOLUTION

avatar
Super Guru

as you are using Transactional Table you can not take advantage of sort by on fechaoprcnf column.Apart from partitioning try to create storage index on the table using tblproperties ("orc.create.index"="true","orc.compress"="ZLIB", "orc.stripe.size"="268435456", "orc.row.index.stride"="10000")

-- orc stripe and index.stride value in this case are default try to tune these value and compare performace results.

View solution in original post

17 REPLIES 17

avatar
Master Collaborator

hi:

you are right i have received this error:

Error: Error while compiling statement: FAILED: SemanticException [Error 10298]: ACID insert, update, delete not supported on tables that are sorted, table journey_oficina (state=42000,code=10298)

so, I have removed the "transactional"="true", and the statement was ok, but now i cant make ACID operaitons, also it have improve the query, but now HOW CAN I UPDATE DE TABLE??

avatar
Master Collaborator

Hi:

whaMany thanks i create that and it faster now, but i cant do ACID operation, so, what can i do??

CREATE EXTERNAL TABLE IF NOT EXISTS journey_oficina(
  FECHAOPRCNF timestamp,
  codnrbeenf string,
  CODINTERNO string,
  CODTXF string,
  FREQ BIGINT
  )
CLUSTERED BY (codnrbeenf)
SORTED BY (FECHAOPRCNF)
INTO 60 BUCKETS
stored as ORC
LOCATION '/RSI/tables/logs/journey_oficina'
TBLPROPERTIES ("immutable"="false", "orc.create.index"="true","orc.compress"="ZLIB", "orc.stripe.size"="67108864","orc.row.index.stride"="25000");

thanks

avatar
Super Guru

good to see that worked but due to limitation with Transactional table you can not use sorted by, may be in coming future version it will be supported.

avatar
Master Collaborator

so, this month it will be in production, so what happend if the diary batch fail this day???

avatar
Super Guru

@Roberto Sancho

I could not get your question here, could you please elobarate what you are asking?

avatar
Master Collaborator

I mean if i need to change into the table, i need to delete everything and insert again??

avatar
Super Guru

@Roberto Sancho

yes, please change the table DDL and use insert overwrite to insert the data again.

avatar
Super Guru

@Roberto Sancho

it seems that all your queries has been answered, could you please spare some time and accept a best answer in this thread.