How much time passes after Hive inserts data until Impala can present it?


This is about Cloudera Hadoop 5.4.4.
We've noticed that some time passes between the moment Hive inserts data into Hadoop and the moment Impala can present that data.
Is that time period configurable? Is it possible to influence it?

Many thanks, and looking forward to your assistance,
Avi Vainshtein


If Impala has already loaded the table, its cached copy of the table metadata won't be updated automatically. If you added new data to the table from outside of Impala, you need to use REFRESH <table name>. If you changed other metadata, you may need to use INVALIDATE METADATA <table name>.
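As a minimal sketch of the two statements mentioned here, assuming a hypothetical table named sales_db.events (the name is for illustration only and does not come from this thread), they could be run from impala-shell after the Hive insert finishes:

```sql
-- Hypothetical table name, used only for illustration.

-- Hive added new data files to a table Impala already knows about:
-- reload the file metadata for that one table.
REFRESH sales_db.events;

-- Other metadata changed outside of Impala (for example, the table
-- was created or its schema was altered in Hive): discard the cached
-- metadata so it is reloaded the next time the table is queried.
INVALIDATE METADATA sales_db.events;
```

REFRESH is the lighter of the two, so it is usually the one to try first when only new data was added.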


Many thanks, Tim.


Is there any way to configure an automatic refresh in Impala?


Also, we've noticed that after some time the data in Impala does catch up with the earlier Hive inserts, apparently without an explicit REFRESH.

How can that be explained?

