How much time passes after Hive inserts data - until Impala can present it ?

This concerns Cloudera Hadoop (CDH) 5.4.4.
We've noticed a delay between the moment Hive inserts data into Hadoop and the moment Impala can present that data.
Is that delay configurable? Is there any way to influence it?

Many thanks, and looking forward to your assistance,
Avi Vainshtein

2 REPLIES

Re: How much time passes after Hive inserts data - until Impala can present it ?

If Impala has already loaded the table, its cached copy of the table metadata is not updated automatically. If new data was added to the table from outside Impala (for example, by a Hive insert), you need to run REFRESH <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_refresh.html#refresh. If other metadata changed, you may need to run INVALIDATE METADATA <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_invalidate_metadata.html
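As a rough illustration of the advice above, here is a minimal Python sketch that issues the statement right after a Hive load finishes. The helper names `refresh_statement` and `sync_impala` are hypothetical, and the cursor is assumed to be any DB-API style cursor connected to an Impala daemon (for instance one obtained through the impyla package); the table name is only an example.

```python
import re

# Accept only plain `table` or `db.table` identifiers, so that a name taken
# from a job parameter cannot smuggle extra SQL into the statement.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def refresh_statement(table, full=False):
    """Build the Impala statement that picks up changes made outside Impala.

    full=False -> REFRESH <table>              (new data files, e.g. a Hive insert)
    full=True  -> INVALIDATE METADATA <table>  (other metadata changes)
    """
    if not _IDENT.match(table):
        raise ValueError("unexpected table name: %r" % (table,))
    if full:
        return "INVALIDATE METADATA " + table
    return "REFRESH " + table


def sync_impala(cursor, table, full=False):
    """Run the statement on a DB-API style cursor (assumed to point at Impala)."""
    stmt = refresh_statement(table, full)
    cursor.execute(stmt)
    return stmt
```

Calling `sync_impala(cursor, "sales.events")` as the last step of the Hive job would make the newly inserted rows visible to Impala without waiting; `sales.events` is a made-up table name.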

Re: How much time passes after Hive inserts data - until Impala can present it ?

Many thanks, Tim.

Is there any way to configure an automatic REFRESH in Impala?

Also, we've noticed that after some time the data in Impala does become synchronized with the earlier Hive inserts, apparently without an explicit REFRESH.

How can that be explained?

 
