
How much time passes after Hive inserts data until Impala can present it?


Explorer

This is about Cloudera Hadoop 5.4.4.
We've noticed a delay between the moment Hive inserts data into Hadoop and the moment Impala can present that data.
Is that delay configurable? Is there any way to influence it?

Many thanks, and looking forward to your assistance,
Avi Vainshtein

2 REPLIES

Re: How much time passes after Hive inserts data until Impala can present it?

Master Collaborator

If Impala has already loaded the table, its cached copy won't be updated automatically. If you added new data to the table from outside of Impala (for example, through Hive), you need to run REFRESH <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_refresh.html#refresh. If you changed other metadata, you may need to run INVALIDATE METADATA <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_invalidate_metadata.html
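
As a rough sketch of the two cases above, the helper below picks the statement to send to Impala after a change made outside of it. Everything named here is an assumption for illustration: `metadata_statement`, `sync_table`, and the table name `sales_db.events` are hypothetical, and `cursor` stands for any DB-API style cursor already connected to Impala (the connection itself is not shown).

```python
# Illustrative sketch only: the function and table names are hypothetical,
# not part of Impala or of any Cloudera tool.

def metadata_statement(table, data_only=True):
    """Return the Impala statement to run after a change made outside Impala."""
    if data_only:
        # New data was added to an existing table (e.g. by a Hive INSERT).
        return f"REFRESH {table}"
    # Other metadata was changed outside Impala.
    return f"INVALIDATE METADATA {table}"


def sync_table(cursor, table, data_only=True):
    """Send the statement through an already-connected DB-API style cursor."""
    statement = metadata_statement(table, data_only)
    cursor.execute(statement)
    return statement


if __name__ == "__main__":
    # Only prints the statements; no cluster connection is needed for this part.
    print(metadata_statement("sales_db.events"))
    print(metadata_statement("sales_db.events", data_only=False))
```

One possible use is to call `sync_table(cursor, "sales_db.events")` as the last step of the job that runs the Hive insert, so that the new data is visible in Impala as soon as the load finishes.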

Re: How much time passes after Hive inserts data until Impala can present it?

Explorer

Many thanks, Tim.

Is it possible to configure an automatic REFRESH in Impala?

Also, we've noticed that after some time the data in Impala becomes synchronized with the earlier Hive inserts, apparently without an explicit REFRESH.

How can that be explained?
