Explorer
Posts: 11
Registered: ‎04-18-2016

How much time passes between a Hive insert and the data becoming visible in Impala?

We are running Cloudera Hadoop 5.4.4.
We've noticed a delay between the moment Hive inserts data into Hadoop and the moment Impala can present that data.
Is that delay configurable? Is there any way to influence it?

Many thanks, and looking forward to your assistance,
Avi Vainshtein

Cloudera Employee
Posts: 357
Registered: ‎07-29-2015

Re: How much time passes between a Hive insert and the data becoming visible in Impala?

If Impala has already loaded the table, its cached copy of the table's metadata is not updated automatically. If you added new data to the table from outside Impala, you need to run REFRESH <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_refresh.html#refresh. If you changed other metadata, you may need to run INVALIDATE METADATA <table name>: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_invalidate_metadata.html
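
To make the distinction above concrete, here is a minimal sketch of a helper that picks the statement to run after a change made outside Impala. The function name `impala_sync_statement`, the `change` labels, and the table name `sales.orders` are all hypothetical and chosen only for illustration; the helper just builds the statement text and does not connect to Impala.

```python
def impala_sync_statement(table, change="data"):
    """Return the Impala statement to run after a change made outside Impala.

    change="data"     -> new data was added to an existing table
                         (for example by a Hive INSERT)
    change="metadata" -> other metadata was changed outside Impala
    """
    if change == "data":
        return f"REFRESH {table}"
    if change == "metadata":
        return f"INVALIDATE METADATA {table}"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    # 'sales.orders' is a hypothetical table name.
    print(impala_sync_statement("sales.orders"))
    print(impala_sync_statement("sales.orders", "metadata"))
```

The resulting statement still has to be submitted to Impala, for example with `impala-shell -q "<statement>"` at the end of the Hive job that loads the data.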

Explorer
Posts: 11
Registered: ‎04-18-2016

Re: How much time passes between a Hive insert and the data becoming visible in Impala?

Many thanks, Tim.

Is there any way to configure an automatic REFRESH in Impala?

Also, we've noticed that after some time Impala does show the data from earlier Hive inserts, apparently without an explicit REFRESH.

How can that be explained?
