
Impala-shell invalid version number


New Contributor

While running a query, I randomly get this warning: /00000_0_1_copy_1 has an invalid version number: this could be due to stale metadata. try running "refresh _xml". Could not execute command: compute stats.

 

I am very curious about the root cause, because this is blocking my work all the time.

3 REPLIES

Re: Impala-shell invalid version number

Guru
Do you have a workflow that updates those files outside of Impala? If so, you need to run REFRESH or INVALIDATE METADATA on those tables so that Impala can see the latest version.
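As a rough illustration of the refresh step, here is a minimal sketch that builds the impala-shell command line for either statement. The helper name `refresh_command` and the table name `sales_db.events` are hypothetical, and the sketch only builds the command; it does not connect to a cluster.

```python
import shlex


def refresh_command(table, full_invalidate=False):
    """Build an impala-shell argv list that reloads metadata for `table`.

    `table` is a hypothetical placeholder such as "sales_db.events".
    """
    # REFRESH reloads file metadata for a table Impala already knows about;
    # INVALIDATE METADATA discards the cached metadata so it is reloaded later.
    if full_invalidate:
        statement = f"INVALIDATE METADATA {table}"
    else:
        statement = f"REFRESH {table}"
    # -q passes a single query to impala-shell on the command line.
    return ["impala-shell", "-q", statement]


# Print the command instead of running it, since no cluster is assumed here.
cmd = refresh_command("sales_db.events")
print(shlex.join(cmd))
```

In a real workflow this command would presumably run right after the external job finishes writing, and before any query or COMPUTE STATS on the table.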

Re: Impala-shell invalid version number

New Contributor

Hi Arun

 

Which CDH version are you currently using?

 

We are experiencing the same issue on CDH 6.2. I believe this is a bug, but let me know which version you are using so that we can report it if needed.

 

Amjad




 

Re: Impala-shell invalid version number

Master Collaborator

As @EricL said, this would be caused by some process updating files in the table in the background without a refresh in Impala. For example, a job that writes files directly into the table directory can leave incomplete files there, or Impala can see the files before they are completely written. It is preferable to write the files to a temporary directory and then move them into the table directory. Some Hive usage patterns can also cause issues, e.g. INSERT OVERWRITE.
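The write-then-move pattern can be sketched as follows. This is only an illustration on the local filesystem, assuming the staging and table directories are on the same filesystem; on HDFS the analogous step would be a rename such as `hdfs dfs -mv`. The function name `publish_file` and the directory and file names are hypothetical.

```python
import os
import tempfile


def publish_file(data, table_dir, staging_dir, name):
    """Write `data` fully in staging_dir, then move it into table_dir."""
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(table_dir, exist_ok=True)

    # Write the whole file outside the table directory first, so readers
    # of the table directory never see a half-written file.
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Move the finished file into place in a single rename
    # (atomic when both paths are on the same filesystem).
    final = os.path.join(table_dir, name)
    os.replace(staged, final)
    return final


# Demo with throwaway directories standing in for the real locations.
root = tempfile.mkdtemp()
final_path = publish_file(
    b"row1\nrow2\n",
    table_dir=os.path.join(root, "table"),
    staging_dir=os.path.join(root, "_staging"),
    name="part-00000",
)
print(final_path)
```

After the move, the table would still need a REFRESH in Impala so that the new file is picked up.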

 

There was a related issue in Impala that could occur if you did an "INSERT OVERWRITE" from Hive without a refresh from Impala: https://issues.apache.org/jira/browse/IMPALA-8561. Generally that workflow (insert overwrite without refresh) is problematic, but the symptoms were made more confusing by IMPALA-8561.