Impala has invalid file metadata

avatar
Expert Contributor

CDH6.2.1

Impala 3.2.0-CDH6.2.1

 

The errors below occur every day. I also found similar reports on issues.apache.org, so it seems to be a bug.

 

3AE39A18-F5AC-47FF-BA83-E4B5B358831B.png

 

Here are the related issues:

https://issues.apache.org/jira/browse/IMPALA-8561

https://github.com/cloudera/Impala/commit/9e4bf4c99f48a29e1c1d46d49c5f8452b777e2db

 

It seems this issue also affected Impala 3.3, not just Impala 3.2, but it has been fixed in 3.3.

 

So, Cloudera support, how can I fix this issue on Impala 3.2 (CDH 6.2.1)? It is critical because many users run into it and ask me what is happening, and all I can tell them is that it is a bug and that I can't do anything about it.

 

May I ask another question? If this issue can't be resolved in Impala 3.2, can I upgrade Impala 3.2 to Impala 3.3 on its own? I mean upgrading only Impala, without upgrading Cloudera Manager or the other CDH components.

 

I hope Cloudera support can give me some advice. Thank you very much.

1 ACCEPTED SOLUTION

avatar

I updated the JIRA to include workarounds, just FYI. 


7 REPLIES

avatar

This generally happens when files are overwritten in place while Impala is still trying to read a cached version of the file, e.g. after an INSERT OVERWRITE in Hive. You can often avoid the problem if you can avoid overwriting files in place.

 

Otherwise, running REFRESH on the table should resolve it.
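
As a minimal sketch of that workaround (not part of the original reply): a job that overwrites a table from Hive could issue a REFRESH for each affected table right after the write finishes. REFRESH and INVALIDATE METADATA are Impala statements; the helper names, the `execute` callback, and the table name `sales.orders` are hypothetical placeholders for whatever client and tables you actually use.

```python
from typing import Callable, Iterable, List


def refresh_statements(tables: Iterable[str], invalidate: bool = False) -> List[str]:
    """Build one Impala metadata statement per fully qualified table name."""
    # REFRESH reloads file metadata for an existing table;
    # INVALIDATE METADATA is the heavier option that discards cached metadata.
    verb = "INVALIDATE METADATA" if invalidate else "REFRESH"
    return [f"{verb} {name}" for name in tables]


def refresh_after_overwrite(tables: Iterable[str], execute: Callable[[str], None]) -> int:
    """Send a REFRESH for every table through a caller-supplied execute()."""
    statements = refresh_statements(tables)
    for statement in statements:
        execute(statement)  # hypothetical hook, e.g. a database cursor's execute method
    return len(statements)


if __name__ == "__main__":
    # Dry run: print the statements instead of sending them to Impala.
    refresh_after_overwrite(["sales.orders"], print)
```

Passing `print` as the callback gives a dry run; in a real job the callback would be whatever sends SQL to Impala. As the follow-up in this thread notes, REFRESH did not help in this particular case, so treat this as the general pattern rather than a fix for IMPALA-8561.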

avatar
Expert Contributor
I found your workaround in the JIRA and will try it. You mentioned that REFRESH should resolve this issue, but in practice it doesn't, even when I use "INVALIDATE METADATA". That's why I opened this thread. Anyway, thanks for your solution.

avatar
Expert Contributor

Hi,

 

After setting the parameter "-max_cached_file_handles=0" as your workaround suggests, I ran into another issue: agent heartbeat timeouts.

The thread URL is below:

https://community.cloudera.com/t5/Support-Questions/Cloudera-Manager-agent-bad-healthy/m-p/283865#M210854

 

My CDH environment has been online for more than half a year, and an agent heartbeat timeout had never happened before. Comparing the date I set the Impala parameter with the date the agent heartbeat issue started, there seems to be a connection, but I am not sure.

 

1.png 

 

What I mean is that the agent heartbeat timeout issue started after I set the Impala parameter "-max_cached_file_handles=0".

 

Is that possible?

avatar

I updated the JIRA to include workarounds, just FYI. 

avatar

Also, if you have a support contract with Cloudera, this is something they can help you with in more detail through that channel. We've successfully resolved this for customers before.

avatar
Expert Contributor

I can confirm this issue has been resolved by your workaround. Thanks.

avatar
Community Manager

@iamfromsky If your issue has been resolved, can you please accept the appropriate reply as the solution so it will be easier for others in a similar situation to find in the future?

 

Cy Jervis, Manager, Community Program