
Urgent need: when trying to run a simple Hive query, select count(*) from table, it shows the error: Invalid distance too far back

Expert Contributor

When I trigger a simple select count(*) from table on "database.table", it throws the error "Invalid distance too far back".

I have attached the error log (6854-errorlog.png). Please help; this is a very high-priority issue for me, and any help is really appreciated.

Here is some configuration info, in case it is useful:

reduce.am.max-attempts = 2

yarn.resourcemanager.am.max-attempts = 2

1 ACCEPTED SOLUTION

Super Guru

@sankar rao

An archive has been corrupted. You are probably storing compressed files (e.g. gzip or lzo) in the Hive table directory, and at least one of those files is corrupted. I would start moving files out of that HDFS folder one at a time, in reverse chronological order (newest first), and rerun the query after each move until it succeeds. The last file you moved out is then the corrupted archive. There are other ways to test your archives, and you could use those too.
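As one example of another way to test an archive: "invalid distance too far back" is an error message produced by zlib when it inflates damaged data, so a gzip file can be checked by decompressing it to the end and watching for a failure. The sketch below is only an illustration under assumptions: it assumes the files are gzip-compressed and have already been copied to a local disk (for instance with `hdfs dfs -get`), and the helper name `is_gzip_intact` is made up for this example.

```python
import gzip
import zlib


def is_gzip_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses cleanly to the end.

    Hypothetical helper: `path` is a local copy of a file taken from the
    Hive table directory.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so large files are not held in memory.
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad gzip header or CRC mismatch, EOFError a
        # truncated stream, zlib.error a damaged compressed body.
        return False
```

Running this over every file copied from the table directory and listing the paths for which it returns False should narrow the search without rerunning the Hive query each time.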

Try and let me know.

If this response or any other response in this thread helped, please don't forget to vote and accept the best answer.


4 REPLIES


Your zip file may be corrupted. How were the files imported into HDFS and Hive?

Check out this article:

https://community.hortonworks.com/questions/52722/when-try-to-run-simple-hive-query-select-count-for...

Here is a way to load compressed files into Hive:

https://cwiki.apache.org/confluence/display/Hive/CompressedStorage

Expert Contributor

Thank you @Constantin Stanca

Actually, I am new to the HDP distribution. I have the following concerns:

# We loaded the data into Hive as textfile (uncompressed format).

# Can you expand on this answer: "I would start moving files out of that folder (HDFS) in reverse chronological order and repeat the query until successful"?

# Here is my query and the existing Hive file system (see attached issue4.png):

select count(*) from db_c720_krux.events as a where site_name like ('Ka2XfElb') and day = '2016-06-17';

Super Guru

@rama

Yes. That's the idea. One of the files is somehow corrupted.
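To make the "reverse chronological order" step concrete, here is a rough sketch that takes the text printed by `hdfs dfs -ls` for the table or partition directory and returns the file paths newest first. It assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path); the function name `newest_first` and the paths in the sample listing are invented for illustration.

```python
from datetime import datetime


def newest_first(ls_output):
    """Return file paths from `hdfs dfs -ls` text, most recently modified first."""
    entries = []
    for line in ls_output.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        parts = line.split(None, 7)
        # Skip the "Found N items" header, blank lines, and directories.
        if len(parts) < 8 or parts[0].startswith("d"):
            continue
        modified = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
        entries.append((modified, parts[7]))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [path for _, path in entries]


# Hypothetical listing; real paths and owners will differ.
sample_listing = """Found 3 items
-rw-r--r--   3 hive hdfs   1048576 2016-06-15 09:10 /example/events/day=2016-06-17/part-00000
-rw-r--r--   3 hive hdfs   2097152 2016-06-17 23:45 /example/events/day=2016-06-17/part-00002
-rw-r--r--   3 hive hdfs   1572864 2016-06-16 12:30 /example/events/day=2016-06-17/part-00001
"""

ordered_paths = newest_first(sample_listing)
```

Each path in `ordered_paths` could then be moved to a holding directory outside the table location with `hdfs dfs -mv`, rerunning the count query after each move; the file whose removal makes the query succeed is the likely culprit.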