When I try to run a simple Hive query (select count(*) from table), it shows the error: Invalid distance too far back.
- Labels: Apache Hadoop, Apache Hive
Created ‎08-22-2016 08:10 AM
When I try to run a simple Hive query on Tez (select count(*) from table), it shows the error "Invalid distance too far back". Can anyone help me resolve this error? I have attached the error log: errorlog.png
Created ‎08-22-2016 09:25 AM
It seems the zip file in your table directory is corrupted. Try decompressing the file directly with the unzip utility (you can get the file name from the failed container's logs).
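The check suggested above can also be scripted. The following is a minimal Python sketch, assuming the suspect file has first been copied from HDFS to local disk (for example with `hdfs dfs -get`) and that it is either a zip archive or a gzip stream. The function name `check_compressed_file` is hypothetical and not part of Hive or Hadoop. It reads the file to the end, so a damaged stream is reported as a failure rather than passing silently.

```python
import gzip
import zipfile
import zlib


def check_compressed_file(path, chunk_size=1 << 20):
    """Try to decompress the whole file; return (ok, message).

    Hypothetical helper: assumes a local copy of the file that is
    either a zip archive or a gzip stream.
    """
    try:
        if zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first corrupt member name, or None.
                bad_member = archive.testzip()
            if bad_member is not None:
                return False, "corrupt zip member: " + bad_member
            return True, "zip archive is readable"

        # Otherwise treat the file as gzip and read it to the end.
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass
        return True, "gzip stream is readable"
    except (OSError, EOFError, zlib.error, zipfile.BadZipFile) as exc:
        # Corrupt or truncated data surfaces as one of these errors.
        return False, "decompression failed: " + str(exc)
```

Calling `check_compressed_file` on the local copy of each file from the table directory returns `False` for any file that cannot be fully decompressed, which is the file to replace or reload.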
Created ‎08-22-2016 09:26 AM
Thank you so much, @Ankit Singhal.
But I have the following concerns:
# Why should I decompress the file?
# How can I get the failed container logs? I can see 12 directory paths under the "yarn.nodemanager.log-dirs" property, and I am confused about where to find the application logs. Please advise.
Created ‎08-22-2016 11:44 AM
bq. Why should I decompress the file?
To confirm whether the selected file is actually corrupted.
bq. How can I get the failed container logs? I can see 12 directory paths under the "yarn.nodemanager.log-dirs" property, and I am confused about where to find the application logs.
I don't remember the exact keyword to search for in the logs, but you can check the syslogs of the container with an ID similar to _14435*_237788_1_01_000062_1 and look for a line saying "processing file" or something similar.
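The log search described above could look like the sketch below. It assumes each directory listed in yarn.nodemanager.log-dirs contains per-application and per-container subdirectories whose log files have names starting with `syslog`. The helper name `find_log_lines`, the container ID fragment, and the search phrase are placeholders, since the exact wording of the log line is not confirmed in this thread. If log aggregation is enabled, `yarn logs -applicationId <application ID>` is another way to fetch the same logs.

```python
import os
import re


def find_log_lines(log_dirs, container_fragment, pattern):
    """Yield (path, line_number, line) for matching syslog lines.

    Hypothetical helper: the directory layout and the search phrase
    are assumptions, not confirmed YARN or Tez behaviour.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for log_dir in log_dirs:
        # os.walk silently skips directories that do not exist.
        for root, _dirs, files in os.walk(log_dir):
            # Only look inside directories for the container of interest.
            if container_fragment not in os.path.basename(root):
                continue
            for name in files:
                if not name.startswith("syslog"):
                    continue
                path = os.path.join(root, name)
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
```

Passing all 12 configured paths as `log_dirs` avoids having to guess which one holds the failed container; directories without a matching container simply produce no results.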
Created ‎08-22-2016 12:49 PM
Thanks again for your time, @Ankit Singhal.
# I am sorry to say that I do not understand this reply: "To confirm whether the selected file is actually corrupted."
You already said "It seems the zip file in your table directory is corrupted", so why is this step needed? And if the corruption is confirmed, what is the solution for executing this query?
# My intent is to run the query. How can I do it? What steps should I take?
Created ‎08-26-2016 07:00 AM
Is there any update on my question?
Created ‎08-26-2016 01:06 PM
@sankar rao, I don't have a YARN cluster available right now to confirm which log lines to search for in the container logs to find the file names. It would probably be better to raise a support case so that a dedicated team can look into the issue specifically, since it depends on how the data was loaded into the table/HDFS, how the files were zipped, which input format you are using, and so on.
