impala-shell returns impalad: TSocket read 0 bytes
Labels: Apache Hadoop, Apache Impala
Created on 01-14-2016 10:23 AM - edited 09-16-2022 02:57 AM
Hello,

System: CentOS 6.6
Hadoop 2.5.0-cdh5.3.0
impalad version 2.1.0
Java version "1.8.0_11"

Running a simple query in impala-shell, I get the following error:
Connected to n1:21000
Query: select key from auth limit 4
Query finished, fetching results ...
Error communicating with impalad: TSocket read 0 bytes
Could not execute command: select key from auth limit 4
The query is quite simple, but the dataset is quite big.
The impalad log file reports the following error:
==> impalad.INFO <==
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f7ccbc94d20, pid=19181, tid=140170971166464
#
# JRE version: Java(TM) SE Runtime Environment (8.0_11-b12) (build 1.8.0_11-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.11-b03 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x89d20] memcpy+0x3c0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/run/cloudera-scm-agent/process/8600-impala-IMPALAD/hs_err_pid19181.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
Does anyone have any idea how to solve this error?
Thanks
Created 01-14-2016 10:29 AM
It would be helpful if you could share the hs_err_pid*.log file mentioned in the error message.
What format is the auth table in? Is there anything notable about the data, e.g. large strings?
Created 01-14-2016 11:09 AM
Hello Tim,
Here is the link to the file:
https://drive.google.com/file/d/0B1h4gv1ES8DeOVJyR1BMZHYwNHM/view?usp=sharing
The table is stored in text format, and the result should be a series of integers.
Hope this helps (I'm kind of new to Hadoop).
Thanks
Created 01-14-2016 02:42 PM
If the table is a large compressed text file, you're probably running into this issue: https://issues.cloudera.org/browse/IMPALA-2249 . Newer versions of Impala include a fix that prevents the crash, but for some compression codecs, compressed text files larger than 1GB remain unsupported.
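A quick way to confirm the table's file format and per-file sizes from impala-shell is the following sketch (assuming the table is named auth, as in the query above):

```sql
-- Show per-partition size, number of files, and file format:
SHOW TABLE STATS auth;
-- Show the full metastore metadata, including InputFormat and compression:
DESCRIBE FORMATTED auth;
```

If SHOW TABLE STATS reports text format with individual files over 1GB, that would match the IMPALA-2249 crash scenario.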
Created 01-18-2016 08:19 AM
Hello Tim,
Thanks.
We've actually confirmed that the datasets are compressed text files.
What would you recommend? Converting the text datasets to Parquet? Is this possible?
Again, thanks for all your help.
Best regards,
Pedro Silva
Created 01-18-2016 09:27 AM
If you can switch to Parquet, that's probably the best solution: it's generally the most performant file format for reading and produces the smallest file sizes. If for some reason you need to stick with text, the uncompressed data size needs to be < 1GB per file.
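One way to do the conversion from impala-shell is a CREATE TABLE ... AS SELECT into a Parquet-backed table. A minimal sketch, assuming the source table is named auth (the new table name auth_parquet is arbitrary):

```sql
-- Create a Parquet copy of the text table:
CREATE TABLE auth_parquet STORED AS PARQUET AS SELECT * FROM auth;

-- The original query then runs against the Parquet copy:
SELECT key FROM auth_parquet LIMIT 4;
```

Once the copy is verified, the original text table can be dropped to reclaim space.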
