
Error communicating with impalad: TSocket read 0 bytes


Explorer

Hi,

 

When I query a table from impala-shell (select * from internet_usages limit 1), I sometimes get an error:

Error communicating with impalad: TSocket read 0 bytes

 

After some time (half an hour or more) the query runs properly.

Sometimes I restart Impala, but it does not always help. In fact, I am not sure if it helps at all.

The Postgres database and pgpool appear to work the whole time without problems.

 

Below are the Impala logs from a query that finishes with the error and from one that finishes properly. I marked the rows where the two runs start to differ.

 

Best regards

Andrzej

 

When I get the error, the following messages appear in the Impala logs:

I1123 12:41:34.452467 12473 impala-beeswax-server.cc:175] query(): query=select * from internet_usages limit 2
I1123 12:41:34.452606 12473 impala-beeswax-server.cc:519] query: Query {
01: query (string) = "select * from internet_usages limit 2",
03: configuration (list) = list<string>[0] {
},
04: hadoop_user (string) = "etl_user",
}
I1123 12:41:34.453100 12473 RequestPoolService.java:277] resolveRequestPool(pool=, user=etl_user): resolved_pool=root.etl_user, has_access=true
I1123 12:41:34.453505 12473 RequestPoolService.java:375] getPoolConfig(pool=root.etl_user): max_mem_resources=-1, max_requests=-1, max_queued=200, queue_timeout_ms=0, default_query_options=
I1123 12:41:34.453732 12473 impala-beeswax-server.cc:549] TClientRequest.queryOptions: TQueryOptions {
01: abort_on_error (bool) = false,
02: max_errors (i32) = 0,
03: disable_codegen (bool) = false,
04: batch_size (i32) = 0,
05: num_nodes (i32) = 0,
06: max_scan_range_length (i64) = 0,
07: num_scanner_threads (i32) = 0,
08: max_io_buffers (i32) = 0,
09: allow_unsupported_formats (bool) = false,
10: default_order_by_limit (i64) = -1,
11: debug_action (string) = "",
12: mem_limit (i64) = 0,
13: abort_on_default_limit_exceeded (bool) = false,
15: hbase_caching (i32) = 0,
16: hbase_cache_blocks (bool) = false,
17: parquet_file_size (i64) = 0,
18: explain_level (i32) = 1,
19: sync_ddl (bool) = false,
23: disable_cached_reads (bool) = false,
24: disable_outermost_topn (bool) = false,
25: rm_initial_mem (i64) = 0,
26: query_timeout_s (i32) = 0,
28: appx_count_distinct (bool) = false,
29: disable_unsafe_spills (bool) = false,
31: exec_single_node_rows_threshold (i32) = 100,
32: optimize_partition_key_scans (bool) = false,
33: replica_preference (i32) = 0,
34: schedule_random_replica (bool) = false,
35: scan_node_codegen_threshold (i64) = 1800000,
36: disable_streaming_preaggregations (bool) = false,
37: runtime_filter_mode (i32) = 2,
38: runtime_bloom_filter_size (i32) = 1048576,
39: runtime_filter_wait_time_ms (i32) = 0,
40: disable_row_runtime_filtering (bool) = false,
41: max_num_runtime_filters (i32) = 10,
42: parquet_annotate_strings_utf8 (bool) = false,
43: parquet_fallback_schema_resolution (i32) = 0,
44: mt_num_cores (i32) = 1,
45: s3_skip_insert_staging (bool) = true,
46: runtime_filter_min_size (i32) = 1048576,
47: runtime_filter_max_size (i32) = 16777216,
48: prefetch_mode (i32) = 1,
49: strict_mode (bool) = false,
}
I1123 12:41:34.457437 12473 Frontend.java:878] analyze query select * from internet_usages limit 2
I1123 12:41:34.459012 12473 Frontend.java:822] Requesting prioritized load of table(s): default.internet_usages
I1123 12:43:32.765156 30575 webserver.cc:365] Webserver: error reading: Resource temporarily unavailable
I1123 12:43:34.472626 12473 Frontend.java:897] Missing tables were not received in 120000ms. Load request will be retried.
I1123 12:43:34.475009 12473 Frontend.java:822] Requesting prioritized load of table(s): default.internet_usages
I1123 12:45:14.323235 13498 webserver.cc:365] Webserver: error reading: Resource temporarily unavailable
I1123 12:45:14.323235 30574 webserver.cc:365] Webserver: error reading: Resource temporarily unavailable
I1123 12:45:14.323247 13497 webserver.cc:365] Webserver: error reading: Resource temporarily unavailable
I1123 12:45:14.324121 30878 webserver.cc:365] Webserver: error reading: Resource temporarily unavailable
I1123 12:45:34.488405 12473 Frontend.java:897] Missing tables were not received in 120000ms. Load request will be retried.
I1123 12:45:34.490772 12473 Frontend.java:822] Requesting prioritized load of table(s): default.internet_usages
I1123 12:47:34.504266 12473 Frontend.java:897] Missing tables were not received in 120000ms. Load request will be retried.
I1123 12:47:34.506804 12473 Frontend.java:822] Requesting prioritized load of table(s): default.internet_usages
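
The failing run never gets past the metadata load: Frontend.java keeps logging "Missing tables were not received in 120000ms" and re-requesting the prioritized load. A minimal check, just a sketch using the same table name, is to trigger the metadata load explicitly and time it before running the SELECT:

-- force the catalog to load this table's metadata up front;
-- if this statement itself takes many minutes, the load is the bottleneck
REFRESH default.internet_usages;
SELECT * FROM internet_usages LIMIT 1;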

 

When the query runs properly, I get the following Impala log:

I1123 12:04:30.549031 12472 impala-beeswax-server.cc:175] query(): query=select * from internet_usages limit 1
I1123 12:04:30.549149 12472 impala-beeswax-server.cc:519] query: Query {
01: query (string) = "select * from internet_usages limit 1",
03: configuration (list) = list<string>[0] {
},
04: hadoop_user (string) = "etl_user",
}
I1123 12:04:30.551203 12472 RequestPoolService.java:277] resolveRequestPool(pool=, user=etl_user): resolved_pool=root.etl_user, has_access=true
I1123 12:04:30.551853 12472 RequestPoolService.java:375] getPoolConfig(pool=root.etl_user): max_mem_resources=-1, max_requests=-1, max_queued=200, queue_timeout_ms=0, default_query_options=
I1123 12:04:30.552089 12472 impala-beeswax-server.cc:549] TClientRequest.queryOptions: TQueryOptions {
01: abort_on_error (bool) = false,
02: max_errors (i32) = 0,
03: disable_codegen (bool) = false,
04: batch_size (i32) = 0,
05: num_nodes (i32) = 0,
06: max_scan_range_length (i64) = 0,
07: num_scanner_threads (i32) = 0,
08: max_io_buffers (i32) = 0,
09: allow_unsupported_formats (bool) = false,
10: default_order_by_limit (i64) = -1,
11: debug_action (string) = "",
12: mem_limit (i64) = 0,
13: abort_on_default_limit_exceeded (bool) = false,
15: hbase_caching (i32) = 0,
16: hbase_cache_blocks (bool) = false,
17: parquet_file_size (i64) = 0,
18: explain_level (i32) = 1,
19: sync_ddl (bool) = false,
23: disable_cached_reads (bool) = false,
24: disable_outermost_topn (bool) = false,
25: rm_initial_mem (i64) = 0,
26: query_timeout_s (i32) = 0,
28: appx_count_distinct (bool) = false,
29: disable_unsafe_spills (bool) = false,
31: exec_single_node_rows_threshold (i32) = 100,
32: optimize_partition_key_scans (bool) = false,
33: replica_preference (i32) = 0,
34: schedule_random_replica (bool) = false,
35: scan_node_codegen_threshold (i64) = 1800000,
36: disable_streaming_preaggregations (bool) = false,
37: runtime_filter_mode (i32) = 2,
38: runtime_bloom_filter_size (i32) = 1048576,
39: runtime_filter_wait_time_ms (i32) = 0,
40: disable_row_runtime_filtering (bool) = false,
41: max_num_runtime_filters (i32) = 10,
42: parquet_annotate_strings_utf8 (bool) = false,
43: parquet_fallback_schema_resolution (i32) = 0,
44: mt_num_cores (i32) = 1,
45: s3_skip_insert_staging (bool) = true,
46: runtime_filter_min_size (i32) = 1048576,
47: runtime_filter_max_size (i32) = 16777216,
48: prefetch_mode (i32) = 1,
49: strict_mode (bool) = false,
}
I1123 12:04:30.556161 12472 Frontend.java:878] analyze query select * from internet_usages limit 1
I1123 12:04:30.560166 12472 Frontend.java:956] create plan
I1123 12:04:31.458932 12472 HdfsScanNode.java:365] collecting partitions for table internet_usages
I1123 12:04:31.488257 12472 HdfsScanNode.java:422] cardinality_=3488886438 sel=1.0
I1123 12:04:31.488364 12472 HdfsScanNode.java:429] computeStats HdfsScan: cardinality_=3488886438
I1123 12:04:31.488493 12472 HdfsScanNode.java:485] computeNumNodes totalRanges=1160171 localRanges=6 remoteRanges=0 localHostSet.size=6 clusterNodes=6
I1123 12:04:31.488543 12472 HdfsScanNode.java:432] computeStats HdfsScan: #nodes=6
I1123 12:04:31.488672 12472 HdfsScanNode.java:365] collecting partitions for table internet_usages
I1123 12:04:31.516621 12472 HdfsScanNode.java:422] cardinality_=3488886438 sel=1.0
I1123 12:04:31.516707 12472 HdfsScanNode.java:429] computeStats HdfsScan: cardinality_=1
I1123 12:04:31.516811 12472 HdfsScanNode.java:485] computeNumNodes totalRanges=1160171 localRanges=6 remoteRanges=0 localHostSet.size=6 clusterNodes=6
I1123 12:04:31.516849 12472 HdfsScanNode.java:432] computeStats HdfsScan: #nodes=6
I1123 12:04:31.517246 12472 Planner.java:137] desctbl: tuples:
TupleDescriptor{id=0, name=BaseTableRef default.internet_usages, tbl=default.internet_usages, byte_size=72, is_materialized=true, slots=[SlotDescriptor{id=0, path=default.internet_usages.id, type=BIGINT, materialized=true, byteSize=8, byteOffset=8, nullable=true, nullIndicatorByte=0, nullIndicatorBit=2, slotIdx=2, stats=ColumnStats{avgSerializedSize_=8.0, maxSize_=8, numDistinct_=3594411008, numNulls_=-1}}, SlotDescriptor{id=1, path=default.internet_usages.purchase_id, type=BIGINT, materialized=true, byteSize=8, byteOffset=16, nullable=true, nullIndicatorByte=0, nullIndicatorBit=3, slotIdx=3, stats=ColumnStats{avgSerializedSize_=8.0, maxSize_=8, numDistinct_=14064324, numNulls_=-1}}, SlotDescriptor{id=2, path=default.internet_usages.current_usage, type=BIGINT, materialized=true, byteSize=8, byteOffset=24, nullable=true, nullIndicatorByte=0, nullIndicatorBit=4, slotIdx=4, stats=ColumnStats{avgSerializedSize_=8.0, maxSize_=8, numDistinct_=118118152, numNulls_=-1}}, SlotDescriptor{id=3, path=default.internet_usages.created_at, type=TIMESTAMP, materialized=true, byteSize=16, byteOffset=40, nullable=true, nullIndicatorByte=0, nullIndicatorBit=6, slotIdx=6, stats=ColumnStats{avgSerializedSize_=16.0, maxSize_=16, numDistinct_=3523511296, numNulls_=-1}}, SlotDescriptor{id=4, path=default.internet_usages.updated_at, type=TIMESTAMP, materialized=true, byteSize=16, byteOffset=56, nullable=true, nullIndicatorByte=0, nullIndicatorBit=7, slotIdx=7, stats=ColumnStats{avgSerializedSize_=16.0, maxSize_=16, numDistinct_=3523511296, numNulls_=-1}}, SlotDescriptor{id=5, path=default.internet_usages.enabled_service_span_id, type=BIGINT, materialized=true, byteSize=8, byteOffset=32, nullable=true, nullIndicatorByte=0, nullIndicatorBit=5, slotIdx=5, stats=ColumnStats{avgSerializedSize_=8.0, maxSize_=8, numDistinct_=349686, numNulls_=-1}}, SlotDescriptor{id=6, path=default.internet_usages.purchase_id_mod_100_part, type=TINYINT, materialized=true, byteSize=1, byteOffset=1, nullable=true, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0, stats=ColumnStats{avgSerializedSize_=1.0, maxSize_=1, numDistinct_=100, numNulls_=0}}, SlotDescriptor{id=7, path=default.internet_usages.month_number_part, type=TINYINT, materialized=true, byteSize=1, byteOffset=2, nullable=true, nullIndicatorByte=0, nullIndicatorBit=1, slotIdx=1, stats=ColumnStats{avgSerializedSize_=1.0, maxSize_=1, numDistinct_=12, numNulls_=0}}]}
I1123 12:04:31.517395 12472 Planner.java:138] resultexprs: SlotRef{path=id, type=BIGINT, id=0} SlotRef{path=purchase_id, type=BIGINT, id=1} SlotRef{path=current_usage, type=BIGINT, id=2} SlotRef{path=created_at, type=TIMESTAMP, id=3} SlotRef{path=updated_at, type=TIMESTAMP, id=4} SlotRef{path=enabled_service_span_id, type=BIGINT, id=5} SlotRef{path=purchase_id_mod_100_part, type=TINYINT, id=6} SlotRef{path=month_number_part, type=TINYINT, id=7}
I1123 12:04:31.517426 12472 Planner.java:139] finalize plan fragments
I1123 12:04:31.518618 12472 Frontend.java:999] get scan range locations
I1123 12:04:31.518740 12472 Planner.java:339] Estimated per-host peak memory requirement: 0
I1123 12:04:31.518777 12472 Planner.java:340] Estimated per-host virtual cores requirement: 0
I1123 12:04:31.522639 12472 Frontend.java:1077] create result set metadata
I1123 12:04:31.522734 12472 JniFrontend.java:151] Estimated Per-Host Requirements: Memory=0B VCores=0

00:SCAN HDFS [default.internet_usages]
partitions=1200/1200 files=1160171 size=131.63GB
table stats: 3488886438 rows total
column stats: all
limit: 1
hosts=6 per-host-mem=unavailable
tuple-ids=0 row-size=66B cardinality=1
I1123 12:04:35.428925 12472 RequestPoolService.java:277] resolveRequestPool(pool=, user=etl_user): resolved_pool=root.etl_user, has_access=true
I1123 12:04:36.630676 12472 RequestPoolService.java:375] getPoolConfig(pool=root.etl_user): max_mem_resources=-1, max_requests=-1, max_queued=200, queue_timeout_ms=0, default_query_options=
I1123 12:04:36.630795 12472 admission-controller.cc:439] Schedule for id=bd4414f7d03e553f:e25876e000000000 in pool_name=root.etl_user cluster_mem_needed=4.00 GB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
I1123 12:04:36.630810 12472 admission-controller.cc:444] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0, local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0)
I1123 12:04:36.630816 12472 admission-controller.cc:450] Admitted query id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:36.630884 12472 coordinator.cc:434] Exec() query_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:37.247411 12472 plan-fragment-executor.cc:98] Prepare(): query_id=bd4414f7d03e553f:e25876e000000000 instance_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:37.401240 12472 plan-fragment-executor.cc:215] descriptor table for fragment=bd4414f7d03e553f:e25876e000000000
tuples:
Tuple(id=0 size=72 slots=[Slot(id=0 type=BIGINT col_path=[2] offset=8 null=(offset=0 mask=4) slot_idx=2 field_idx=-1), Slot(id=1 type=BIGINT col_path=[3] offset=16 null=(offset=0 mask=8) slot_idx=3 field_idx=-1), Slot(id=2 type=BIGINT col_path=[4] offset=24 null=(offset=0 mask=10) slot_idx=4 field_idx=-1), Slot(id=3 type=TIMESTAMP col_path=[5] offset=40 null=(offset=0 mask=40) slot_idx=6 field_idx=-1), Slot(id=4 type=TIMESTAMP col_path=[6] offset=56 null=(offset=0 mask=80) slot_idx=7 field_idx=-1), Slot(id=5 type=BIGINT col_path=[7] offset=32 null=(offset=0 mask=20) slot_idx=5 field_idx=-1), Slot(id=6 type=TINYINT col_path=[0] offset=1 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=7 type=TINYINT col_path=[1] offset=2 null=(offset=0 mask=2) slot_idx=1 field_idx=-1)] tuple_path=[])
I1123 12:04:37.401443 12472 runtime-state.cc:215] Error from query bd4414f7d03e553f:e25876e000000000: Unknown disk id. This will negatively affect performance. Check your hdfs settings to enable block location metadata.
I1123 12:04:40.887898 12472 status.cc:114] Disabled by query option.
@ 0x84eab9 (unknown)
@ 0xc7fef5 (unknown)
@ 0xc63c3a (unknown)
@ 0xdfb5c0 (unknown)
@ 0xde8859 (unknown)
@ 0xb268b5 (unknown)
@ 0xb29344 (unknown)
@ 0xad48f6 (unknown)
@ 0xada168 (unknown)
@ 0xb197ac (unknown)
@ 0xd560e5 (unknown)
@ 0xd593e4 (unknown)
@ 0x81c3bc (unknown)
@ 0x1b514fb (unknown)
@ 0x1b39029 (unknown)
@ 0xa0efc9 (unknown)
@ 0xa0fa22 (unknown)
@ 0xbf6069 (unknown)
@ 0xbf6a04 (unknown)
@ 0xe5cd2a (unknown)
@ 0x7f80961b3dc5 start_thread
@ 0x7f8095ee121d __clone
I1123 12:04:41.080906 20576 plan-fragment-executor.cc:327] Open(): instance_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:41.185772 12472 impala-beeswax-server.cc:310] get_results_metadata(): query_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.239852 12472 plan-fragment-executor.cc:485] Finished executing fragment query_id=bd4414f7d03e553f:e25876e000000000 instance_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.241802 12472 plan-fragment-executor.cc:485] Finished executing fragment query_id=bd4414f7d03e553f:e25876e000000000 instance_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.241833 12472 data-stream-mgr.cc:258] cancelling all streams for fragment=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.241852 12472 coordinator.cc:1078] All backends finished successfully.
I1123 12:04:43.246424 12472 impala-beeswax-server.cc:352] close(): query_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.246454 12472 impala-server.cc:920] UnregisterQuery(): query_id=bd4414f7d03e553f:e25876e000000000
I1123 12:04:43.246461 12472 impala-server.cc:1006] Cancel(): query_id=bd4414f7d03e553f:e25876e000000000

 

When I get the error, the last messages in the Hive metastore logs are:

2017-11-23 12:24:33,230 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-92]: ugi=impala      ip=10.113.8.167 cmd=source:10.113.8.167 get_partitions_by_names : db=default tbl=internet_usages

2017-11-23 12:24:52,951 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-92]: </PERFLOG method=get_partitions_by_names start=1511439873230 end=1511439892951 duration=19721 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=92 retryCount=0 error=false>
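
A single get_partitions_by_names call taking almost 20 seconds (duration=19721 ms above) suggests the metastore is also struggling with this table. One way to see where the weight sits, sketched here with the table name from above, is to list per-partition stats:

-- per-partition row/file/size stats; the successful plan above reports
-- partitions=1200/1200 files=1160171, i.e. roughly 970 files per partition
SHOW TABLE STATS default.internet_usages;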

 

 


2 REPLIES

Re: Error communicating with impalad: TSocket read 0 bytes

Champion

@Szczechla

 

A similar issue (the "Resource temporarily unavailable" error message from the Impala shell) has been discussed in the link below. It may help you.

 

Note: Your log shows the issue is due to "Resource temporarily unavailable".

 

https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Resource-temporarily-unavailable-error...

Re: Error communicating with impalad: TSocket read 0 bytes (Accepted Solution)

Explorer

The problem was probably caused by a table with over 1 million small files in HDFS. When the INVALIDATE METADATA command was run in Impala, the metadata refresh took over 15 minutes. After compacting the table, the problem disappeared.
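
For reference, a minimal compaction sketch, assuming it is acceptable to rewrite the data through Impala (the staging-table name is illustrative; the column and partition names come from the plan above):

-- copy the schema, including the partitioning scheme
CREATE TABLE internet_usages_compacted LIKE internet_usages;
-- dynamic-partition insert; partition columns go last in the select list
INSERT OVERWRITE TABLE internet_usages_compacted
PARTITION (purchase_id_mod_100_part, month_number_part)
SELECT id, purchase_id, current_usage, created_at, updated_at,
       enabled_service_span_id,
       purchase_id_mod_100_part, month_number_part
FROM internet_usages;
-- verify row counts match, then swap the tables and drop the old one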
