Created on 01-10-2018 04:15 PM - edited 09-16-2022 05:43 AM
Whenever I use \0 as the line terminator in a textfile-based Impala table, Impala seems to crash. Please help.
To reproduce:
create table tab_separated(id bigint, s string, n int, t timestamp, b boolean)
row format delimited
fields terminated by '\t' escaped by '\\' lines terminated by '\000'
stored as textfile;
-- Success.
select * from tab_separated; -- Done. 0 results.
insert into tab_separated (id, s) values (100, ''); -- Success.
select * from tab_separated; -- 20 second delay before getting "Cancelled due to unreachable impalad(s): xxxx:22000"
Created 01-10-2018 06:12 PM
Thanks for letting us know about this and the clear steps.
I wasn't able to reproduce the exact behaviour on my development version of Impala. Which version of Impala are you seeing this on, so that I can try to reproduce it?
Created on 01-10-2018 07:29 PM - edited 01-10-2018 07:42 PM
Hope this helps.
impalad version 2.8.0-cdh5.11.0 RELEASE (build e09660de6b503a15f07e84b99b63e8e745854c34)
Built on Wed Apr 5 19:51:24 PDT 2017
Cpu Info:
  Model: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
  Cores: 40
  L1 Cache: 32.00 KB (Line: 64.00 B)
  L2 Cache: 256.00 KB (Line: 64.00 B)
  L3 Cache: 25.00 MB (Line: 64.00 B)
  Hardware Supports: ssse3 sse4_1 sse4_2 popcnt avx avx2
Physical Memory: 251.99 GB
Disk Info:
  Num disks 14: sdm (rotational=true) sda (rotational=true) sdb (rotational=true) sdd (rotational=true) sdc (rotational=true) sde (rotational=true) sdg (rotational=true) sdf (rotational=true) sdk (rotational=true) sdj (rotational=true) sdi (rotational=true) sdh (rotational=true) sdl (rotational=true) dm- (rotational=true)
OS version: Linux version 2.6.32-431.el6.x86_64 (mockbuild@ca-build44.us.oracle.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Wed Nov 20 23:56:07 PST 2013
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC
I0110 21:19:58.796617 32064 impala-hs2-server.cc:219] TExecuteStatementReq: TExecuteStatementReq { 01: sessionHandle (struct) = TSessionHandle { 01: sessionId (struct) = THandleIdentifier { 01: guid (string) = "\xc0\xf6SJ\xa3\xfcE\xd7\xba\xde\x1ea\x18\xfcq\xe5", 02: secret (string) = "\x1c\xee\xc4\xd0(FL*\xa5*T\x9ek\xe2\x0ez", }, }, 02: statement (string) = "select * from tab_separated", 03: confOverlay (map) = map<string,string>[2] { "QUERY_TIMEOUT_S" -> "600", "impala.resultset.cache.size" -> "100000", }, 04: runAsync (bool) = true, }
I0110 21:19:58.796835 32064 impala-hs2-server.cc:252] TClientRequest.queryOptions: TQueryOptions { 01: abort_on_error (bool) = false, 02: max_errors (i32) = 0, 03: disable_codegen (bool) = false, 04: batch_size (i32) = 0, 05: num_nodes (i32) = 0, 06: max_scan_range_length (i64) = 0, 07: num_scanner_threads (i32) = 0, 08: max_io_buffers (i32) = 0, 09: allow_unsupported_formats (bool) = false, 10: default_order_by_limit (i64) = -1, 11: debug_action (string) = "", 12: mem_limit (i64) = 0, 13: abort_on_default_limit_exceeded (bool) = false, 15: hbase_caching (i32) = 0, 16: hbase_cache_blocks (bool) = false, 17: parquet_file_size (i64) = 0, 18: explain_level (i32) = 1, 19: sync_ddl (bool) = false, 23: disable_cached_reads (bool) = false, 24: disable_outermost_topn (bool) = false, 25: rm_initial_mem (i64) = 0, 26: query_timeout_s (i32) = 600, 28: appx_count_distinct (bool) = false, 29: disable_unsafe_spills (bool) = false, 31: exec_single_node_rows_threshold (i32) = 100, 32: optimize_partition_key_scans (bool) = false, 33: replica_preference (i32) = 0, 34: schedule_random_replica (bool) = false, 35: scan_node_codegen_threshold (i64) = 1800000, 36: disable_streaming_preaggregations (bool) = false, 37: runtime_filter_mode (i32) = 2, 38: runtime_bloom_filter_size (i32) = 1048576, 39: runtime_filter_wait_time_ms (i32) = 0, 40: disable_row_runtime_filtering (bool) = false, 41: max_num_runtime_filters (i32) = 10, 42: parquet_annotate_strings_utf8 (bool) = false, 43: parquet_fallback_schema_resolution (i32) = 0, 45: s3_skip_insert_staging (bool) = true, 46: runtime_filter_min_size (i32) = 1048576, 47: runtime_filter_max_size (i32) = 16777216, 48: prefetch_mode (i32) = 1, 49: strict_mode (bool) = false, 50: scratch_limit (i64) = -1, 51: enable_expr_rewrites (bool) = true, 52: decimal_v2 (bool) = false, 53: parquet_array_resolution (i32) = 2, }
I0110 21:19:58.801021 32064 Frontend.java:890] Compiling query: select * from tab_separated
I0110 21:19:58.801923 32064 Frontend.java:921] Compiled query.
I0110 21:19:58.805642 32064 coordinator.cc:441] Exec() query_id=d34914b27029b479:5514eb1e00000000 stmt=select * from tab_separated
I0110 21:19:58.806084 32064 coordinator.cc:592] starting 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808128 36488 fragment-mgr.cc:40] ExecPlanFragment() instance_id=d34914b27029b479:5514eb1e00000000 coord=<##REDACTED##>:22000
I0110 21:19:58.808372 11220 plan-fragment-executor.cc:119] Prepare(): query_id=d34914b27029b479:5514eb1e00000000 instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808653 32064 coordinator.cc:630] started 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808758 11220 plan-fragment-executor.cc:175] descriptor table for fragment=d34914b27029b479:5514eb1e00000000 tuples: Tuple(id=0 size=46 slots=[Slot(id=0 type=BIGINT col_path=[0] offset=32 null=(offset=45 mask=4) slot_idx=2 field_idx=-1), Slot(id=1 type=STRING col_path=[1] offset=0 null=(offset=45 mask=1) slot_idx=0 field_idx=-1), Slot(id=2 type=INT col_path=[2] offset=40 null=(offset=45 mask=8) slot_idx=3 field_idx=-1), Slot(id=3 type=TIMESTAMP col_path=[3] offset=16 null=(offset=45 mask=2) slot_idx=1 field_idx=-1), Slot(id=4 type=BOOLEAN col_path=[4] offset=44 null=(offset=45 mask=10) slot_idx=4 field_idx=-1)] tuple_path=[])
I0110 21:19:58.866487 11220 plan-fragment-executor.cc:300] Open(): instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.867712 32064 impala-server.cc:895] Query d34914b27029b479:5514eb1e00000000 has timeout of 10m
I0110 21:19:58.868438 32064 impala-hs2-server.cc:477] ExecuteStatement(): return_val=TExecuteStatementResp { 01: status (struct) = TStatus { 01: statusCode (i32) = 0, }, 02: operationHandle (struct) = TOperationHandle { 01: operationId (struct) = THandleIdentifier { 01: guid (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U", 02: secret (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U", }, 02: operationType (i32) = 0, 03: hasResultSet (bool) = true, }, }
I0110 21:19:59.821496 32330 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
    @ 0x83758a (unknown)
    @ 0xaa5569 (unknown)
    @ 0xd17be9 (unknown)
    @ 0xd148b9 (unknown)
    @ 0x80468c (unknown)
    @ 0x9f669f (unknown)
    @ 0x9f0cc9 (unknown)
    @ 0x9f1722 (unknown)
    @ 0xbd3d29 (unknown)
    @ 0xbd4704 (unknown)
    @ 0xe1db9a (unknown)
    @ 0x3efd207aa1 (unknown)
    @ 0x3efcae893d (unknown)
I0110 21:19:59.821506 32326 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
    @ 0x83758a (unknown)
    @ 0xaa5569 (unknown)
    @ 0xd17be9 (unknown)
    @ 0xd148b9 (unknown)
    @ 0x80468c (unknown)
    @ 0x9f669f (unknown)
    @ 0x9f0cc9 (unknown)
    @ 0x9f1722 (unknown)
    @ 0xbd3d29 (unknown)
    @ 0xbd4704 (unknown)
    @ 0xe1db9a (unknown)
    @ 0x3efd207aa1 (unknown)
    @ 0x3efcae893d (unknown)
I0110 21:19:59.821622 32330 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
I0110 21:19:59.821624 32326 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
I0110 21:19:59.821790 35533 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821801 35533 plan-fragment-executor.cc:475] Cancel(): instance_id=5642f1f17845f4b7:2
I0110 21:19:59.821811 35533 data-stream-mgr.cc:258] cancelling all streams for fragment=5642f1f17845f4b7:2
I0110 21:19:59.821802 34441 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821823 34441 plan-fragment-executor.cc:475] Cancel(): instance_id=734654c895329e72:1
I0110 21:19:59.821842 34441 data-stream-mgr.cc:258] cancelling all streams for fragment=734654c895329e72:1
I0110 21:20:00.028481 18960 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
    @ 0x83758a (unknown)
    @ 0xaa5569 (unknown)
    @ 0xd17be9 (unknown)
    @ 0xd148b9 (unknown)
    @ 0x80468c (unknown)
    @ 0x9f669f (unknown)
    @ 0x9f0cc9 (unknown)
    @ 0x9f1722 (unknown)
    @ 0xbd3d29 (unknown)
    @ 0xbd4704 (unknown)
    @ 0xe1db9a (unknown)
    @ 0x3efd207aa1 (unknown)
    @ 0x3efcae893d (unknown)
I0110 21:20:00.028571 18960 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
I0110 21:20:00.028717 2082 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:20:00.028738 2082 plan-fragment-executor.cc:475] Cancel(): instance_id=f442c4a5c9b4c8e8:1
I0110 21:20:00.028754 2082 data-stream-mgr.cc:258] cancelling all streams for fragment=f442c4a5c9b4c8e8:1
Created 01-10-2018 07:30 PM
Created on 01-10-2018 07:56 PM - edited 01-10-2018 09:33 PM
I should probably mention that the query is initiated from the Hue UI on a node called xxx-02.mycompany.com, and the error mentions being unable to reach xxx-01.mycompany.com or xxx-03.mycompany.com. All three are part of the same cluster.
If I change "lines terminated by" to \n or \001, everything works fine. Hmm, maybe I'll use \001 as a workaround for now! Maybe \0 on Linux means something different (like an abrupt EOF) when Impala reads text files...
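On the guess that \0 might act like an end-of-file marker on Linux: at the file level it does not. A NUL is an ordinary byte, and a read continues past it. The short Python sketch below illustrates this with two made-up rows and a temporary file (both are assumptions for illustration, not data from the table above). It says nothing about what Impala's text scanner does with the terminator; it only rules out the operating system as the cause.

```python
import os
import tempfile


def roundtrip(payload: bytes) -> bytes:
    """Write payload to a temporary file and read it back in binary mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# Two made-up rows: tab-separated fields, each row terminated by a NUL byte.
rows = b"100\tfoo\x00" + b"101\tbar\x00"
data = roundtrip(rows)

# Splitting on NUL recovers both records; the trailing empty piece is dropped.
records = data.split(b"\x00")[:-1]
print(len(data), records)
```

If the read stopped at the first NUL, only the first record would come back; here the whole payload is returned unchanged.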
Created 01-11-2018 11:19 AM
@1stSolo thanks for the info, I'll look into it further. It definitely looks like a bug that causes an Impala crash, so I want to get to the bottom of it. Your workaround of using a different terminator should work.
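For data files that were already written with \0 as the line terminator, one way to apply the workaround is to rewrite the bytes before loading them into a table declared with lines terminated by '\001'. The sketch below is only an illustration: `change_line_terminator` is a hypothetical helper, not part of Impala, and it assumes the NUL byte appears in the file only as a row terminator (never inside an escaped field value).

```python
def change_line_terminator(
    data: bytes, old: bytes = b"\x00", new: bytes = b"\x01"
) -> bytes:
    """Return data with every `old` terminator byte replaced by `new`.

    Hypothetical helper for illustration. Raises ValueError if `new`
    already occurs in the data, because the rewritten file would then
    be split into the wrong rows.
    """
    if len(old) != 1 or len(new) != 1:
        raise ValueError("terminators must be single bytes")
    if new in data:
        raise ValueError("new terminator already present in data")
    return data.replace(old, new)


# Made-up sample: two tab-separated rows terminated by NUL.
sample = b"100\tfoo\x00101\tbar\x00"
converted = change_line_terminator(sample)
print(converted.split(b"\x01")[:-1])
```

The check for an existing \001 byte is deliberate: silently converting a file that already contains the new terminator would corrupt row boundaries, so the sketch refuses instead.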
Created 01-11-2018 01:30 PM
I was able to reproduce it myself on several versions of CDH. I filed a bug report to track it: https://issues.apache.org/jira/browse/IMPALA-6389. Thank you very much for letting us know about this.
Created 01-11-2018 03:58 PM
Created 01-11-2018 04:07 PM
That is a good suggestion. I went ahead and did it.