Impala: "Cancelled due to unreachable impalad(s)" when using \0 as line separator in text file/table

Explorer

Whenever I use \0 as the line terminator in a textfile-based Impala table, querying the table seems to crash Impala. Please help.

 

To reproduce:

create table tab_separated(id bigint, s string, n int, t timestamp, b boolean)
  row format delimited
  fields terminated by '\t' escaped by '\\' lines terminated by '\000'
  stored as textfile;
-- Success.

select * from tab_separated; -- Done. 0 results.

insert into tab_separated (id, s) values (100, ''); -- Success.

select * from tab_separated; -- 20 second delay before getting "Cancelled due to unreachable impalad(s): xxxx:22000"

2 ACCEPTED SOLUTIONS

Explorer

I should probably mention that the query is initiated from the Hue UI on a node called xxx-02.mycompany.com, and the error mentions being unable to reach xxx-01.mycompany.com or xxx-03.mycompany.com. All three are part of the same cluster.

If I change "lines terminated by" to \n or \001, everything works fine. Hmm, maybe I'll use \001 as a workaround for now! Maybe \0 on Linux means something different (like an abrupt EOF) when Impala reads text files...
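For reference, the workaround described above can be sketched as the same reproduction with only the line terminator changed. The table name tab_separated_ctrl_a is a hypothetical one chosen for illustration; everything else mirrors the original statements, and the behaviour noted in the comments is what was reported in this thread, not something guaranteed for every Impala version.

```sql
-- Workaround sketch: same schema as the failing table, but with '\001'
-- as the line terminator instead of '\000'.
-- tab_separated_ctrl_a is a hypothetical table name.
create table tab_separated_ctrl_a(id bigint, s string, n int, t timestamp, b boolean)
  row format delimited
  fields terminated by '\t' escaped by '\\' lines terminated by '\001'
  stored as textfile;

insert into tab_separated_ctrl_a (id, s) values (100, '');

-- Per the report in this thread, the select completes normally
-- when the terminator is '\001' or '\n'.
select * from tab_separated_ctrl_a;
```

Data already written to the original table with '\000' terminators would still need to be rewritten into the new table; this sketch only covers newly inserted rows.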


I was able to reproduce it myself on several versions of CDH. I filed a bug report to track it: https://issues.apache.org/jira/browse/IMPALA-6389 . Thank you very much for letting us know about this.

8 REPLIES


Thanks for letting us know about this, and for the clear steps to reproduce.

I wasn't able to reproduce the exact behaviour on my development version of Impala. Which version of Impala are you seeing this in, so that I can try to reproduce it?
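One quick way to answer the version question, assuming a working connection through impala-shell or Hue, is Impala's built-in version() function, which returns the build string of the impalad that handles the query:

```sql
-- Returns the impalad version and build information as a single string.
select version();
```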

Explorer

Hope this helps.

 

impalad version 2.8.0-cdh5.11.0 RELEASE (build e09660de6b503a15f07e84b99b63e8e745854c34)
Built on Wed Apr  5 19:51:24 PDT 2017

 

Cpu Info:
  Model: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
  Cores: 40
  L1 Cache: 32.00 KB (Line: 64.00 B)
  L2 Cache: 256.00 KB (Line: 64.00 B)
  L3 Cache: 25.00 MB (Line: 64.00 B)
  Hardware Supports:
    ssse3
    sse4_1
    sse4_2
    popcnt
    avx
    avx2
 Physical Memory: 251.99 GB
 Disk Info: 
  Num disks 14: 
    sdm (rotational=true)
    sda (rotational=true)
    sdb (rotational=true)
    sdd (rotational=true)
    sdc (rotational=true)
    sde (rotational=true)
    sdg (rotational=true)
    sdf (rotational=true)
    sdk (rotational=true)
    sdj (rotational=true)
    sdi (rotational=true)
    sdh (rotational=true)
    sdl (rotational=true)
    dm- (rotational=true)

 

OS version: Linux version 2.6.32-431.el6.x86_64 (mockbuild@ca-build44.us.oracle.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Wed Nov 20 23:56:07 PST 2013
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

 

I0110 21:19:58.796617 32064 impala-hs2-server.cc:219] TExecuteStatementReq: TExecuteStatementReq {
  01: sessionHandle (struct) = TSessionHandle {
    01: sessionId (struct) = THandleIdentifier {
      01: guid (string) = "\xc0\xf6SJ\xa3\xfcE\xd7\xba\xde\x1ea\x18\xfcq\xe5",
      02: secret (string) = "\x1c\xee\xc4\xd0(FL*\xa5*T\x9ek\xe2\x0ez",
    },
  },
  02: statement (string) = "select * from tab_separated",
  03: confOverlay (map) = map<string,string>[2] {
    "QUERY_TIMEOUT_S" -> "600",
    "impala.resultset.cache.size" -> "100000",
  },
  04: runAsync (bool) = true,
}
I0110 21:19:58.796835 32064 impala-hs2-server.cc:252] TClientRequest.queryOptions: TQueryOptions {
  01: abort_on_error (bool) = false,
  02: max_errors (i32) = 0,
  03: disable_codegen (bool) = false,
  04: batch_size (i32) = 0,
  05: num_nodes (i32) = 0,
  06: max_scan_range_length (i64) = 0,
  07: num_scanner_threads (i32) = 0,
  08: max_io_buffers (i32) = 0,
  09: allow_unsupported_formats (bool) = false,
  10: default_order_by_limit (i64) = -1,
  11: debug_action (string) = "",
  12: mem_limit (i64) = 0,
  13: abort_on_default_limit_exceeded (bool) = false,
  15: hbase_caching (i32) = 0,
  16: hbase_cache_blocks (bool) = false,
  17: parquet_file_size (i64) = 0,
  18: explain_level (i32) = 1,
  19: sync_ddl (bool) = false,
  23: disable_cached_reads (bool) = false,
  24: disable_outermost_topn (bool) = false,
  25: rm_initial_mem (i64) = 0,
  26: query_timeout_s (i32) = 600,
  28: appx_count_distinct (bool) = false,
  29: disable_unsafe_spills (bool) = false,
  31: exec_single_node_rows_threshold (i32) = 100,
  32: optimize_partition_key_scans (bool) = false,
  33: replica_preference (i32) = 0,
  34: schedule_random_replica (bool) = false,
  35: scan_node_codegen_threshold (i64) = 1800000,
  36: disable_streaming_preaggregations (bool) = false,
  37: runtime_filter_mode (i32) = 2,
  38: runtime_bloom_filter_size (i32) = 1048576,
  39: runtime_filter_wait_time_ms (i32) = 0,
  40: disable_row_runtime_filtering (bool) = false,
  41: max_num_runtime_filters (i32) = 10,
  42: parquet_annotate_strings_utf8 (bool) = false,
  43: parquet_fallback_schema_resolution (i32) = 0,
  45: s3_skip_insert_staging (bool) = true,
  46: runtime_filter_min_size (i32) = 1048576,
  47: runtime_filter_max_size (i32) = 16777216,
  48: prefetch_mode (i32) = 1,
  49: strict_mode (bool) = false,
  50: scratch_limit (i64) = -1,
  51: enable_expr_rewrites (bool) = true,
  52: decimal_v2 (bool) = false,
  53: parquet_array_resolution (i32) = 2,
}
I0110 21:19:58.801021 32064 Frontend.java:890] Compiling query: select * from tab_separated
I0110 21:19:58.801923 32064 Frontend.java:921] Compiled query.
I0110 21:19:58.805642 32064 coordinator.cc:441] Exec() query_id=d34914b27029b479:5514eb1e00000000 stmt=select * from tab_separated
I0110 21:19:58.806084 32064 coordinator.cc:592] starting 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808128 36488 fragment-mgr.cc:40] ExecPlanFragment() instance_id=d34914b27029b479:5514eb1e00000000 coord=<##REDACTED##>:22000
I0110 21:19:58.808372 11220 plan-fragment-executor.cc:119] Prepare(): query_id=d34914b27029b479:5514eb1e00000000 instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808653 32064 coordinator.cc:630] started 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808758 11220 plan-fragment-executor.cc:175] descriptor table for fragment=d34914b27029b479:5514eb1e00000000
tuples:
Tuple(id=0 size=46 slots=[Slot(id=0 type=BIGINT col_path=[0] offset=32 null=(offset=45 mask=4) slot_idx=2 field_idx=-1), Slot(id=1 type=STRING col_path=[1] offset=0 null=(offset=45 mask=1) slot_idx=0 field_idx=-1), Slot(id=2 type=INT col_path=[2] offset=40 null=(offset=45 mask=8) slot_idx=3 field_idx=-1), Slot(id=3 type=TIMESTAMP col_path=[3] offset=16 null=(offset=45 mask=2) slot_idx=1 field_idx=-1), Slot(id=4 type=BOOLEAN col_path=[4] offset=44 null=(offset=45 mask=10) slot_idx=4 field_idx=-1)] tuple_path=[])
I0110 21:19:58.866487 11220 plan-fragment-executor.cc:300] Open(): instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.867712 32064 impala-server.cc:895] Query d34914b27029b479:5514eb1e00000000 has timeout of 10m
I0110 21:19:58.868438 32064 impala-hs2-server.cc:477] ExecuteStatement(): return_val=TExecuteStatementResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: operationHandle (struct) = TOperationHandle {
    01: operationId (struct) = THandleIdentifier {
      01: guid (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U",
      02: secret (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U",
    },
    02: operationType (i32) = 0,
    03: hasResultSet (bool) = true,
  },
}
I0110 21:19:59.821496 32330 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:19:59.821506 32326 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:19:59.821622 32330 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
I0110 21:19:59.821624 32326 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
I0110 21:19:59.821790 35533 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821801 35533 plan-fragment-executor.cc:475] Cancel(): instance_id=5642f1f17845f4b7:2
I0110 21:19:59.821811 35533 data-stream-mgr.cc:258] cancelling all streams for fragment=5642f1f17845f4b7:2
I0110 21:19:59.821802 34441 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821823 34441 plan-fragment-executor.cc:475] Cancel(): instance_id=734654c895329e72:1
I0110 21:19:59.821842 34441 data-stream-mgr.cc:258] cancelling all streams for fragment=734654c895329e72:1
I0110 21:20:00.028481 18960 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:20:00.028571 18960 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
I0110 21:20:00.028717  2082 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:20:00.028738  2082 plan-fragment-executor.cc:475] Cancel(): instance_id=f442c4a5c9b4c8e8:1
I0110 21:20:00.028754  2082 data-stream-mgr.cc:258] cancelling all streams for fragment=f442c4a5c9b4c8e8:1

 

 

Explorer
I don't think it's a timeout issue, because there is only one record in this test table.

@1stSolo thanks for the info, I'll look into it further. It definitely looks like a bug that causes an Impala crash, so I want to get to the bottom of it. Your workaround of using a different terminator should work.

Explorer
Thank you. I think it's worth mentioning in the ticket/bug report that if another character (e.g. \001 or \n) is used instead of "lines terminated by '\000'", there's no crash.


That is a good suggestion. I went ahead and added it to the ticket.