Impala: "Cancelled due to unreachable impalad(s)" when using \0 as line separator in text file/table

avatar
Visitor

Whenever I use \0 as the line terminator in a textfile-based Impala table, querying the table seems to crash Impala. Please help.

 

To reproduce:

 

create table tab_separated(id bigint, s string, n int, t timestamp, b boolean)
  row format delimited
  fields terminated by '\t' escaped by '\\' lines terminated by '\000'
  stored as textfile;

-- Success.

 

select * from tab_separated; -- Done. 0 results.

 

insert into tab_separated (id, s) values (100, ''); -- Success.

 

select * from tab_separated; -- 20 second delay before getting "Cancelled due to unreachable impalad(s): xxxx:22000"

 

 


avatar

Thanks for letting us know about this and the clear steps.

 

I wasn't able to reproduce the exact behaviour on my development version of Impala. Which version of Impala are you seeing this on, so that I can try to reproduce it?

avatar
Visitor

Hope this helps.

 

impalad version 2.8.0-cdh5.11.0 RELEASE (build e09660de6b503a15f07e84b99b63e8e745854c34)
Built on Wed Apr  5 19:51:24 PDT 2017

 

Cpu Info:
  Model: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
  Cores: 40
  L1 Cache: 32.00 KB (Line: 64.00 B)
  L2 Cache: 256.00 KB (Line: 64.00 B)
  L3 Cache: 25.00 MB (Line: 64.00 B)
  Hardware Supports:
    ssse3
    sse4_1
    sse4_2
    popcnt
    avx
    avx2
 Physical Memory: 251.99 GB
 Disk Info: 
  Num disks 14: 
    sdm (rotational=true)
    sda (rotational=true)
    sdb (rotational=true)
    sdd (rotational=true)
    sdc (rotational=true)
    sde (rotational=true)
    sdg (rotational=true)
    sdf (rotational=true)
    sdk (rotational=true)
    sdj (rotational=true)
    sdi (rotational=true)
    sdh (rotational=true)
    sdl (rotational=true)
    dm- (rotational=true)

 

OS version: Linux version 2.6.32-431.el6.x86_64 (mockbuild@ca-build44.us.oracle.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Wed Nov 20 23:56:07 PST 2013
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

 

I0110 21:19:58.796617 32064 impala-hs2-server.cc:219] TExecuteStatementReq: TExecuteStatementReq {
  01: sessionHandle (struct) = TSessionHandle {
    01: sessionId (struct) = THandleIdentifier {
      01: guid (string) = "\xc0\xf6SJ\xa3\xfcE\xd7\xba\xde\x1ea\x18\xfcq\xe5",
      02: secret (string) = "\x1c\xee\xc4\xd0(FL*\xa5*T\x9ek\xe2\x0ez",
    },
  },
  02: statement (string) = "select * from tab_separated",
  03: confOverlay (map) = map<string,string>[2] {
    "QUERY_TIMEOUT_S" -> "600",
    "impala.resultset.cache.size" -> "100000",
  },
  04: runAsync (bool) = true,
}
I0110 21:19:58.796835 32064 impala-hs2-server.cc:252] TClientRequest.queryOptions: TQueryOptions {
  01: abort_on_error (bool) = false,
  02: max_errors (i32) = 0,
  03: disable_codegen (bool) = false,
  04: batch_size (i32) = 0,
  05: num_nodes (i32) = 0,
  06: max_scan_range_length (i64) = 0,
  07: num_scanner_threads (i32) = 0,
  08: max_io_buffers (i32) = 0,
  09: allow_unsupported_formats (bool) = false,
  10: default_order_by_limit (i64) = -1,
  11: debug_action (string) = "",
  12: mem_limit (i64) = 0,
  13: abort_on_default_limit_exceeded (bool) = false,
  15: hbase_caching (i32) = 0,
  16: hbase_cache_blocks (bool) = false,
  17: parquet_file_size (i64) = 0,
  18: explain_level (i32) = 1,
  19: sync_ddl (bool) = false,
  23: disable_cached_reads (bool) = false,
  24: disable_outermost_topn (bool) = false,
  25: rm_initial_mem (i64) = 0,
  26: query_timeout_s (i32) = 600,
  28: appx_count_distinct (bool) = false,
  29: disable_unsafe_spills (bool) = false,
  31: exec_single_node_rows_threshold (i32) = 100,
  32: optimize_partition_key_scans (bool) = false,
  33: replica_preference (i32) = 0,
  34: schedule_random_replica (bool) = false,
  35: scan_node_codegen_threshold (i64) = 1800000,
  36: disable_streaming_preaggregations (bool) = false,
  37: runtime_filter_mode (i32) = 2,
  38: runtime_bloom_filter_size (i32) = 1048576,
  39: runtime_filter_wait_time_ms (i32) = 0,
  40: disable_row_runtime_filtering (bool) = false,
  41: max_num_runtime_filters (i32) = 10,
  42: parquet_annotate_strings_utf8 (bool) = false,
  43: parquet_fallback_schema_resolution (i32) = 0,
  45: s3_skip_insert_staging (bool) = true,
  46: runtime_filter_min_size (i32) = 1048576,
  47: runtime_filter_max_size (i32) = 16777216,
  48: prefetch_mode (i32) = 1,
  49: strict_mode (bool) = false,
  50: scratch_limit (i64) = -1,
  51: enable_expr_rewrites (bool) = true,
  52: decimal_v2 (bool) = false,
  53: parquet_array_resolution (i32) = 2,
}
I0110 21:19:58.801021 32064 Frontend.java:890] Compiling query: select * from tab_separated
I0110 21:19:58.801923 32064 Frontend.java:921] Compiled query.
I0110 21:19:58.805642 32064 coordinator.cc:441] Exec() query_id=d34914b27029b479:5514eb1e00000000 stmt=select * from tab_separated
I0110 21:19:58.806084 32064 coordinator.cc:592] starting 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808128 36488 fragment-mgr.cc:40] ExecPlanFragment() instance_id=d34914b27029b479:5514eb1e00000000 coord=<##REDACTED##>:22000
I0110 21:19:58.808372 11220 plan-fragment-executor.cc:119] Prepare(): query_id=d34914b27029b479:5514eb1e00000000 instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808653 32064 coordinator.cc:630] started 2 fragment instances for query d34914b27029b479:5514eb1e00000000
I0110 21:19:58.808758 11220 plan-fragment-executor.cc:175] descriptor table for fragment=d34914b27029b479:5514eb1e00000000
tuples:
Tuple(id=0 size=46 slots=[Slot(id=0 type=BIGINT col_path=[0] offset=32 null=(offset=45 mask=4) slot_idx=2 field_idx=-1), Slot(id=1 type=STRING col_path=[1] offset=0 null=(offset=45 mask=1) slot_idx=0 field_idx=-1), Slot(id=2 type=INT col_path=[2] offset=40 null=(offset=45 mask=8) slot_idx=3 field_idx=-1), Slot(id=3 type=TIMESTAMP col_path=[3] offset=16 null=(offset=45 mask=2) slot_idx=1 field_idx=-1), Slot(id=4 type=BOOLEAN col_path=[4] offset=44 null=(offset=45 mask=10) slot_idx=4 field_idx=-1)] tuple_path=[])
I0110 21:19:58.866487 11220 plan-fragment-executor.cc:300] Open(): instance_id=d34914b27029b479:5514eb1e00000000
I0110 21:19:58.867712 32064 impala-server.cc:895] Query d34914b27029b479:5514eb1e00000000 has timeout of 10m
I0110 21:19:58.868438 32064 impala-hs2-server.cc:477] ExecuteStatement(): return_val=TExecuteStatementResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: operationHandle (struct) = TOperationHandle {
    01: operationId (struct) = THandleIdentifier {
      01: guid (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U",
      02: secret (string) = "y\xb4)p\xb2\x14I\xd3\x00\x00\x00\x00\x1e\xeb\x14U",
    },
    02: operationType (i32) = 0,
    03: hasResultSet (bool) = true,
  },
}
I0110 21:19:59.821496 32330 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:19:59.821506 32326 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:19:59.821622 32330 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
I0110 21:19:59.821624 32326 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 5642f1f17845f4b7:2 done: false)
I0110 21:19:59.821790 35533 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821801 35533 plan-fragment-executor.cc:475] Cancel(): instance_id=5642f1f17845f4b7:2
I0110 21:19:59.821811 35533 data-stream-mgr.cc:258] cancelling all streams for fragment=5642f1f17845f4b7:2
I0110 21:19:59.821802 34441 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:19:59.821823 34441 plan-fragment-executor.cc:475] Cancel(): instance_id=734654c895329e72:1
I0110 21:19:59.821842 34441 data-stream-mgr.cc:258] cancelling all streams for fragment=734654c895329e72:1
I0110 21:20:00.028481 18960 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
    @           0x83758a  (unknown)
    @           0xaa5569  (unknown)
    @           0xd17be9  (unknown)
    @           0xd148b9  (unknown)
    @           0x80468c  (unknown)
    @           0x9f669f  (unknown)
    @           0x9f0cc9  (unknown)
    @           0x9f1722  (unknown)
    @           0xbd3d29  (unknown)
    @           0xbd4704  (unknown)
    @           0xe1db9a  (unknown)
    @       0x3efd207aa1  (unknown)
    @       0x3efcae893d  (unknown)
I0110 21:20:00.028571 18960 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: f442c4a5c9b4c8e8:1 done: false)
I0110 21:20:00.028717  2082 plan-fragment-executor.cc:461] Cancelling fragment instance...
I0110 21:20:00.028738  2082 plan-fragment-executor.cc:475] Cancel(): instance_id=f442c4a5c9b4c8e8:1
I0110 21:20:00.028754  2082 data-stream-mgr.cc:258] cancelling all streams for fragment=f442c4a5c9b4c8e8:1
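
For anyone triaging a similar log, the "unknown query ID" reports can be pulled out mechanically and compared with the query_id shown on the coordinator.cc lines. The following is only a minimal sketch: the function name unknown_query_reports and the regular expressions are my own assumptions about the line layout seen above, not part of any Impala tooling, and the sample is a shortened copy of the log in this post.

```python
import re
from collections import Counter

# Assumed glog-style prefix: level letter, MMDD, time, thread id, source file:line
LINE_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.-]+):(?P<line>\d+)\] (?P<msg>.*)$"
)
INSTANCE_RE = re.compile(r"\(instance: (?P<instance>[0-9a-f]+:[0-9a-f]+) done: \w+\)")


def unknown_query_reports(log_text):
    """Return (time, source file, instance id) for each 'unknown query ID' report."""
    hits = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw)
        # Stack-frame lines ("@ 0x...") do not match the prefix and are skipped.
        if not m or "Received report for unknown query ID" not in m.group("msg"):
            continue
        inst = INSTANCE_RE.search(m.group("msg"))
        if inst:
            hits.append((m.group("time"), m.group("src"), inst.group("instance")))
    return hits


SAMPLE = """\
I0110 21:19:59.821496 32330 status.cc:47] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
    @           0x83758a  (unknown)
I0110 21:19:59.821622 32330 impala-server.cc:1090] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled). (instance: 734654c895329e72:1 done: false)
I0110 21:20:00.028717  2082 plan-fragment-executor.cc:461] Cancelling fragment instance...
"""

reports = unknown_query_reports(SAMPLE)
per_instance = Counter(instance for _, _, instance in reports)
for instance, count in per_instance.items():
    print(instance, count)
```

Each report is logged twice in the output above (once from status.cc, once from impala-server.cc), so counting per instance ID gives a quick picture of how many distinct fragment instances were affected.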

 

 

avatar
Visitor
I don't think it's a timeout issue, because there's only 1 record in this test.

avatar
Visitor

I should probably mention that the query is initiated from the Hue UI on a node called xxx-02.mycompany.com, and the error reports xxx-01.mycompany.com or xxx-03.mycompany.com as unreachable -- all are part of the same cluster.

 

And if I change "lines terminated by" to \n or \001 then everything works fine. Hmm, maybe I'll use \001 as a workaround for now! Maybe \0 on Linux means something different (like an abrupt EOF) when Impala reads text files...
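
To illustrate the guess about \0 in general terms: on Linux a NUL byte inside a file is an ordinary byte, but code that treats a buffer as a C string stops at the first NUL. The Python sketch below shows only that general difference, using two made-up rows. Whether anything like this happens inside Impala is not established in this thread, and the helper split_rows is purely hypothetical.

```python
import ctypes

FIELD_SEP = b"\t"


def split_rows(data, line_sep):
    """Split raw file bytes into rows, then each row into fields."""
    rows = [r for r in data.split(line_sep) if r]
    return [r.split(FIELD_SEP) for r in rows]


# Two hypothetical rows, once with NUL (\000) and once with \001 as line terminator.
nul_data = b"100\tfoo\x00" + b"200\tbar\x00"
soh_data = nul_data.replace(b"\x00", b"\x01")

# A length-aware reader handles both layouts the same way.
rows_nul = split_rows(nul_data, b"\x00")
rows_soh = split_rows(soh_data, b"\x01")

# A C-string view ends at the first NUL byte, so the second row is never seen.
c_view_nul = ctypes.c_char_p(nul_data).value
c_view_soh = ctypes.c_char_p(soh_data).value

print(rows_nul == rows_soh)
print(c_view_nul)
print(c_view_soh == soh_data)
```

So the byte itself is not special to the filesystem; it only becomes a problem for code that relies on NUL-terminated strings somewhere along the read path.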

avatar

@1stSolo thanks for the info, I'll look into it further. It definitely looks like a bug causing an Impala crash so I want to get to the bottom of it. Your workaround of using a different terminator should work.

avatar

I was able to reproduce it myself on several versions of CDH. I filed a bug report to track it: https://issues.apache.org/jira/browse/IMPALA-6389 . Thank you very much for letting us know about this.

avatar
Visitor
Thank you. I think it's worth mentioning in the ticket/bug report that if, instead of "lines terminated by '\000'", another character is used, e.g. \001 or \n then there's no crash.
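
For anyone who wants to re-check this comparison on another version, here is a small sketch that prints the same repro statements once per line terminator. The table names (tab_separated_nul and so on) and the helper repro_script are made up for illustration; the statements themselves and the outcomes in the comments are the ones reported earlier in this thread for impalad 2.8.0-cdh5.11.0.

```python
# DDL copied from the original post, with the table name and terminator left open.
DDL_TEMPLATE = (
    "create table {name}(id bigint, s string, n int, t timestamp, b boolean)\n"
    "  row format delimited\n"
    "  fields terminated by '\\t' escaped by '\\\\' lines terminated by '{term}'\n"
    "  stored as textfile;"
)

# (hypothetical table suffix, terminator as written in the DDL, outcome reported in this thread)
CASES = [
    ("nul", r"\000", "select fails with: Cancelled due to unreachable impalad(s)"),
    ("soh", r"\001", "works"),
    ("lf", r"\n", "works"),
]


def repro_script(suffix, term):
    """Build the create/insert/select statements for one line terminator."""
    name = f"tab_separated_{suffix}"
    return "\n".join([
        DDL_TEMPLATE.format(name=name, term=term),
        f"insert into {name} (id, s) values (100, '');",
        f"select * from {name};",
    ])


for suffix, term, reported in CASES:
    print(f"-- terminator {term}: {reported}")
    print(repro_script(suffix, term))
    print()
```

The printed statements can be pasted into Hue or impala-shell one case at a time, which keeps the only difference between runs to the terminator itself.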

avatar

That is a good suggestion; I went ahead and did it.