
ImpalaRuntimeException: Unable to initialize the Kudu scan node

Explorer

Hi Cloudera gurus,

This is my CDP environment (screenshot: cdp_env.png):

3 master nodes + 3 worker nodes

HA is enabled and I am testing it.

Here is the issue: when I shut down Master 2, some queries fail at random with this error:

# impala-shell -i haproxy-server.com -q "use dbschema; select * from table_foo limit 10;"
Starting Impala Shell without Kerberos authentication
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to haproxy-server.com:21000
Connected to haproxy-server.com:21000
Server version: impalad version 3.4.0-SNAPSHOT RELEASE (build 0cadcf7ac76ecec87d9786048db3672c37d41c6f)
Query: use dbschema
Query: select * from table_foo limit 10
Query submitted at: 2022-03-23 11:28:37 (Coordinator: http://worker1:25000)
ERROR: ImpalaRuntimeException: Unable to initialize the Kudu scan node
CAUSED BY: AnalysisException: Unable to open the Kudu table: dbschema.table_foo
CAUSED BY: NonRecoverableException: cannot complete before timeout: KuduRpc(method=GetTableSchema, tablet=Kudu Master, attempt=1, TimeoutTracker(timeout=180000, elapsed=180004), Trace Summary(0 ms): Sent(1), Received(0), Delayed(0), MasterRefresh(0), AuthRefresh(0), Truncated: false
Sent: (master-192.168.1.10:7051, [ GetTableSchema, 1 ]))

Could not execute command: select * from table_foo limit 10
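
Since only some runs fail, it may help to see whether the failures line up with a particular coordinator behind the proxy. A rough sketch, not a definitive tool: `summarize_run` is a hypothetical helper (not part of impala-shell) that reads one impala-shell transcript and prints the coordinator plus OK or FAIL. The 20 runs in the usage comment are arbitrary, and each failing run may take the full timeout.

```shell
#!/bin/sh
# Hypothetical helper: read one impala-shell transcript on stdin and
# print "<coordinator> OK" or "<coordinator> FAIL".
summarize_run() {
    awk '
        /Coordinator: / {
            # Keep only the text between "Coordinator: " and the closing ")".
            sub(/.*Coordinator: /, "")
            sub(/\).*/, "")
            coord = $0
        }
        /^ERROR:/ { failed = 1 }
        END {
            printf "%s %s\n", (coord == "" ? "unknown" : coord), (failed ? "FAIL" : "OK")
        }
    '
}

# Usage sketch (same host and query as above):
# for i in $(seq 1 20); do
#     impala-shell -i haproxy-server.com \
#         -q "use dbschema; select * from table_foo limit 10;" 2>&1 | summarize_run
# done | sort | uniq -c
```

If every FAIL comes from the same coordinator, that daemon is the place to look first.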

All leaders are correctly rebalanced to other nodes, and the cluster is at least partly working, because most queries succeed.

Does anyone have a clue? I was thinking about the Hive server, but I'm not sure how to trace it.

Note: CM runs on Master 2, so it is unavailable during the test. This does not seem to matter: in other tests with CM out of service, queries worked fine.

Note 2: does it matter that one of the Kudu masters was on Master 2?
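
One thing that could be checked here, as an assumption and not a confirmed cause: the error above shows the request being sent to a single master address, so it may be worth looking at which masters the table records in its `kudu.master_addresses` table property. The sketch below uses two illustrative helpers, `show_kudu_masters` and `count_masters`; the host names and port in the last line are placeholders.

```shell
#!/bin/sh
# Keep only the line(s) of a DESCRIBE FORMATTED dump that carry the Kudu master list.
show_kudu_masters() {
    grep -i 'kudu\.master_addresses'
}

# Count the entries in a comma-separated master list, e.g. "m1:7051,m2:7051".
count_masters() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Usage sketch (host and table taken from the command above):
# impala-shell -i haproxy-server.com \
#     -q "describe formatted dbschema.table_foo" 2>/dev/null | show_kudu_masters

# Placeholder list, only to show the counting helper.
count_masters "master1.server.com:7051,master2.server.com:7051,master3.server.com:7051"
```

A list with fewer entries than there are masters would be something to look into.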

Many thanks in advance for your help.

Best regards

3 REPLIES

Rising Star

Hello @Juanes,

Could you please check the ksck report from Kudu? If any tables are unhealthy, please verify their replicas as well.

Please refer to doc[1].

 

doc[1]: https://kudu.apache.org/docs/administration.html#tablet_majority_down_recovery
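
For reference, a sketch of how the report can be produced. `kudu cluster ksck` takes the comma-separated master addresses as its argument; the host names below are placeholders, `join_masters` is just an illustrative helper, and running as the `kudu` user is an assumption about the setup.

```shell
#!/bin/sh
# Join host names into the comma-separated list that ksck takes.
join_masters() {
    old_ifs=$IFS
    IFS=,
    # "$*" joins the arguments with the first character of IFS.
    printf '%s\n' "$*"
    IFS=$old_ifs
}

MASTERS=$(join_masters master1.server.com master2.server.com master3.server.com)
echo "kudu cluster ksck $MASTERS"

# To actually run it (typically as the kudu service user):
# sudo -u kudu kudu cluster ksck "$MASTERS"
```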

Thanks, 

Explorer

Hello,

ksck shows that the tables are OK (Recovering, Under-replicated and Unavailable are all 0):

W0324 12:15:41.325619 18080 negotiation.cc:313] Failed RPC negotiation. Trace:

0324 12:15:40.627405 (+     0us) reactor.cc:609] Submitting negotiation task for client connection to master2:7051
0324 12:15:40.627616 (+   211us) negotiation.cc:98] Waiting for socket to connect
0324 12:15:41.325243 (+697627us) negotiation.cc:304] Negotiation complete: Network error: Client connection negotiation failed: client connection to master2:7051: connect: No route to host (error 113)
Metrics: {"client-negotiator.queue_time_us":187,"thread_start_us":157,"threads_started":1}
W0324 12:15:44.329090 18080 negotiation.cc:313] Failed RPC negotiation. Trace:
0324 12:15:41.325954 (+     0us) reactor.cc:609] Submitting negotiation task for client connection to master2:7051
0324 12:15:41.326027 (+    73us) negotiation.cc:98] Waiting for socket to connect
0324 12:15:44.329031 (+3003004us) negotiation.cc:304] Negotiation complete: Timed out: Client connection negotiation failed: client connection to master2:7051: Timeout exceeded waiting to connect
Metrics: {"client-negotiator.queue_time_us":38}
W0324 12:15:44.331089 18080 negotiation.cc:313] Failed RPC negotiation. Trace:
0324 12:15:44.329518 (+     0us) reactor.cc:609] Submitting negotiation task for client connection to master2:7051
0324 12:15:44.329580 (+    62us) negotiation.cc:98] Waiting for socket to connect
0324 12:15:44.331065 (+  1485us) negotiation.cc:304] Negotiation complete: Network error: Client connection negotiation failed: client connection to master2:7051: connect: No route to host (error 113)
Metrics: {"client-negotiator.queue_time_us":36}
Master Summary
                  UUID                  |          Address           |   Status
----------------------------------------+----------------------------+-------------
 8a168b68a6dd487c952419672ba32088       | master3.server.com | HEALTHY
 daa9129e78244be2aaa7e5e649cc1dc8       | master1.server.com | HEALTHY
 <unknown> (master2.server.com) | master2.server.com | UNAVAILABLE
Error from master2.server.com: Network error: Client connection negotiation failed: client connection to master2:7051: connect: No route to host (error 113) (UNAVAILABLE)
All reported replicas are:
  A = daa9129e78244be2aaa7e5e649cc1dc8
  B = <unknown> (master2.server.com)
  C = 8a168b68a6dd487c952419672ba32088
  D = 1f02c618009c44d381c55841dcb5a498
The consensus matrix is:
 Config source |        Replicas        | Current term | Config index | Committed?
---------------+------------------------+--------------+--------------+------------
 A             | A*      C   D          | 88           | -1           | Yes
 B             | [config not available] |              |              |
 C             | A*      C   D          | 88           | -1           | Yes
Flags of checked categories for Master:
        Flag         |                            Value                            |                         Master
---------------------+-------------------------------------------------------------+--------------------------------------------------------
 builtin_ntp_servers | 0.pool.ntp.org,1.pool.ntp.org,2.pool.ntp.org,3.pool.ntp.org | master1.server.com, master3.server.com
 time_source         | system                                                      | master1.server.com, master3.server.com
Tablet Server Summary
               UUID               |             Address             | Status  | Location | Tablet Leaders | Active Scanners
----------------------------------+---------------------------------+---------+----------+----------------+-----------------
 0dcf7da9a99c43dd99f956a95abb773d | worker3.server.com:7050 | HEALTHY | /default |      85        |       0
 345166476e6440b4a957d76dd1de947e | worker2.server.com:7050 | HEALTHY | /default |     314        |       0
 9e30289cb8d244e0a7fb822897fc7c29 | worker1.server.com:7050 | HEALTHY | /default |    1051        |       0
Tablet Server Location Summary
 Location |  Count
----------+---------
 /default |       3

Flags of checked categories for Tablet Server:
        Flag         |                            Value                            |      Tablet Server
---------------------+-------------------------------------------------------------+-------------------------
 builtin_ntp_servers | 0.pool.ntp.org,1.pool.ntp.org,2.pool.ntp.org,3.pool.ntp.org | all 3 server(s) checked
 time_source         | system                                                      | all 3 server(s) checked

Version Summary
      Version       |                                                               Servers
--------------------+--------------------------------------------------------------------------------------------------------------------------------------
 1.13.0.7.1.6.0-297 | master@master3.server.com, master@master1.server.com, tserver@worker3.server.com:7050, and 2 other server(s)

Tablet Summary
Summary by table
                                  Name                                   | RF | Status  | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
-------------------------------------------------------------------------+----+---------+---------------+---------+------------+------------------+-------------

Tablet Replica Count Summary
   Statistic    | Replica Count
----------------+---------------
 Minimum        | 1450
 First Quartile | 1450
 Median         | 1450
 Third Quartile | 1450
 Maximum        | 1450

Total Count Summary
                | Total Count
----------------+-------------
 Masters        | 3
 Tablet Servers | 3
 Tables         | 109
 Tablets        | 1450
 Replicas       | 4350

==================
Warnings:
==================
master unusual flags check error: 1 of 3 masters were not available to retrieve unusual flags
master diverged flags check error: 1 of 3 masters were not available to retrieve time_source category flags

==================
Errors:
==================
Network error: error fetching info from masters: failed to gather info from all masters: 1 of 3 had errors
Corruption: master consensus error: there are master consensus conflicts

 

What is not clear to me is why I'm getting a consensus error when 2 of 3 masters and all 3 tablet servers are up.
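
A quick check of the arithmetic behind that question, as a sketch rather than anything Kudu-specific: Raft needs a strict majority of voters, so with 3 masters the threshold is 2. The `majority` and `has_quorum` helpers are made-up names for illustration.

```shell
#!/bin/sh
# Smallest number of voters that forms a strict majority among $1 voters.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) when $2 reachable voters out of $1 still form a majority.
has_quorum() {
    [ "$2" -ge "$(majority "$1")" ]
}

if has_quorum 3 2; then
    echo "2 of 3 masters: majority kept"
fi
```

That agrees with the consensus matrix above, where A and C report the same term (88) with A as leader. The conflict line may simply reflect that ksck could not read the config from master2, but that is an assumption, not something the output proves.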

 

 
Many thanks for your help.

Explorer

Hello,

Does anyone know whether there is a reference table for these errors?

I just want to know what this means: Unable to initialize the Kudu scan node

I found no relevant traces in the following logs:

Impala Daemon

Impala Catalog Server

Impala State Store

Kudu Master Leader

Kudu tablet

Hive Metastore

Hive Server2
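
In case it helps with the search, a minimal sketch of one way to scan several logs at once. The pattern list is a guess based on the error text quoted earlier in this thread, `find_kudu_rpc_errors` is a hypothetical helper, and the paths in the usage comment are assumptions to adjust to the actual log directories.

```shell
#!/bin/sh
# Print file name, line number and text of every line that mentions
# Kudu master RPC trouble in the given log files.
find_kudu_rpc_errors() {
    grep -H -n -E 'NonRecoverableException|GetTableSchema|cannot complete before timeout|:7051' "$@"
}

# Usage sketch (paths are assumptions):
# find_kudu_rpc_errors /var/log/impalad/impalad.INFO /var/log/catalogd/catalogd.INFO
```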

 

I'm running out of ideas 😞

 
