Member since 02-18-2019 | 83 Posts | 3 Kudos Received | 0 Solutions
11-24-2019
11:29 PM
Hello All,
We are observing the errors below on our Kudu master server.
CDH / CM = 5.16.2
Kudu 1.7.0
memory_limit_hard_bytes = 35GB
W1125 14:29:26.059293 39728 service_pool.cc:130] GetTableSchema request on kudu.master.MasterService from KUDU MASTER IP:60098 dropped due to backpressure. The service queue is full; it has 50 items.
W1125 16:12:31.944316 14678 negotiation.cc:313] Failed RPC negotiation. Trace:
1125 16:12:28.943331 (+ 0us) reactor.cc:577] Submitting negotiation task for server connection from 172.12.345.678:33960
1125 16:12:28.943495 (+ 164us) server_negotiation.cc:176] Beginning negotiation
1125 16:12:28.943498 (+ 3us) server_negotiation.cc:365] Waiting for connection header
1125 16:12:28.944585 (+ 1087us) server_negotiation.cc:373] Connection header received
1125 16:12:31.944237 (+2999652us) negotiation.cc:304] Negotiation complete: Timed out: Server connection negotiation failed: server connection from 172.12.345.678:33960
Metrics: {"server-negotiator.queue_time_us":120,"thread_start_us":54,"threads_started":1}
W1125 16:32:01.273672 26481 negotiation.cc:313] Failed RPC negotiation. Trace:
1125 16:31:58.273342 (+ 0us) reactor.cc:577] Submitting negotiation task for server connection from 172.12.345.678:34488
1125 16:31:58.273518 (+ 176us) server_negotiation.cc:176] Beginning negotiation
1125 16:31:58.273521 (+ 3us) server_negotiation.cc:365] Waiting for connection header
1125 16:31:58.274365 (+ 844us) server_negotiation.cc:373] Connection header received
1125 16:32:01.273565 (+2999200us) negotiation.cc:304] Negotiation complete: Timed out: Server connection negotiation failed: server connection from 172.12.345.678:34488
Metrics: {"server-negotiator.queue_time_us":134,"thread_start_us":59,"threads_started":1}
W1125 16:42:01.186538 39726 service_pool.cc:130] GetTableSchema request on kudu.master.MasterService from TSERVER1:49284 dropped due to backpressure. The service queue is full; it has 50 items.
W1125 16:46:10.786643 39729 service_pool.cc:130] GetTableSchema request on kudu.master.MasterService from TSERVER2:34404 dropped due to backpressure. The service queue is full; it has 50 items.
W1125 17:40:18.836066 39726 service_pool.cc:130] GetTableSchema request on kudu.master.MasterService from TSERVER3:43694 dropped due to backpressure. The service queue is full; it has 50 items.
We would appreciate any assistance in resolving this issue.
Regards
Amn
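To see which clients and which hours account for the dropped RPCs, the master log can be tallied with a short script. This is a minimal sketch that assumes the warning format shown in the log above; the function name `count_backpressure` and the grouping by hour are illustrative choices, not part of Kudu.

```python
import re
from collections import Counter

# Matches the service_pool.cc warning format seen in the log above.
# The client part may contain spaces (e.g. a redacted host name).
BACKPRESSURE_RE = re.compile(
    r"^W\d{4} (?P<time>\d{2}:\d{2}:\d{2})\.\d+ +\d+ service_pool\.cc:\d+\] "
    r"(?P<method>\w+) request on (?P<service>[\w.]+) "
    r"from (?P<client>.+?):(?P<port>\d+) dropped due to backpressure"
)


def count_backpressure(lines):
    """Count dropped RPCs keyed by (method, client host, hour of day)."""
    counts = Counter()
    for line in lines:
        match = BACKPRESSURE_RE.match(line)
        if match:
            hour = match.group("time")[:2]
            counts[(match.group("method"), match.group("client"), hour)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_drops.py kudu-master.WARNING
    with open(sys.argv[1]) as log_file:
        for key, n in count_backpressure(log_file).most_common():
            print(n, *key)
```

A tally like this only shows where the load comes from; it does not by itself say whether the queue size or the number of callers is the thing to change.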
Labels: Apache Kudu, Cloudera Manager
11-20-2019
10:22 PM
Hello,
We have been receiving the following alert for Impala:
[The health test result for IMPALAD_QUERY_MONITORING_STATUS has become bad: There are 0 error(s) seen monitoring executing queries, and 1 errors(s) seen monitoring completed queries for this role in the previous 5 minute(s). Critical threshold: any.]
I was following the Cloudera article ( https://my.cloudera.com/knowledge/Impala--IMPALADQUERYMONITORINGSTATUS-has-become-bad?id=75434) but am unsure whether it applies to our environment (ours is CDH 5.16.2 / CM 5.16.2). Also, the two Apache JIRAs mentioned in the article are already resolved.
I have also checked, and there are no failed queries (screenshot attached).
Current settings: Query Monitoring Timeout = 1 minute / Query Monitoring Period = 1 minute.
Increasing these values as the article suggests would only postpone the alerts; I would like to know whether there is a fix for this issue.
Below is the log from one of our Impala daemon hosts.
Impalad Logs
[root@NTPDQ cloudera-scm-agent]# tail -n 100 cloudera-scm-agent.log
result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/host/ntp_monitor.py", line 38, in _subprocess_with_timeout
return subprocess_with_timeout(args, timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/subprocess_timeout.py", line 94, in subprocess_with_timeout
raise Exception("timeout with args %s" % args)
Exception: timeout with args ['ntpq', '-np']
[21/Nov/2019 17:01:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.90371108055
[21/Nov/2019 17:01:29 +0000] 5415 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_121" Java(TM) SE Runtime Environment (build 1.8.0_121-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode) for requested version .
[21/Nov/2019 17:03:18 +0000] 5415 MainThread agent WARNING Long HB processing time: 5.24657082558
[21/Nov/2019 17:05:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.19837498665
[21/Nov/2019 17:06:18 +0000] 5415 MainThread agent WARNING Long HB processing time: 5.16436910629
[21/Nov/2019 17:07:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.74405503273
[21/Nov/2019 17:07:29 +0000] 5415 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.00 min:0.06 mean:0.10 max:0.25 LIFE_MAX:45.16
[21/Nov/2019 17:10:24 +0000] 5415 MainThread agent WARNING Long HB processing time: 10.0939228535
[21/Nov/2019 17:11:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.39454698563
[21/Nov/2019 17:12:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.03050684929
[21/Nov/2019 17:13:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 5.31969809532
[21/Nov/2019 17:14:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.35476112366
[21/Nov/2019 17:15:27 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.7484540939
[21/Nov/2019 17:16:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.23533391953
[21/Nov/2019 17:17:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 23.2213561535
[21/Nov/2019 17:17:22 +0000] 5415 MainThread agent WARNING Delayed HB: 8s since last
[21/Nov/2019 17:17:37 +0000] 5415 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.00 min:0.05 mean:0.09 max:0.95 LIFE_MAX:45.16
[21/Nov/2019 17:18:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 11.9228010178
[21/Nov/2019 17:19:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 12.488642931
[21/Nov/2019 17:20:27 +0000] 5415 MainThread agent WARNING Long HB processing time: 20.2072188854
[21/Nov/2019 17:20:27 +0000] 5415 MainThread agent WARNING Delayed HB: 5s since last
[21/Nov/2019 17:21:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.10305094719
[21/Nov/2019 17:22:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.03495693207
[21/Nov/2019 17:23:00 +0000] 5415 MonitorDaemon-Reporter throttling_logger INFO (59 skipped) Descendants user CPU lower than expected for process 29048: 1419823.60001, 245602.28
[21/Nov/2019 17:23:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.17436099052
[21/Nov/2019 17:24:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.09477305412
[21/Nov/2019 17:25:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 8.74135708809
[21/Nov/2019 17:26:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 8.5287861824
[21/Nov/2019 17:27:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.24944400787
[21/Nov/2019 17:27:44 +0000] 5415 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.00 min:0.06 mean:0.07 max:0.11 LIFE_MAX:45.16
[21/Nov/2019 17:29:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 6.15325498581
[21/Nov/2019 17:30:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 8.27527880669
[21/Nov/2019 17:31:17 +0000] 5415 MainThread agent WARNING Long HB processing time: 18.4421730042
[21/Nov/2019 17:31:17 +0000] 5415 MainThread agent WARNING Delayed HB: 3s since last
[21/Nov/2019 17:31:24 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.10764193535
[21/Nov/2019 17:32:22 +0000] 5415 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_121" Java(TM) SE Runtime Environment (build 1.8.0_121-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode) for requested version .
[21/Nov/2019 17:32:22 +0000] 5415 MainThread agent WARNING Long HB processing time: 19.3404970169
[21/Nov/2019 17:32:22 +0000] 5415 MainThread agent WARNING Delayed HB: 4s since last
[21/Nov/2019 17:33:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 14.3549408913
[21/Nov/2019 17:34:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.6162810326
[21/Nov/2019 17:35:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.1593289375
[21/Nov/2019 17:36:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.8171610832
[21/Nov/2019 17:37:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 12.1159389019
[21/Nov/2019 17:37:52 +0000] 5415 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.00 min:0.06 mean:0.09 max:1.04 LIFE_MAX:45.16
[21/Nov/2019 17:38:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.9521288872
[21/Nov/2019 17:39:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 12.0477230549
[21/Nov/2019 17:39:58 +0000] 5415 Monitor-HostMonitor throttling_logger ERROR (2 skipped) Timeout with args ['ntpq', '-np']
None
[21/Nov/2019 17:39:58 +0000] 5415 Monitor-HostMonitor throttling_logger ERROR (2 skipped) Failed to collect NTP metrics
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/host/ntp_monitor.py", line 48, in collect
self.collect_ntpd()
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/host/ntp_monitor.py", line 66, in collect_ntpd
result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/host/ntp_monitor.py", line 38, in _subprocess_with_timeout
return subprocess_with_timeout(args, timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/subprocess_timeout.py", line 94, in subprocess_with_timeout
raise Exception("timeout with args %s" % args)
Exception: timeout with args ['ntpq', '-np']
[21/Nov/2019 17:40:29 +0000] 5415 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 901, in _get_completed_query_profiles
self._query_monitor.get_completed_queries(query_log_file)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 581, in get_completed_queries
completed_query_report_limit)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 456, in get_completed_queries
last_accessed_file_timestamp)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 265, in _get_completed_queries
for event in streamer.stream():
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_streamer.py", line 134, in stream
file_list = self.__sort_file_list(self.__filtered_file_list, ascending)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_streamer.py", line 162, in __sort_file_list
handle = f.open_log_file()
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/file.py", line 41, in open_log_file
handle = open(self.__path)
IOError: [Errno 2] No such file or directory: '/var/log/impalad/profiles/impala_profile_log_1.1-1574308887326'
[21/Nov/2019 17:40:29 +0000] 5415 MainThread agent WARNING Long HB processing time: 22.2102739811
[21/Nov/2019 17:40:29 +0000] 5415 MainThread agent WARNING Delayed HB: 7s since last
[21/Nov/2019 17:41:24 +0000] 5415 MainThread agent WARNING Long HB processing time: 25.2082297802
[21/Nov/2019 17:41:24 +0000] 5415 MainThread agent WARNING Delayed HB: 10s since last
[21/Nov/2019 17:42:23 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.1406769753
[21/Nov/2019 17:43:21 +0000] 5415 MainThread agent WARNING Long HB processing time: 11.081389904
[21/Nov/2019 17:44:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 10.6936099529
[21/Nov/2019 17:45:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.0398440361
[21/Nov/2019 17:46:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.3374710083
[21/Nov/2019 17:47:18 +0000] 5415 MainThread agent WARNING Long HB processing time: 8.1221549511
[21/Nov/2019 17:47:55 +0000] 5415 MainThread heartbeat_tracker INFO HB stats (seconds): num:39 LIFE_MIN:0.00 min:0.06 mean:0.08 max:0.24 LIFE_MAX:45.16
[21/Nov/2019 17:48:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.35701394081
[21/Nov/2019 17:49:17 +0000] 5415 MainThread agent WARNING Long HB processing time: 7.75842308998
[21/Nov/2019 17:50:23 +0000] 5415 MainThread agent WARNING Long HB processing time: 13.1954960823
[21/Nov/2019 17:51:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.66306400299
[21/Nov/2019 17:52:20 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.91945910454
[21/Nov/2019 17:53:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.45511198044
[21/Nov/2019 17:54:19 +0000] 5415 MainThread agent WARNING Long HB processing time: 9.34867501259
Thanks
Amn
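The agent log above is dominated by "Long HB processing time" warnings, and summarizing them makes it easier to see how slow the heartbeats are around the time of the query monitoring error. This is a minimal sketch that assumes the line format shown in the log above; the function name `summarize_long_heartbeats` is hypothetical.

```python
import re
from statistics import mean

# Matches the "Long HB processing time" warnings in cloudera-scm-agent.log
# as they appear in the excerpt above.
LONG_HB_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] +\d+ +MainThread +agent +WARNING +"
    r"Long HB processing time: (?P<seconds>\d+(?:\.\d+)?)"
)


def summarize_long_heartbeats(lines):
    """Return count, mean and max of reported heartbeat processing times."""
    durations = []
    for line in lines:
        match = LONG_HB_RE.match(line)
        if match:
            durations.append(float(match.group("seconds")))
    if not durations:
        return None
    return {
        "count": len(durations),
        "mean": mean(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    import sys

    # Usage: python hb_summary.py cloudera-scm-agent.log
    with open(sys.argv[1]) as log_file:
        print(summarize_long_heartbeats(log_file))
```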
Labels: Apache Impala, Cloudera Manager
11-18-2019
12:51 AM
All my tablet servers show as Healthy, and the Summary by Table also shows them as Healthy. Nothing is listed under 'Recovering / Under-Replicated / Unavailable'.
11-17-2019
11:42 PM
@awong I checked all 9 tablet servers and all are done bootstrapping. I see the same results as in the screenshot I posted previously; the numbers are different, but everything is at 100%.
Regards
Amn
11-17-2019
11:29 PM
Hi Awong,
Thanks for the quick reply. The following is what I see. Based on the screenshot, my understanding is that the bootstrap process has completed, as it says 100%. Also, when I click the Details toggle, Last Status shows either "Bootstrap complete." or "No bootstrap required, opened a new log" for the corresponding table name. When I check the logs, I still see the same error as before. Is there anything else I can check?
Regards
Amn
11-17-2019
11:01 PM
Hi All,
I am getting the error below on one of my Kudu tablet servers. I have restarted the tablet server service on this host, yet the error persists:
W1118 19:43:31.815698 33067 consensus_peers.cc:435] T 3292e490cf4843d994a45f9a4c7782c0 P cc36320dd81646d081a24203751c2a6a -> Peer 164c8bcafccc4fd0adfb6dfe7a2ff60e (MYSERVER.com:7050): Couldn't send request to peer 164c8bcafccc4fd0adfb6dfe7a2ff60e for tablet 3292e490cf4843d994a45f9a4c7782c0. Error code: TABLET_NOT_RUNNING (12). Status: Illegal state: Tablet not RUNNING: INITIALIZED. Retrying in the next heartbeat period. Already tried 389 times.
Any help is much appreciated.
Regards
Amn
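When many of these warnings are logged, it helps to extract which tablet and which remote peer each one refers to, so the affected replicas can be looked up in the tablet server web UI. This is a minimal sketch that assumes the consensus_peers.cc message format shown in the warning above; the function name `parse_peer_warning` and the returned field names are hypothetical.

```python
import re

# Pulls the identifying fields out of the consensus_peers.cc warning
# in the format shown in the post above.
PEER_WARNING_RE = re.compile(
    r"T (?P<tablet>[0-9a-f]+) P (?P<local_peer>[0-9a-f]+) "
    r"-> Peer (?P<remote_peer>[0-9a-f]+) \((?P<remote_addr>[^)]+)\):"
    r".*?Error code: (?P<error_code>\w+) \((?P<error_num>\d+)\)\. "
    r"Status: (?P<status>.*?)\. Retrying"
    r".*?Already tried (?P<attempts>\d+) times"
)


def parse_peer_warning(line):
    """Return a dict of fields from one warning line, or None if no match."""
    match = PEER_WARNING_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["error_num"] = int(fields["error_num"])
    fields["attempts"] = int(fields["attempts"])
    return fields
```

Grouping the parsed results by `tablet` and `remote_addr` shows whether the failures are confined to one replica on one host or spread across several.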
Labels: Apache Kudu
10-06-2019
10:56 PM
Hello All,
We are facing an issue with one of our client's clusters: we see some files over 2 GB and would like to know their purpose, and whether they can be deleted to free up space.
1. Eventserver
[root@TestBed ~]# du -sh /var/lib/cloudera-scm-eventserver/v3/* |grep G
5.4G /var/lib/cloudera-scm-eventserver/v3/_1bx6l.fdt
1.2G /var/lib/cloudera-scm-eventserver/v3/_1hgcy.fdt
1.4G /var/lib/cloudera-scm-eventserver/v3/_1mz17.fdt
2. Cloudera-Scm-Headlamp
[root@TestBed ~]# du -sh /var/lib/cloudera-scm-headlamp/hdfs/nameservice1/index/* |grep G
2.3G /var/lib/cloudera-scm-headlamp/hdfs/nameservice1/index/_2t.fdt
2.2G /var/lib/cloudera-scm-headlamp/hdfs/nameservice1/index/_5m.fdt
2.1G /var/lib/cloudera-scm-headlamp/hdfs/nameservice1/index/_8f.fdt
Also, is there a way to limit the time-series (TS) data in Service Monitor (SMON) and Host Monitor (HMON)? I understand that 10 GB is the minimum requirement for these roles, but what can be done if a client has limited resources?
Any help / guidance is appreciated.
Thanks
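The `du ... | grep G` check above matches any line containing the letter G, so a size-based listing can be more precise. The following is a minimal sketch; the function name `files_larger_than` is hypothetical, and it reports apparent file size (`st_size`) rather than the disk usage that `du` reports, so the numbers can differ slightly.

```python
from pathlib import Path


def files_larger_than(root, min_bytes):
    """Return (size_in_bytes, path) pairs under root, largest first."""
    results = []
    for path in Path(root).rglob("*"):
        try:
            if path.is_file():
                size = path.stat().st_size
                if size >= min_bytes:
                    results.append((size, str(path)))
        except OSError:
            # Files can disappear or be unreadable while scanning; skip them.
            continue
    return sorted(results, reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python big_files.py /var/lib/cloudera-scm-eventserver/v3
    one_gib = 1024 ** 3
    for size, name in files_larger_than(sys.argv[1], one_gib):
        print(f"{size / one_gib:.1f}G {name}")
```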
Labels: Cloudera Manager
09-04-2019
10:07 PM
Hi Eric,
Thanks for your reply. After reading it, I have more questions than answers 🙂
1) Is there a way to find out whether an Impala daemon can handle additional connections (raising the default of 64 to 128 or more)?
2) Can we find the number of connections being made in real time (apart from the Active Frontend API Connections chart)?
3) How are connections closed (i.e., are they closed once an Impala query completes, or is there some other process)?
4) Are there any best-practice recommendations for the swap memory setting for Impala?
Regards
Amn
08-27-2019
02:20 AM
Hello All,
We have been receiving alerts that IMPALAD_FRONTEND_CONNECTIONS has become bad:
"There are 0 (Beeswax pool) 55 (Hive Server 2 pool) active client connections, each pool has a configured maximum of 64"
Impala Daemon Concurrent Client Connections Monitoring Percentage Thresholds: Warning = 80, Critical = 95
Impala Daemon Max Client Connections = 64
CM - 5.16.2
CDH - 5.16.2
Will increasing Impala Daemon Max Client Connections reduce or stop these alerts? Any help is most welcome.
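The arithmetic behind the thresholds quoted above can be checked directly: 55 of 64 connections is about 85.9%, which is above the 80% warning threshold and below the 95% critical one. This is a minimal sketch; the function name `connection_alert_level` is hypothetical, and treating the thresholds as inclusive is an assumption rather than documented Cloudera Manager behavior.

```python
def connection_alert_level(active, max_connections,
                           warning_pct=80.0, critical_pct=95.0):
    """Return (percent used, level) for one client connection pool.

    Assumption: a threshold is crossed when usage is greater than or
    equal to it; Cloudera Manager's exact comparison may differ.
    """
    pct = 100.0 * active / max_connections
    if pct >= critical_pct:
        level = "critical"
    elif pct >= warning_pct:
        level = "warning"
    else:
        level = "ok"
    return pct, level


if __name__ == "__main__":
    for maximum in (64, 128):
        pct, level = connection_alert_level(55, maximum)
        print(f"55 of {maximum}: {pct:.1f}% -> {level}")
```

With the same 55 active connections and a maximum of 128, usage drops to about 43%, so raising the limit moves the percentage below both thresholds; it does not say whether the daemon has the resources to serve that many connections.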
Labels: Apache Hive, Apache Impala, Cloudera Manager