About mike_bronson7

mike_bronson7 · ‎08-20-2024

we have cluster with 12 Kafka machines and 3 zookeeper servers on Linux servers - Kafka version is 2.7 version , ( broker and controller are co-hosted on the same PID ) as known Kafka have 2 important logs and they are **server.log** and **controller.log** about **controller.log** , when we look on this log we can see the following words - "`Shutdown completed`" in the log [2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread) the first thinking when we see the messages about "`Shutdown completed`" - is like this message is "bad" and why controller stopped ... but when we look on all machines most of the machines have this message as - `Shutdown completed` (`kafka.controller.ControllerEventManager$ControllerEventThread`) **but on other hand** only one controller should be active from all brokers and maybe the messages as "`Shutdown completed`" are only indicate that controllers that are not active are in standby state and therefore are in state of - `Shutdown completed` ? for example - here one of the log from one broker machine [2024-08-20 21:23:18,084] DEBUG [Controller id=1001] Broker 1007 was elected as controller instead of broker 1001 (kafka.controller.KafkaController) org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure [2024-08-20 21:33:51,281] DEBUG [Controller id=1001] Broker 1005 was elected as controller instead of broker 1001 (kafka.controller.KafkaController) org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure [2024-08-20 21:42:01,581] INFO [ControllerEventThread controllerId=1001] Shutting down (kafka.controller.ControllerEventManager$ControllerEventThread) [2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread) [2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Stopped (kafka.controller.ControllerEventManager$ControllerEventThread) [2024-08-20 21:42:01,582] DEBUG [Controller id=1001] Resigning (kafka.controller.KafkaController) [2024-08-20 21:42:01,583] DEBUG [Controller id=1001] Unregister BrokerModifications handler for Set() (kafka.controller.KafkaController) [2024-08-20 21:42:01,604] INFO [PartitionStateMachine controllerId=1001] Stopped partition state machine (kafka.controller.ZkPartitionStateMachine) [2024-08-20 21:42:01,608] INFO [ReplicaStateMachine controllerId=1001] Stopped replica state machine (kafka.controller.ZkReplicaStateMachine) [2024-08-20 21:42:01,608] INFO [Controller id=1001] Resigned (kafka.controller.KafkaController) [2024-08-20 21:43:45,196] INFO [ControllerEventThread controllerId=1001] Starting (kafka.controller.ControllerEventManager$ControllerEventThread) [2024-08-20 21:43:45,208] DEBUG [Controller id=1001] Broker 1005 has been elected as the controller, so stopping the election process. (kafka.controller.KafkaController) [2024-08-20 21:52:28,400] DEBUG [Controller id=1001] Broker 1001 was elected as controller instead of broker 1001 (kafka.controller.KafkaController) org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure <---- LOG IS ENDED HERE so the question is - can we ignore the messages as `INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread)` of maybe something is wrong with the Kafka controller ?

mike_bronson7 · ‎03-20-2024

thank you for response but look on that also 2024-03-18 19:31:52,673 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:756ms (threshold=300ms), volume=/data/sde/hadoop/hdfs/data 2024-03-18 19:35:15,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdc/hadoop/hdfs/data 2024-03-18 19:51:57,774 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:375ms (threshold=300ms), volume=/data/sdb/hadoop/hdfs/data As you can see the warning is also on local disks not only across the network In any case we already checked the network include the switches and we not found a problem Do you think its could be tuning issue in hdfs parameters or some parameters that can help

mike_bronson7 · ‎03-19-2024

We have Hadoop cluster with `487` data-nodes machines ( each data-node machine include also the Service node-manager ) , all machines are physical machines ( DELL ) , and OS is RHEL 7.9 version. Each data-node machine have 12 disks, each disk is with size of 12T Hadoop cluster type installed from HDP packages ( previously was under Horton-works and now under Cloudera ) Users are complain about slowness from spark applications that run on data-nodes machines And after investigation we seen the following warning from data-node logs 2024-03-18 17:41:30,230 WARN datanode.DataNode (BlockReceiver.java:receivePacket(567)) - Slow BlockReceiver write packet to mirror took 401ms (threshold=300ms), downstream DNs=[172.87.171.24:50010, 172.87.171.23:50010] 2024-03-18 17:41:49,795 WARN datanode.DataNode (BlockReceiver.java:receivePacket(567)) - Slow BlockReceiver write packet to mirror took 410ms (threshold=300ms), downstream DNs=[172.87.171.26:50010, 172.87.171.31:50010] 2024-03-18 18:06:29,585 WARN datanode.DataNode (BlockReceiver.java:receivePacket(567)) - Slow BlockReceiver write packet to mirror took 303ms (threshold=300ms), downstream DNs=[172.87.171.34:50010, 172.87.171.22:50010] 2024-03-18 18:18:55,931 WARN datanode.DataNode (BlockReceiver.java:receivePacket(567)) - Slow BlockReceiver write packet to mirror took 729ms (threshold=300ms), downstream DNs=[172.87.11.27:50010] from above log we can see the `warning Slow BlockReceiver write packet to mirror took xxms` and also the data-nodes machines as `172.87.171.23,172.87.171.24` etc. from my understanding the exceptions as Slow `BlockReceiver write packet to mirror` indicate maybe on delay in writing the block to OS cache or disk So I am trying to collect the reasons for this warning / exceptions , and here there are 1. delay in writing the block to OS cache or disk 2. cluster is as or near its resources limit ( memory , CPU or disk ) 3. network issues between machines From my verification I not see **disk** or **CPU** or **memory** problem , we checked all machines From network side I not see special issues that relevant to machines itself And we also used the iperf3 ro check the Bandwidth between one machine to other. here is example between `data-node01` to `data-node03` ( from my understanding and please Correct me if I am wrong looks like Bandwidth is ok ) From data-node01 iperf3 -i 10 -s [ ID] Interval Transfer Bandwidth [ 5] 0.00-10.00 sec 7.90 GBytes 6.78 Gbits/sec [ 5] 10.00-20.00 sec 8.21 GBytes 7.05 Gbits/sec [ 5] 20.00-30.00 sec 7.25 GBytes 6.23 Gbits/sec [ 5] 30.00-40.00 sec 7.16 GBytes 6.15 Gbits/sec [ 5] 40.00-50.00 sec 7.08 GBytes 6.08 Gbits/sec [ 5] 50.00-60.00 sec 6.27 GBytes 5.39 Gbits/sec [ 5] 60.00-60.04 sec 35.4 MBytes 7.51 Gbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth [ 5] 0.00-60.04 sec 0.00 Bytes 0.00 bits/sec sender [ 5] 0.00-60.04 sec 43.9 GBytes 6.28 Gbits/sec receiver From data-node03 iperf3 -i 1 -t 60 -c 172.87.171.84 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 792 MBytes 6.64 Gbits/sec 0 3.02 MBytes [ 4] 1.00-2.00 sec 834 MBytes 6.99 Gbits/sec 54 2.26 MBytes [ 4] 2.00-3.00 sec 960 MBytes 8.05 Gbits/sec 0 2.49 MBytes [ 4] 3.00-4.00 sec 896 MBytes 7.52 Gbits/sec 0 2.62 MBytes [ 4] 4.00-5.00 sec 790 MBytes 6.63 Gbits/sec 0 2.70 MBytes [ 4] 5.00-6.00 sec 838 MBytes 7.03 Gbits/sec 4 1.97 MBytes [ 4] 6.00-7.00 sec 816 MBytes 6.85 Gbits/sec 0 2.17 MBytes [ 4] 7.00-8.00 sec 728 MBytes 6.10 Gbits/sec 0 2.37 MBytes [ 4] 8.00-9.00 sec 692 MBytes 5.81 Gbits/sec 47 1.74 MBytes [ 4] 9.00-10.00 sec 778 MBytes 6.52 Gbits/sec 0 1.91 MBytes [ 4] 10.00-11.00 sec 785 MBytes 6.58 Gbits/sec 48 1.57 MBytes [ 4] 11.00-12.00 sec 861 MBytes 7.23 Gbits/sec 0 1.84 MBytes [ 4] 12.00-13.00 sec 844 MBytes 7.08 Gbits/sec 0 1.96 MBytes Note - Nic card/s are with `10G` speed ( we checked this by ethtool ) We also checked the firmware-version of the NIC card ethtool -i p1p1 driver: i40e version: 2.8.20-k firmware-version: 8.40 0x8000af82 20.5.13 expansion-rom-version: bus-info: 0000:3b:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: yes We also checked from kernel messages ( `dmesg` ) but no seen something special.

mike_bronson7 · ‎02-21-2024

we have Hadoop cluster with active/stand by resource manager services the active resource manager is on master1 machine and the stand by resource manager is on master2 machine in our cluster YARN service that include both resource manager services is managing 276 node manager component on workers machines from Ambari WEB UI alerts ( Alerts for Resource Manager ) we notice about the following Resource Manager Web UI Connection failed to http://master2.jupiter.com:8088(timed out) we start to debug the issue by wget with port 8088 , and we found that process is hang on - HTTP request sent, `awaiting response... No data received`. example from resource manager machine wget --debug http://master2.jupiter.com:8088 DEBUG output created by Wget 1.14 on Linux-gnu. URI encoding = ‘UTF-8’ Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) --2024-02-21 10:13:42-- http://master2` .jupiter.com:8088/ Resolving master2.jupiter.com (master2.jupiter.com)... 192.9.201.169 Caching master2.jupiter.com => 192.9.201.169 Connecting to master2.jupiter.com (master2.jupiter.com)|192.9.201.169|:8088... connected. Created socket 3. Releasing 0x0000000000a0da00 (new refcount 1). ---request begin--- GET / HTTP/1.1 User-Agent: Wget/1.14 (linux-gnu) Accept: */* Host: master2.jupiter.com:8088 Connection: Keep-Alive ---request end--- HTTP request sent, awaiting response... ---response begin--- HTTP/1.1 307 TEMPORARY_REDIRECT Cache-Control: no-cache Expires: Wed, 21 Feb 2024 10:13:42 GMT Date: Wed, 21 Feb 2024 10:13:42 GMT Pragma: no-cache Expires: Wed, 21 Feb 2024 10:13:42 GMT Date: Wed, 21 Feb 2024 10:13:42 GMT Pragma: no-cache Content-Type: text/plain; charset=UTF-8 X-Frame-Options: SAMEORIGIN Location: http://master1.jupiter.com:8088/ Content-Length: 43 Server: Jetty(6.1.26.hwx) ---response end--- 307 TEMPORARY_REDIRECT Registered socket 3 for persistent reuse. URI content encoding = ‘UTF-8’ Location: http://master1.jupiter.com:8088/ [following] Skipping 43 bytes of body: [This is standby RM. The redirect url is: / ] done. URI content encoding = None Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) --2024-02-21 10:13:42-- http://master1.jupiter.com:8088/ conaddr is: 192.9.201.169 Resolving master1.jupiter.com (master1.jupiter.com)... 192.9.66.14 Caching master1.jupiter.com => 192.9.66.14 Releasing 0x0000000000a0f320 (new refcount 1). Found master1.jupiter.com in host_name_addresses_map (0xa0f320) Connecting to master1.jupiter.com (master1.jupiter.com)|192.9.66.14|:8088... connected. Created socket 4. Releasing 0x0000000000a0f320 (new refcount 1). . . . ---response end--- 302 Found Disabling further reuse of socket 3. Closed fd 3 Registered socket 4 for persistent reuse. URI content encoding = ‘UTF-8’ Location: http://master1.jupiter.com:8088/cluster [following] ] done. URI content encoding = None Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8) --2024-02-21 10:27:07-- http://master1.jupiter.com:8088/cluster Reusing existing connection to master1.jupiter.com:8088. Reusing fd 4. ---request begin--- GET /cluster HTTP/1.1 User-Agent: Wget/1.14 (linux-gnu) Accept: */* Host: master1.jupiter.com:8088 Connection: Keep-Alive ---request end--- HTTP request sent, awaiting response... ---response begin--- HTTP/1.1 200 OK Cache-Control: no-cache Expires: Wed, 21 Feb 2024 10:30:23 GMT Date: Wed, 21 Feb 2024 10:30:23 GMT Pragma: no-cache Expires: Wed, 21 Feb 2024 10:30:23 GMT Date: Wed, 21 Feb 2024 10:30:23 GMT Pragma: no-cache Content-Type: text/html; charset=utf-8 X-Frame-Options: SAMEORIGIN Transfer-Encoding: chunked Server: Jetty(6.1.26.hwx) ---response end--- 200 OK URI content encoding = ‘utf-8’ Length: unspecified [text/html] Saving to: ‘index.html’ [ <=> ] 1,018,917 --.-K/s in 0.04s 2024-02-21 10:31:31 (24.0 MB/s) - ‘index.html’ saved [1018917] as we can see above wget completed after very long time around ~ 20 min instead to completed the process in one or two second we can take tcpdump as tcpdump -vv -s0 tcp port 8088 -w /tmp/why_8088_hang.pcap but I want to understand if there are better simple ways to understand why we get HTTP request sent, awaiting response... , and maybe its related to resource manager service

mike_bronson7 · ‎02-15-2024

We have HDP cluster with 152 workers machines - `worker1.duplex.com` .. `worker152.duplex.com` , While all machines are installed on RHEL 7.9 version We are trying to delete the last host - `worker152.duplex.com` from Ambari server or actually from PostgreSQL DB as the following First we need to find the `host_id` select host_id from hosts where host_name='worker152.duplex.com'; and host_id is: host_id --------- 51 (1 row) Now we are deletion this `host_id` - 51 delete from execution_command where task_id in (select task_id from host_role_command where host_id in (51)); delete from host_version where host_id in (51); delete from host_role_command where host_id in (51); delete from serviceconfighosts where host_id in (51); delete from hoststate where host_id in (51); delete from kerberos_principal_host WHERE host_id='worker152.duplex.com'; delete from hosts where host_name in ('worker152.duplex.com'); delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('worker152.duplex.com')); Now we verify that `host_id` - 51 that represented the host - `worker152.duplex.com` isn't exists By the following verification ambari=> select host_name, public_host_name from hosts; host_name | public_host_name --------------------------+-------------------------- worker1.duplex.com . . . worker151.duplex.com As we can see above the host `worker151.duplex.com` not exist and that's fine , and indeed seems That host - `worker151.duplex.com` was deleted from PostgreSQL DB Now we restarting the `Ambari-server` in order to take affect ( its also restart the PostgreSQL service ) ambari-server restart Using python /usr/bin/python Restarting ambari-server Waiting for server stop... Ambari Server stopped Ambari Server running with administrator privileges. Organizing resource files at /var/lib/ambari-server/resources... Ambari database consistency check started... Server PID at: /var/run/ambari-server/ambari-server.pid Server out at: /var/log/ambari-server/ambari-server.out Server log at: /var/log/ambari-server/ambari-server.log Waiting for server start......................... Server started listening on 8080 DB configs consistency check: no errors and warnings were found. After Ambari server started , we are surprised because the `host_id` - 51 or host - `worker152.duplex.com` , is still exist as the following ambari=> select host_name, public_host_name from hosts; host_name | public_host_name --------------------------+-------------------------- worker1.duplex.com . . . worker152.duplex.com We not understand why this host back again in spite we delete this record We also tried to delete historical data by the following but this isn't help ambari-server db-purge-history --cluster-name hadoop7 --from-date 2024-01-01 Using python /usr/bin/python Purge database history... Ambari Server configured for Embedded Postgres. Confirm you have made a backup of the Ambari Server database [y/n]yes ERROR: The database purge historical data cannot proceed while Ambari Server is running. Please shut down Ambari first. Ambari Server 'db-purge-history' completed successfully. 1. Why host returned after `Ambari-server` restart ? 2. what is wrong with out deletion process? PostgreSQL Version: postgres=# SHOW server_version; server_version ---------------- 9.2.24 (1 row) links: https://www.andruffsolutions.com/removing-old-host-data-from-ambari-server-and-tuning-the-database/ https://community.cloudera.com/t5/Support-Questions/how-to-remove-old-registered-hosts-from-DB/m-p/217524/highlight/true

mike_bronson7 · ‎02-04-2024

you can balance the data-node disks usage by decommission and recommission , but if you have only 2 data-nodes then its a problem better to do it at least 3 data-nodes in cluster

mike_bronson7 · ‎02-04-2024

lets say I copy the fsimage from active to standby namenode and then still we have a problem to start the namenode then can I do the steps as already mentioned?

mike_bronson7 · ‎02-03-2024

we have HDP Hadoop cluster with two name-node services ( one active name-node and the secondary is the standby name-node ) due of unexpected electricity failure , the standby name-node failed to start with the flowing exception , while the active name-node starting successfully 2024-02-02 08:47:11,497 INFO common.Storage (Storage.java:tryLock(776)) - Lock on /hadoop/hdfs/namenode/in_use.lock acquired by nodename 36146@master1.delax.com 2024-02-02 08:47:11,891 INFO namenode.FSImage (FSImage.java:loadFSImageFile(745)) - Planning to load image: FSImageFile(file=/hadoop/hdfs/namenode/current/fsimage_0000000052670667141, cpktTxId=0000000052670667141) 2024-02-02 08:47:11,897 ERROR namenode.FSImage (FSImage.java:loadFSImage(693)) - Failed to load image from FSImageFile(file=/hadoop/hdfs/namenode/current/fsimage_0000000052670667141, cpktTxId=0000000052670667141) java.io.IOException: Premature EOF from inputStream at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:204) at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:221) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:898) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:882) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:755) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:686) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778) 2024-02-02 08:47:12,238 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(726)) - Encountered exception loading fsimage java.io.IOException: Failed to load FSImage file, see error(s) above for more info. we can see from above exception - `Failed to load image from FSImageFile` , and seems it is as results of when machine failed because unexpected shutdown as I understand one of the options to recover the standby name-node could be with the following procedure: 1. Put Active NN in safemode sudo -u hdfs hdfs dfsadmin -safemode enter 2. Do a savenamespace operation on Active NN sudo -u hdfs hdfs dfsadmin -saveNamespace 3. Leave Safemode sudo -u hdfs hdfs dfsadmin -safemode leave 4. Login to Standby NN 5. Run below command on Standby namenode to get latest fsimage that we saved in above steps. sudo -u hdfs hdfs namenode -bootstrapStandby -force we glad to receive any suggestions , or if my above suggestion is good enough for our problem

mike_bronson7 · ‎02-03-2024

is the following procedure can help also? Put Active NN in safemode sudo -u hdfs hdfs dfsadmin -safemode enter Do a savenamespace operation on Active NN sudo -u hdfs hdfs dfsadmin -saveNamespace Leave Safemode sudo -u hdfs hdfs dfsadmin -safemode leave Login to Standby NN Run below command on Standby namenode to get latest fsimage that we saved in above steps. sudo -u hdfs hdfs namenode -bootstrapStandby -force

mike_bronson7 · ‎02-22-2023

for now we have 15 Kafka machines in the cluster , all machines are are installed with RHEL 7.9 and the HW machine is DELL physical machine Kafka version is 2.7 , and we have 3 zookeeper servers that serve the Kafka cluster we decided to extend the Kafka cluster to ~100 machines , because Total Throughput In Megabytes increased dramatic - note according to Kafka confluent calculator we need around 100 Kafka machines in that case I am wonder if our 3 zookeeper servers are enough to serve this huge cluster machines? addition I want to say that our 3 zookeeper servers are already serve other application as HDFS , YARN , HIVE , spark etc.

Online	Offline
Last Visited	‎08-27-2024 09:17 AM

Member Since	‎08-08-2017 09:40 AM
Last Visited	‎08-27-2024 09:17 AM
Posts	1,652
Kudos received	29

Cloudera Community

Re: how to find number of CPU core on datanode ma...

Re: postgresql + ambari server failed to open port...

Re: how to stop the thrift servers by REST API

Re: namenode is in safe mode

Re: Directory /grid/sdg/hadoop/hdfs/data became un...

kafka + undertsanding better the kafka controller....

Re: Hadoop + warnings as slow block-receive from d...

Hadoop + warnings as slow block-receive from data-...

YARN resource manager "HTTP request sent, awaiting...

PostgreSQL + Removing old Host from DB not succeed...

Re: hdfs balancer is not working

Re: hadoop cluster + Unable to start standby Namen...

hadoop cluster + Unable to start standby Namenode

Re: hadoop cluster with active standby namenode + ...

how many zookeeper servers need to serve ~100 Kafk...