Member since: 08-08-2017
Posts: 1652
Kudos Received: 29
Solutions: 11

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1429 | 06-15-2020 05:23 AM
 | 9074 | 01-30-2020 08:04 PM
 | 1585 | 07-07-2019 09:06 PM
 | 6515 | 01-27-2018 10:17 PM
 | 3710 | 12-31-2017 10:12 PM
08-20-2024
02:37 PM
We have a cluster with 12 Kafka machines and 3 ZooKeeper servers on Linux. The Kafka version is 2.7 (broker and controller are co-hosted in the same PID).

As is known, Kafka has two important logs: **server.log** and **controller.log**.

About **controller.log**: when we look at this log we can see the words "`Shutdown completed`", for example:

[2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread)

Our first thought on seeing "`Shutdown completed`" is that the message is "bad", and we wonder why the controller stopped. But when we look at all the machines, most of them have this message:

`Shutdown completed` (`kafka.controller.ControllerEventManager$ControllerEventThread`)

**On the other hand**, only one controller should be active among all the brokers, so maybe the "`Shutdown completed`" messages only indicate that the controllers that are not active are in standby state, and that is why they show "`Shutdown completed`"?

For example, here is the log from one broker machine:

[2024-08-20 21:23:18,084] DEBUG [Controller id=1001] Broker 1007 was elected as controller instead of broker 1001 (kafka.controller.KafkaController)
org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure
[2024-08-20 21:33:51,281] DEBUG [Controller id=1001] Broker 1005 was elected as controller instead of broker 1001 (kafka.controller.KafkaController)
org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure
[2024-08-20 21:42:01,581] INFO [ControllerEventThread controllerId=1001] Shutting down (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-08-20 21:42:01,582] INFO [ControllerEventThread controllerId=1001] Stopped (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-08-20 21:42:01,582] DEBUG [Controller id=1001] Resigning (kafka.controller.KafkaController)
[2024-08-20 21:42:01,583] DEBUG [Controller id=1001] Unregister BrokerModifications handler for Set() (kafka.controller.KafkaController)
[2024-08-20 21:42:01,604] INFO [PartitionStateMachine controllerId=1001] Stopped partition state machine (kafka.controller.ZkPartitionStateMachine)
[2024-08-20 21:42:01,608] INFO [ReplicaStateMachine controllerId=1001] Stopped replica state machine (kafka.controller.ZkReplicaStateMachine)
[2024-08-20 21:42:01,608] INFO [Controller id=1001] Resigned (kafka.controller.KafkaController)
[2024-08-20 21:43:45,196] INFO [ControllerEventThread controllerId=1001] Starting (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-08-20 21:43:45,208] DEBUG [Controller id=1001] Broker 1005 has been elected as the controller, so stopping the election process. (kafka.controller.KafkaController)
[2024-08-20 21:52:28,400] DEBUG [Controller id=1001] Broker 1001 was elected as controller instead of broker 1001 (kafka.controller.KafkaController)
org.apache.kafka.common.errors.ControllerMovedException: Controller moved to another broker. Aborting controller startup procedure <---- LOG ENDS HERE

So the question is: can we ignore messages such as `INFO [ControllerEventThread controllerId=1001] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread)`, or is something wrong with the Kafka controller?
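One way to confirm the assumption that only one controller is active (a minimal sketch, assuming a ZooKeeper-based Kafka 2.7 install; the ZooKeeper host and port below are placeholders):

```bash
# The /controller znode stores the broker id of the single active controller.
# All other brokers lose the election, so their controller event thread is
# stopped, which is what the "Shutdown completed" lines in controller.log show.
bin/zookeeper-shell.sh zk-host.example.com:2181 get /controller

# Example (illustrative) output:
# {"version":1,"brokerid":1005,"timestamp":"1724182625196"}
```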
Labels:
- Apache Kafka
04-21-2024
11:30 PM
Hi @mike_bronson7, did you try to delete the host from the Ambari UI? If not, please try that first. Otherwise, you might need to stop the Ambari server, delete the entries from the backend DB, commit the changes, and then start your Ambari server again. But I would recommend deleting it from the Ambari UI first and checking.
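If the UI route is not enough, here is a minimal sketch of the Ambari REST call that removes a host (the cluster name, host name, credentials, and port are placeholder assumptions, and any components on the host normally have to be stopped and deleted first):

```bash
# Remove a host from the cluster via the Ambari REST API.
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE \
  "http://ambari-server.example.com:8080/api/v1/clusters/MY_CLUSTER/hosts/host-to-remove.example.com"
```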
04-17-2024
03:10 PM
1 Kudo
I was looking for the same info and found the great link below. https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html I hope it can help you. Best,
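For reference, a minimal DistCp invocation along the lines of that documentation (the NameNode addresses and paths are placeholders):

```bash
# Copy /data/source from one HDFS cluster to /data/target on another;
# DistCp runs the copy as a MapReduce job.
hadoop distcp hdfs://nn1.example.com:8020/data/source hdfs://nn2.example.com:8020/data/target

# -update copies only files that are missing or changed at the target.
hadoop distcp -update hdfs://nn1.example.com:8020/data/source hdfs://nn2.example.com:8020/data/target
```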
04-16-2024
11:59 PM
1 Kudo
Hi @mike_bronson7, please check whether the RM is working fine or is down. Please also check your ZooKeeper. Please also check with telnet whether you can connect to the RM host from another host.
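A minimal sketch of those checks (the host names are placeholders; 8032 is the default yarn.resourcemanager.address port, and the rm1/rm2 ids assume ResourceManager HA is configured):

```bash
# Can we reach the ResourceManager port from another host?
telnet rm-host.example.com 8032

# With ResourceManager HA, check which RM is active and which is standby.
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```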
03-20-2024
11:46 PM
1 Kudo
Thank you for the response, but please also look at this:

2024-03-18 19:31:52,673 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:756ms (threshold=300ms), volume=/data/sde/hadoop/hdfs/data
2024-03-18 19:35:15,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdc/hadoop/hdfs/data
2024-03-18 19:51:57,774 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:375ms (threshold=300ms), volume=/data/sdb/hadoop/hdfs/data

As you can see, the warning also appears for local disks, not only across the network. In any case, we already checked the network, including the switches, and we did not find a problem. Do you think it could be a tuning issue with HDFS parameters, or are there parameters that could help?
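For what it is worth, the 300 ms in those warnings is the DataNode's slow disk-write warning threshold. A minimal sketch of how one might inspect it and, if the disks are judged healthy, raise it in hdfs-site.xml (raising the threshold only quiets the warnings, it does not make the disks faster):

```bash
# Show the currently effective threshold (the default is usually 300 ms).
hdfs getconf -confKey dfs.datanode.slow.io.warning.threshold.ms

# To raise it, set the property in hdfs-site.xml (e.g. via Ambari) and restart the DataNodes:
#   <property>
#     <name>dfs.datanode.slow.io.warning.threshold.ms</name>
#     <value>1000</value>
#   </property>
```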
02-27-2024
12:29 AM
2 Kudos
Yes @mike_bronson7, the above steps also work.
02-07-2024
01:13 AM
1 Kudo
Hi, when we add the Balancer role, the Instances tab shows N/A. I think it is not starting as expected. We are using the Cloudera Express version, and it is a production cluster.
02-05-2024
05:21 AM
1 Kudo
=> If the above steps still give you issues, then you can simply execute step 5, or the command below, from the Standby NN:

// Bootstrap the Standby NameNode. This command copies the contents of the Active NameNode's metadata directories (including the namespace information and the most recent checkpoint) to the Standby NameNode.
# hdfs namenode -bootstrapStandby

Note: Steps 1 to 3 are the process of creating a new fsimage, but if your Active NN is already up and running, I would directly log in to the Standby and then perform the bootstrapStandby operation.
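A minimal sketch of that flow on the Standby NameNode host (the `hdfs --daemon` form assumes Hadoop 3; use your distribution's service scripts or Ambari to stop/start the NameNode if they differ):

```bash
# Run on the Standby NameNode host as the hdfs user.
# 1. Make sure the Standby NameNode process is stopped.
hdfs --daemon stop namenode

# 2. Copy the Active NameNode's namespace and latest checkpoint into this node's
#    metadata directories.
hdfs namenode -bootstrapStandby

# 3. Start the Standby NameNode again; it catches up on edits from the JournalNodes.
hdfs --daemon start namenode
```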
11-12-2023
07:47 PM
fix: `componentname` should be `component_name`
02-26-2023
10:29 PM
Hello Mike, likely your existing 3 ZooKeeper nodes can serve your expansion requirements. You can monitor the CPU and network of the ZooKeeper nodes while your Kafka cluster is growing; when you reach the throughput limit, you can expand ZooKeeper to 5 nodes. Remember that the ZooKeeper nodes need to stay in sync all the time, so the more ZooKeeper nodes you have, the more traffic is added to keep them in sync while they are also handling the Kafka requests; more nodes are not necessarily better. I would suggest staying with 3 ZooKeeper nodes while expanding your Kafka cluster with close monitoring, and considering a move to 5 when the CPU/network throughput reaches its limit. You can also consider tuning the ZooKeeper nodes, e.g. dedicated disks, better network throughput, isolating the ZooKeeper process, and disabling swap.
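A minimal sketch of that kind of monitoring using ZooKeeper's four-letter-word commands (the host name is a placeholder, and on recent ZooKeeper versions these commands must be allowed via 4lw.commands.whitelist):

```bash
# Latency, outstanding requests, and follower sync state for one ZooKeeper node.
echo mntr | nc zk-host.example.com 2181

# Quick health check; prints "imok" when the node is serving requests.
echo ruok | nc zk-host.example.com 2181
```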