We have an Ambari cluster, version 2.6.1, with HDP version 2.6.4,
and Kafka is installed on 3 physical machines.
We have a strange problem.
We updated the Kafka configuration, and after the update Kafka required a restart.
So we restarted Kafka from the Ambari GUI, but the restart operation either does not start or stalls at about 15% progress.
In server.log (under /var/kafka/log) we see no errors and no sign of progress.
We also tried restarting the Ambari agent on all Kafka machines
and then restarting Kafka again, but this did not help, and the Kafka restart failed after 30 minutes.
The Kafka brokers are up but do not react to the restart action from Ambari.
Any suggestions as to what could cause this strange behavior?
What config did you change? If it's custom, did you use the Custom kafka-broker config? It's usually a good idea to give as much info as possible; that makes it easier to act, otherwise we will exchange posts without any useful outcome.
We changed only log_retention_hours to 24 hours, but the problem is not caused by a wrong value or parameter.
It would be good to first isolate whether the issue is on the Ambari side or on the Kafka side. This can be done by trying to start the Kafka broker manually from the command line to see whether it starts fine.
# su $KAFKA_USER
# /usr/hdp/current/kafka-broker/bin/kafka start
If the Kafka broker starts successfully this way, then there might be an issue with the way Kafka is started from Ambari. We will have to look into "ambari-server.log" or "ambari-agent.log" for that time; the Ambari UI operation log of the Kafka startup operation will also be useful. Restarting the Ambari agent may be worth trying as well.
If Kafka does not even start from the command line, then we will have to review the Kafka broker logs. Can you also verify that the property change is present in the Kafka config on each individual host, like this:
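To narrow down the Ambari side, it can help to pull only the recent error and warning lines out of the agent log. This is a minimal sketch: the function name recent_problems is hypothetical, and the path in the usage note is the usual default location of the agent log, which may differ on your hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N lines of a log file that contain
# ERROR or WARN (this also matches WARNING). N defaults to 20.
recent_problems() {
  local file="$1" count="${2:-20}"
  grep -E 'ERROR|WARN' "$file" | tail -n "$count"
}
```

Usage, assuming the default agent log location: recent_problems /var/log/ambari-agent/ambari-agent.log 50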
# grep 'log.retention.hours' /etc/kafka/conf/*
/etc/kafka/conf/server.properties:log.retention.hours=24
Jay, what we can see in kafka.err is this:
/usr/hdp/current/kafka-broker/bin/kafka: line 180: kill: (55636) - Operation not permitted
@Jay, let me explain what the issue was.
We also tried to stop the Kafka brokers from the CLI, and got:
Stopping Kafka  failed.
So we had no choice but to kill the process with kill -9,
and then start the Kafka broker.
As you know, the script does not use "-9" (it only runs kill <PID>),
so we need to find out why an aggressive kill (kill -9) is needed.
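A note on the kafka.err line: "Operation not permitted" is what kill reports when the calling user is not allowed to signal the target process, which usually means the stop ran as a different non-root user than the one that owns the broker. The thread does not confirm that this was the cause here. The sketch below shows one way to check the owner of a PID and to stop a process with a plain kill first, escalating to kill -9 only after a timeout. The function names are hypothetical and are not part of the HDP kafka script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the numeric UID that owns a process.
# Compare it with `id -u "$KAFKA_USER"` and with `id -u` of the caller.
pid_owner_uid() {
  ps -o uid= -p "$1"
}

# Hypothetical helper: send SIGTERM, wait up to $2 seconds (default 30),
# then fall back to SIGKILL if the process is still there.
stop_with_escalation() {
  local pid="$1" timeout="${2:-30}" waited=0
  # A failing plain kill means the PID is gone or we lack permission.
  kill "$pid" 2>/dev/null || {
    echo "cannot signal $pid (not running, or owned by another user)"
    return 1
  }
  # kill -0 sends no signal; it only tests whether the PID can be signalled.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid still alive after ${timeout}s, sending SIGKILL"
      kill -9 "$pid"
      return 0
    fi
    sleep 1
    waited=$(( waited + 1 ))
  done
  echo "PID $pid exited after SIGTERM"
}
```

If the first kill already fails, the timeout never comes into play, so a mismatch between the owner UID and the caller is the first thing to rule out before concluding that kill -9 is required.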
These are the retention parameters, in decreasing order of priority, that you can set in your Kafka broker properties file:
# Configures retention time in milliseconds
log.retention.ms=1680000
# Used if log.retention.ms is not set
log.retention.minutes=1680
# Used if log.retention.minutes is not set
log.retention.hours=168
Also see Jay's comments!
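The priority order above can be sketched as a small script that reports which retention setting would win for a given properties file. The function names get_prop and effective_retention_ms are hypothetical, the parsing assumes plain key=value lines with no spaces around "=", and the fallback of 168 hours is Kafka's documented default for log.retention.hours.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a properties FILE.
# Only exact "key=value" lines are matched; the last match wins.
get_prop() {
  local key="$1" file="$2"
  awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); val = $0 }
                       END { if (val != "") print val }' "$file"
}

# Hypothetical helper: resolve the effective retention in milliseconds,
# checking ms first, then minutes, then hours.
effective_retention_ms() {
  local file="$1" v
  v="$(get_prop 'log.retention.ms' "$file")"
  if [ -n "$v" ]; then echo "$v"; return; fi
  v="$(get_prop 'log.retention.minutes' "$file")"
  if [ -n "$v" ]; then echo $(( v * 60 * 1000 )); return; fi
  v="$(get_prop 'log.retention.hours' "$file")"
  if [ -n "$v" ]; then echo $(( v * 60 * 60 * 1000 )); return; fi
  # Nothing set: fall back to the default of 168 hours.
  echo $(( 168 * 60 * 60 * 1000 ))
}
```

Usage, with the path taken from the grep earlier in the thread: effective_retention_ms /etc/kafka/conf/server.properties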